Hilbert’s Triumph, Church and Turing’s failure, and what it means (Post #2)

We’ve specified what the definition of the Church-Turing Thesis means, though we’ve really specified 2 Church-Turing Theses, and the purist one is addressed in the 2.5 case, while a modified version handles the 2.6 case. We also specified what we need to do in order to actually give a truth value to either thesis. Now we will show that either 2.5 or 2.6 as a condition will lead to a contradiction, in that they define a larger set of computable functions than the original definition, leading to it’s falsification. I also address the case where a real human tries to compute all the computable functions, and the short answer is no, because otherwise we could violate the Time and Space Hierarchy Theorems, so the set of computable functions by real world humans is not equal, and strictly smaller than all computable functions as traditionally defined. This will be a long post, so get a drink and a snack, so that you can read it all.

Edit: I’m not focusing on what is possible in reality in this post, for many reasons, so you will have to wait for that in the next post.

The 2.5 case

The 2.5 case is easy enough to see, and is the closest to a classical Church-Turing Thesis as we can extend the human with unlimited time or memory, or alternatively a UTM with a CTC, similar to how other attempted extensions of Turing Machines have been made in say, non-determinism, or perhaps quantum Turing Machines or other variants of Turing Machines. Computability theorists proved that a lot of extensions and variants of Turing Machines were equivalent in power, given unlimited time and memory.

The result below is essentially that once we allow arbitrary extensions that still support the conditions defined in previous posts, there is no longer an automatic equivalence in the functions Turing Machines can compute and the functions that arbitrary extensions of them can compute.

The link below gives the gory details on how to do it, but basically they managed to figure out a way to compute the halt function, as well as more than that, by getting the algorithm to show one of 2 unique fixed points: Either the arbitrary Turing Machine halts, and thus the unique fixed point is of the induced operation on y is a singleton distribution concentrated on y = σt, and outputs halt. If the arbitrary Turing Machine loops, then instead it outputs loop for every Turing Machine that loops, and the unique fixed point is a geometric distribution, where in this example it continually halves the probability.

The third step removes any other fixed points by essentially requiring any invalid execution history to go back to the start state, via step 3 of the algorithm.

https://​​www.scottaaronson.com/​​papers/​​ctchalt.pdf

We now need to verify whether the constraints were satisified:

2.1 is easy enough to verify, since the expected string length is always finite, and while there’s no computable upper bound, it is always finite, if arbitrarily long. Similarly, it might have to accept strings of unbounded length, but they’re not infinitely long. That means that it’s decidable in a finite, but unbounded time and memory.

2.2 is another easy property to verify, since they haven’t changed the definition of a UTM, and thus all the power is coming from the extension, as desired.

2.3 is yet again another property that we can verify to be correct, since it either answers Halt in the case where the Turing Machine halts, or it answers Loop in the case it doesn’t halt, and both answers have unique fixed points. Also, it accepts every string in the language, which is equivalent to answering yes to the original question, and it doesn’t accept any string not in the language, and thus the answer to the original question is no, so Rafael Harth’s condition is satisfied.

2.4 is the last property to verify, and the algorithm does consist of a finite number of exact instructions. Here’s the instructions below that are copied from the paper:

Lemma 2 Halt∈ ComputableCTC. Proof. We’re given a Turing machine hPi, and need to decide whether P ( ) halts. >Suppose P ( ) runs for at least t steps; then let σt be a string that encodes the first t steps of P ( )’s execution history in some canonical way. Thus, σ0 encodes a blank input tape that hasn’t yet been acted upon, with P in its initial configuration; and σt+1 can easily be obtained from σt by appending the result of one additional P step. Note that, given an arbitrary string y, together with knowledge of hPi, it’s easy to decide whether y= σt for some t, and if so which t. Call σt a halting history if it shows P halting at its tth step, and a non-halting history otherwise.

Now our TMCTC, M, takes hPi as input in its causality-respecting register RCR, and some string y as input in its CTC register RCTC, and does the following: If y is a halting history, then leave y unchanged in the CTC register, and output “HALT”

If y = σt is a non-halting history, then set y := σt+1 with probability 12 and y := σ0 with probability 12, and output “LOOP”

If y is not a valid execution history for P ( ), then set y := σ0, and output “LOOP”

Thus, we can immediately see that the algorithm, in conjunction with the extension, decides a function that is essentially the halting problem for Turing Machines. But UTMs can’t decide their own halting problem, and thus we have expanded the set of computable functions beyond it’s original definition, meaning that the Church-Turing Thesis that a UTM can compute everything an effective function could do, as defined in my previous post, can’t be correct.

The 2.6 case

The 2.6 case is quite different, but also has some similarities. It is not, properly speaking the Church-Turing Thesis, but a modified version, and thus for Church-Turing purists, the 2.5 case will be the relevant case for you. In this case, we want to show that a UTM can simulate a human with unlimited time and memory, that can compute more functions than the UTM itself is capable of computing, thus showing that the Church-Turing Thesis as defined in this sequence is false, in that a human with unlimited time and memory that computes the halt function, as well as more general oracle input function, is simulatable by a UTM, thus meaning that the set of functions that a UTM can simulate is not equal to the functions that a UTM can compute.

The way to do this is by using an infinite branching tree model, where there’s an uncountable infinity of branches, but only a countable infinity of vertices and edges, and each branch overlaps with an uncountable infinity of other branches, meaning that a UTM can simulate the seemingly unsimulatable.

The edges correspond to the simulated moments of time, and they are countable, meaning that our parallel simulation can get to all of them. The branches correspond to the entire infinite sequence of say, a light flash.

Essentially, we will use an oracle tape, but unlike the usual oracle tapes, which can say solve the halting problem, we will only be considering an oracle tape that only has 0s. This is trivially computable by a UTM. As the oracle tape consists of only 0s, this corresponds to a light that is always off. Now consider a Turing machine that simulates two such oracle machines in parallel, but before it simulates the nth timestep for each oracle machine, it alters the representation of the configuration of the second machine to put a 1 on the nth square on its tape. Since the simulated machine couldn’t have looked at a square that far away yet, it will just behave as if it were given an infinite bitstring of 1s. In this case it will simulate a light that is always on. So the main Turing machine will be simulating a light that is always on and a light that is always off in parallel.

Now let’s add the infinite branching, and copy the C-like pseudocode from the website

initialise_om_with_room_simulation_program(om[0]) copies = 1;

for (i = 0; ; i++) { for (j = 0; j < copies; j++) { fork(om[j], om[copies+j]); set_o_tape_square(om[copies+j], i, 1); }

    > copies *= 2;

    > for (j = 0; j <= copies; j++)
            compute_next_step(om[j]);

}

initialise_om_with_room_simulation_program(om) sets up the specified oracle machine with the transition table that will make it perform a simulation of the room with the flashing lightbulb (whose state in minute n will be determined by the nth square of its initially blank oracle tape).

set_o_tape_square(om, i, value) just alters the ith square of the oracle tape of the oracle machine to be equal to the given value.

fork(om1, om2) simply sets the entire configuration of the oracle machine om2 (its tapes, tape heads, current state, and transition table) to be the same as that of om1.

Voila! We have a finite algorithm/​effective method that can be run, and that a UTM can simulate an oracle later on that can solve the halting problem.

In essence, we now have our desired simulated oracle, and that this can run a human oracle that can solve the halting with unlimited memory and time, because we can initialize the function the starting oracle machine has to have any transition table we like, and any input we like, so we can code the halting problem in the original blank oracle tape UTM, and thus we can simulate someone calculating all of the busy beaver numbers, alongside the usual computable functions.

While it will go wrong in all but one branch, this is enough to refute the Church-Turing Thesis as defined, because there must exist a branch of the simulation where it calculated a larger set of functions than the UTM can compute, thus contradicting the Church-Turing Thesis in that universe, and since it’s essentially a universally quantified variable, or perhaps a for-all condition, and thus one instance of a condition being violated is enough to break the thesis.

This works because reductionism fails here, that is while most individual bitstrings have infinite Kolmogorov complexity, the set of all such sequences of bits has very little Kolmogorov complexity. In essence, sometimes you can’t reduce something to it’s component parts without incurring infinite complexity penalties, and holism is required.

Link is below:

http://​​www.amirrorclear.net/​​academic/​​ideas/​​simulation/​​index.html

Credit to Scott Aaronson, Toby Ord, Mohammad Bavarian, Giulio Gueltrini, David Hilbert, Alonzo Church, Alan Turing, Rafael Harth, myself and many other researchers for both formalizing what a computer model could do, what effective functions were, and in general formalizing enough about our mathematical notion of computation and math such that I could easily see how the thesis didn’t work, and importantly gave us the ability to talk about real life computers reliably.

Even with Church and Turing’s failure, comparable to how Euclid failed in getting only one possible model of geometry, since hyperbolic geometry exists, it still can spawn many interesting questions.

Hilbert’s triumph

There were certain problems that Hilbert thought were very important to solve, like say solving a Diophantine equation or decide whether a statement in first order logic is universally valid, and he wanted to find an algorithm for them to solve in the general case using effectively calculable functions.

Today, we will be showing that the effectively calculable functions, as defined in the sequence, and allowing arbitrary extensions to a UTM in condition 2.5 that respect the constraints, leads to the decidability of a few problems that would be undecidable on a basic Universal Turing Machine, and since it only asks for satisfiability, not validity (That is it must be true in 1 model that satisfies the constraints, but we don’t require it to be true in every model. In particular, it includes satisfiability and validity problems for first order logic, and the Diophantine Equations become solvable. I don’t know whether condition 2.6 would imply all the statements below, so I have used condition 2.5 below.

Hilbert’s 10th problem is decidable by an effective function

This is basically using the MRDP theorem, or Matiyasevich’s theorem, in that all computably enumerable sets are Diophantine Equations, and the converse is true, that is all Diophantine Equations are computably enumerable, and thus in the class RE.

Since the UTM with a CTC extension contains both RE and co-RE complexity classes as decidable problems, by a generalization of the algorithm/​effective function used in Scott, Mohammad and Giulio’s paper, we can immediately note that there is an algorithm for the 10th problem, which asks for the existence of an algorithm to solve the Diophantine equation, has a yes answer.

Hilbert’s Entscheidungsproblem, or alternatively the validity problem for first order logic, is decidable by an effective function

This one is trivial, as the CTC Turing Machine model can solve all problems that are Turing-reducible to the Halting problem, combined with the fact that the validity problem for first order logic is RE-complete, means that there is an algorithm for Hilbert’s Entscheidungsproblem or the validity problem for first-order logic.

Some bonus problems related to Hilbert, but that Hilbert didn’t actually declare as problems will be shown to be decidable below by me:

The unrestricted satisfiability problem for first order logic is decidable by an effective function

This is because the satisfiability problem for first order logic is decidable by an extension to the UTM, which gives us an effective function that satisfies the constraints I made back in post 1. In particular, it’s able to compute all of the co-RE complete functions, since there is a Turing reduction from the Recursively Enumerable class to the Co-Recursively Enumerable class, and the algorithm on a CTC UTM can solve all problems that are Turing reducible to the halting problem, and since the now solvable Halting problem is RE-complete, it follows that the satisfiability problem must be solvable by an effective function.

Unrestricted satisfiability in first order logic is co-RE-complete.

Finite satisfiability of first order logic is decidable by an effective function

This result essentially comes about because finite satisfiability is in the Recursively Enumerable complexity class, and we have a CTC algorithm for a UTM, because it can solve RE-complete problems like the halting problem.

Finite validity of first order logic is decidable by an effective function

Trakhtenbrot’s theorem is essentially a theorem that tells us that finite validity is not recursively enumerable, but it is co-recursively enumerable. That’s because if we could do so, we could put the problem of whether a machine does not halt in RE, which can’t be the case, and instead is in co-RE. However, since a CTC extension of a UTM can solve all problems Turing-reducible to the halting problem, which includes co-RE complete problems, finite validity of first-order logic is decidable.

Implications

The most important implications of the post today are the following, as well as some open questions for people to answer in the comments:

Computable!=Simulatable

This will have an impact on a future post, but for now, we can say that there’s a gap between what you can compute to solve a question, and what you can simulate, that is what you can do if you’re trying to just run the computation, with no expectation of an answer.

Are there other cases where simulatability doesn’t equal the ability to compute something? Are there general properties that are required in order to get this non-equivalence?

Decidability is relative to a machine/​computational model, and it’s constraints, and it’s not universal

This is probably another point that people may already know, but it’s useful to know this, since this can be important, and was important in why Hilbert shouldn’t have given up entirely once he heard that the Entscheidungsproblem was undecidable for Universal Turing Machines.

This also means any claimed decidability/​undecidability proof needs to make it’s assumptions clear, because otherwise you can confuse yourself.

It is not an absolute or unique concept, that is the intuitive definition splits apart into multiple non-equivalent definitions, unlike with the quotes below:

Gödel also said:

The resulting definition of the concept of mechanical by the sharp concept of “performable by a Turing machine” is both correct and unique. … Moreover it is absolutely impossible that anybody who understands the question and knows Turing’s definition should decide for a different concept. (Quoted in Wang 1996: 203)

Or this quote below:

Gödel said that “the great importance … [of] Turing’s computability” is

largely due to the fact that with this concept one has for the first time succeeded in giving an absolute definition of an interesting epistemological notion, i.e., one not depending on the formalism chosen. (Gödel 1946: 150)

More problems are decidable, but there’s an ever increasing tower of intractability

This is important, both for methods that use the basic effective functions/​algorithms and extensions of Turing Machines, and for people that want to focus on very different models.

Unfortunately, the tower of intractability of problems is vast, and we don’t know if it actually stops, that is things can get ever more complex and difficult the more you’ve scaled the tower.

Hilbert was right in that we could decide with an algorithm all of 1st order logic’s validity, as well as Diophantine equations, and it turns out satisfiability is also decidable, and when we restrict to finite cases of satisfiability or validity in first order logic, they are also decidable. However, there are more problems that aren’t solved, like the totality problem under condition 2.5.

Define what your model of computation is able to do, and be careful about extensions

This is very important, and IMO this is the most important takeaway: Define what your model of computation is capable of doing, what constraints if any, and what extensions are allowed, since you can’t assume that an extension will always lead to defining the same class of computable functions in your model.

In particular, unboundedness plus a lack of many constraints can easily lead to too much freedom for what your model can do.

Be very careful of implicit or explicit universal claims

This is very important, as we quite often underestimate the difficulty of making universal claims about X.

This can be strengthened into one subpoint:

When you make a universal claim, it is not enough to prove that certain examples exist, or even that certain seemly different things are equal. You also need to think about how your conjecture or thesis can be broken, and in particular you’d need to either have good reason to believe that the counterexamples do not work, modify your conjecture, or accept that it’s false.

A perfect example here is Amalthea’s comment below:

https://​​www.lesswrong.com/​​posts/​​fmj35iqfJgrgsi59F/​​?commentId=tAXkemXFcuvrQL6it

The problem, Amalthea, is that the edge cases are exactly the problem for universal statements, akin to how the Weierstrass function disproved the assumption that all continuous functions were differentiable, and the reason I modeled Church and Turing’s thesis as a universal claim is because otherwise, some statements from the SEP describing Church and Turing’s thesis become very suspect or even false if we assume they aren’t intended to be universal.

Here’s one good example:

The Church-Turing thesis is the assertion that this set S contains every function whose values can be obtained by a method satisfying the above conditions for effectiveness.

Another good example is a quote below:

Gödel also said:

The resulting definition of the concept of mechanical by the sharp concept of “performable by a Turing machine” is both correct and unique. … Moreover it is absolutely impossible that anybody who understands the question and knows Turing’s definition should decide for a different concept. (Quoted in Wang 1996: 203)

The best example is this quote, which expresses absolute confidence, which was very unwarranted in light of the results in the post:

Gödel said that “the great importance … [of] Turing’s computability” is

largely due to the fact that with this concept one has for the first time succeeded in giving an absolute definition of an interesting epistemological notion, i.e., one not depending on the formalism chosen. (Gödel 1946: 150)

This of course, does not work in light of the new results.

Critically, the every condition changes things, as if I can prove that I can find any way to extend the set of functions by extending the UTM that is consistent with the constraints of an effective function, or I can figure out a way to simulate a universe via a UTM where humans can compute more functions than a UTM can, I have broken the thesis, as I did here.

People focusing on different models of computation are doing real computability theory work, just as today we’d call hyperbolic geometry work real geometry

In a sense, this discovery by many people above reminiscent of how people thought that Euclidean geometry was the only valid model of geometry, and the discovery of hyperbolic geometry shattered a similar implicit thesis that the only valid geometries are Euclidean geometry, and both involved very smart people assuming that their intuition was good enough to substitute for a formal treatment of the subject. Somewhat similar remarks could be made of the Weierstrass function breaking the implicit assumption that all functions that are continuous must also be differentiable.

What about if we assumed the constraints of reality?

People have asked me to keep to the constraints of reality, and while I don’t think this was Turing’s goal, given the UTM has no bounds on the time or memory required to solve a problem, which is unlikely to be the case in real life, as well as this quote below, which explicitly states that it idealizes things away, I will address the question today. It’s false there too, because if humans could compute all the computable functions in constant time and space, as humans only have a constant memory capacity and constant time, that would violate the Time and Space Hierarchy theorems, because we could solve problems that are proven to require exponential time and space in constant time and space.

Either way, the set of effective functions is larger or smaller than the original definition, and thus cannot be universally correct.

Quote is below:

Turing introduced his machines with the intention of providing an idealized description of a certain human activity, the tedious one of numerical computation.

Open questions:

Open question: How far can we go with effective functions as defined in the sequence?

We know that effective functions as defined can at least get to the classic Halting oracle tier today. The real question is, how far can we go?

The conditions are stated for completeness below:

  1. Effectively calculable functions have the following constraints/​rules:

2.1 They must always give an answer in finite time, but the time bound can be arbitrarily large, and the space required to solve a problem may be unbounded, but never infinite, just arbitrarily large.

2.2 The human is assumed to have unlimited paper to act as a memory, and unlimited time to compute (They are immortal.)

2.3 The original condition has been replaced by the following condition below from Rafael Harth:

In this setting, we call a problem decidable iff there exists a single, fixed procedure that can take arbitrary instances of this problem and always behaves like this:

If the correct answer is ‘yes’, it outputs ‘yes’. If the correct answer is ‘no’, it outputs ‘no’.

The original statement is included below for completeness:

Its instructions need only to be followed rigorously to succeed. In other words, it requires no ingenuity to succeed.[4]

2.4 This condition is quoted below:

It consists of a finite number of exact, finite instructions.

2.5 Arbitrary extensions to the UTM are valid as long as they obey the constraints above.

2.6 Alternatively, any algorithm or effective method as defined in this post is allowed to be used so long as it is simulatable by a UTM, even multiple UTMs are allowed to simulate something.

This splits into 3 subquestions below:

Open question 1: Using the constraints on effective functions in the conditions below plus condition 2.5, how many functions can we solve?

Essentially, this is asking about how far we can go with extensions.

Open question 2: Using the constraints on effective functions in the conditions below plus condition 2.6, how many functions can we simulate?

Essentially, this asks how many effective functions we can simulate with a Universal Turing Machines.

Open question 3: Do they to different functions that are effective, or do they correspond to the same functions with the same power?

This is asking whether arbitrary extensions to a UTM are just as good as simulating all the functions on a UTM. Only one extension that is equal is necessary to give a yes answer, while all possible extensions that respect the constraints cannot equal a UTM in power is required for it to be a no answer.

I’d like for others to follow in my footsteps, to at least advance what we can do, in a theoretical sense.

Emil Post was rather right here when Church possibly smuggled in a definition as a hypothesis, and it was indeed needed to be verified that it worked, and the quote is below:

Emil Post referred to Church’s identification of effective calculability with recursiveness and lambda-definability as a ‘working hypothesis’, and he quite properly criticized Church for masking this hypothesis as a definition:

[T]o mask this identification under a definition … blinds us to the need of its continual verification. (Post 1936: 105)

It’s an okay definition, albeit there are better definitions now if you just want to focus on the UTM model and only the computable functions, but as a hypothesis of all computation, it is false.

My basic, fundamental problem is people keep switching between the correct motte of the Church-Turing thesis as a definition of computability, albeit there are better definitions, and then expand it into the incorrect bailey of implying absoluteness/​uniqueness/​only valid or intuitive definition of computability.

In the next post, I’ll try to state what we can actually compute, over the very long run, and address the practicalities more.

Aside: I finally found where the Church-Turing Thesis came from

There’s a link to the Stanford Encyclopedia of Philosophy, and I think they’re the best resource for the Church-Turing Thesis, and the link will be below here:

https://​​plato.stanford.edu/​​entries/​​church-turing/​​