There is a significant difference between uncertainty about mathematical truths in cases where there isn’t a known procedure for checking whether a mathematical claim is true or false, versus when there is but you do not have the computational resources to carry it out. Examples of the former include the Collatz and twin prime conjectures, and examples of the latter include whether or not a given large number is a semi-prime, and what the first decimal digit of Graham’s number is.

The former should not be called logical uncertainty, because it is about what is true, not about what can be proved; I’ll call it mathematical uncertainty instead. The latter really is uncertainty about logic, since we would know that the claim is either proved or refuted by whatever theory we used to prove the algorithm correct, and we would just be uncertain as to which one.

It’s well-known that standard probability theory is a poor fit for handling logical uncertainty because it assumes that the probabilities are logically coherent, and uncertainty about what the logical coherence constraints are is exactly what we want to model; there are no possible outcomes in which the truth-value of a decidable sentence is anything other than what it actually is. But this doesn’t apply to mathematical uncertainty; we could study the probability distribution over complete theories that we converge to as time goes to infinity, and reason about this probability distribution using ordinary probability theory. Possible sources of evidence about math that could be treated with ordinary probability theory include physical experiments and black-boxed human intuitions. But another important source of evidence about mathematical truth is checking examples, and this cannot be reasoned about in ordinary probability theory because each of the examples is assigned probability 1 or 0 in the limit probability distribution, since we can check it. So just because you can reason about mathematical uncertainty using ordinary probability theory doesn’t mean you should.

Logical induction, taken at face value, looks like an attempt at handling mathematical uncertainty, since logical inductors assign probabilities to every sentence, not just sentences known to be decided by the deductive process. But most of the desirable properties of logical inductors that have been proved refer to sequences of decidable sentences, so logical induction only seems potentially valuable for handling logical uncertainty. In fact, the logical induction criterion doesn’t even imply anything about what the probabilities of an undecidable sentence converge to, except that it is not 0 or 1.

Another reason not to trust logical induction too much about mathematical uncertainty is that logical induction gets all its evidence from the proofs of one formal system, and there isn’t one formal system whose proofs completely account for all sources of evidence about mathematical claims. But it seems to me that correctly characterizing how to handle all sources of evidence about mathematical truths in a way precise enough to be turned into an algorithm would be a quite shockingly huge advance in the philosophy of mathematics, and I don’t expect it to happen any time soon.

Fortunately, we might not need to come up with a better way of handling mathematical uncertainty. It seems plausible to me that only logical uncertainty, not mathematical uncertainty more broadly, has any practical use. For instance, in cryptography, an eavesdropper who is uncertain about what a given ciphertext decrypts to has a perfectly good exponential-time algorithm to determine the answer, but lacks the computational resources to run the algorithm. Someone wondering whether their cryptosystem is breakable might phrase their question in terms of existence of an efficient algorithm that accomplishes some task with probability that does not go to 0 as task size approaches infinity, and it might not be clear that the question can be answered by commonly-used powerful axiom systems; but it would be sufficient for practical purposes to know whether there is an efficient algorithm of at most some given size that will accomplish a given computational task with nontrivial probability for tasks of the size that are actually used (or will be used in the near future). This latter question there is a (very computationally expensive) algorithm to determine the answer to. More broadly, physics appears to be computable, so we should expect that in any situation in which we are uncertain about what will happen in the real world due to lack of mathematical knowledge rather than due to lack of physical knowledge, we will have a computation that would resolve our lack of mathematical knowledge if we were capable of running the computation.

Thus it might make sense to give up on assigning probabilities to all sentences, and just try to assign probabilities to sentences that are known to be resolvable by some proof system (for instance, claims about the outputs of programs that are known to halt). Logical induction can easily be modified to do this, by only opening the market for a sentence once the deductive process outputs a proof that the sentence is resolvable. But this alone doesn’t offer any improvement in logical induction. The hope is that by not even trying to assign probabilities to sentences that are not known to be resolvable, it might be possible to find ways of assigning probabilities to resolvable sentences in a way that avoids certain computational limitations that are otherwise inevitable. The properties of logical inductors are all asymptotic, with no computable bounds on rate of convergence, and it is impossible to do better than this in general. Asymptotic properties with no computable bounds on rate of convergence are useless for practical purposes, and I don’t see any reason we can’t do better than this when restricting only to sentences known in advance to be resolvable.

An edge case between logical and mathematical uncertainty is uncertainty about the truth of Σ01 sentences (or equivalently, of Π01 sentences). True Σ01 sentences are provable in Peano Arithmetic, so uncertainty about their truth can be expressed as uncertainty about the existence of a proof, so this could be described as a case of logical uncertainty. However, since false Σ01 sentences aren’t necessarily disprovable, handling uncertainty about Σ01 sentences run into similar difficulties as handling uncertainty about more general arithmetical sentences. And since the physical world doesn’t appear to be able to do hypercomputation, the same arguments that more general mathematical uncertainty is useless seem to apply to general Σ01 sentences. This is why I’ve been characterizing logical uncertainty as uncertainty about sentences known to be resolvable, even though it could be argued that the term should include uncertainty about sentences that are resolvable if true.

It seems to me that “high-level” sentences let us reason about “low-level” sentences faster, so dropping them isn’t a good idea. But maybe we could have a performance guarantee in terms of “low-level” sentences only, like “this program prints X in the first N steps”. That sounds interesting, do you have any thoughts?

Are you making the point that we often reason about how likely a sentence is to be true and then use our conclusion as evidence about how likely it is to have a proof of reasonable length? I think this is a good point. One possible response is that if we’re doing something like logical induction in that it listens to many heuristics and pays more attention to the ones that have been reliable, then some of those heuristics can involve performing computations that look like trying to estimate how likely a sentence is to be true, in the process of estimating how likely it is to have a short proof, and then we can just pay attention to the probabilities suggested for existence of a short proof. A possible counterobjection is that if you want to know how to be such a successful heuristic, rather than just how to aggregate successful heuristics, you might want to reason about probabilities of mathematical truth yourself. A possible response to this counterobjection is that yes, maybe you should think about how likely a mathematical claim is to be true, but it is not necessary for your beliefs about it to be expressed as a probability, since it is impossible to resolve bets made about the truth of a mathematical sentence, absent a procedure for checking it.

Hmm, I was thinking about something simpler. Take for example the fact that for any n there are n! permutations of n objects. That’s not in your class of sentences which come with an “obvious” decision procedure. Yet it’s pretty useful: if you have a program that counts permutations of 100 objects one by one, you can replace it with a program that calculates 100! which is much faster. If you had uncertainty about the first decimal digit of the result (a “low-level” fact), using the “high-level” fact has just helped you resolve it faster. So “high-level” facts are useful even if you’re only interested in “low-level” facts. You’re right that using probabilities for “high-level” facts might not work, but we need to reason about them somehow, or we’ll be outplayed by those who can.

Oh, I see. But for any particular value of n, the claim that there are n! permutations of n objects is something we can know in advance is resolvable (even if we haven’t noticed why this is always true), because we can count the permutations and check.

What is logical induction’s take on probabilistic algorithms? That should be the easiest test-case.

Say, before “PRIME is in P”, we had perfectly fine probabilistic algorithms for checking primality. A good theory of mathematical logic with uncertainty should permit us to use such an algorithm, without random oracle, for things you place as “logical uncertainty”. As far as I understood, the typical mathematician’s take is to just ignore this foundational issue and do what’s right (channeling Thurston: Mathematicians are in the business of producing human understanding, not formal proofs).

Logical induction does not take the outputs of randomized algorithms into account. But it does listen to deterministic algorithms that are defined by taking a randomized algorithm but making it branch pseudo-randomly instead of randomly. Because of this, I expect that modifying logical induction to include randomized algorithms would not lead to a significant gain in performance.

## Logical uncertainty and Mathematical uncertainty

There is a significant difference between uncertainty about mathematical truths in cases where there isn’t a known procedure for checking whether a mathematical claim is true or false, versus when there is but you do not have the computational resources to carry it out. Examples of the former include the Collatz and twin prime conjectures, and examples of the latter include whether or not a given large number is a semi-prime, and what the first decimal digit of Graham’s number is.

The former should not be called logical uncertainty, because it is about what is true, not about what can be proved; I’ll call it mathematical uncertainty instead. The latter really is uncertainty about logic, since we would know that the claim is either proved or refuted by whatever theory we used to prove the algorithm correct, and we would just be uncertain as to which one.

It’s well-known that standard probability theory is a poor fit for handling logical uncertainty because it assumes that the probabilities are logically coherent, and uncertainty about what the logical coherence constraints are is exactly what we want to model; there are no possible outcomes in which the truth-value of a decidable sentence is anything other than what it actually is. But this doesn’t apply to mathematical uncertainty; we could study the probability distribution over complete theories that we converge to as time goes to infinity, and reason about this probability distribution using ordinary probability theory. Possible sources of evidence about math that could be treated with ordinary probability theory include physical experiments and black-boxed human intuitions. But another important source of evidence about mathematical truth is checking examples, and this cannot be reasoned about in ordinary probability theory because each of the examples is assigned probability 1 or 0 in the limit probability distribution, since we can check it. So just because you can reason about mathematical uncertainty using ordinary probability theory doesn’t mean you should.

Logical induction, taken at face value, looks like an attempt at handling mathematical uncertainty, since logical inductors assign probabilities to every sentence, not just sentences known to be decided by the deductive process. But most of the desirable properties of logical inductors that have been proved refer to sequences of decidable sentences, so logical induction only seems potentially valuable for handling logical uncertainty. In fact, the logical induction criterion doesn’t even imply anything about what the probabilities of an undecidable sentence converge to, except that it is not 0 or 1.

Another reason not to trust logical induction too much about mathematical uncertainty is that logical induction gets all its evidence from the proofs of one formal system, and there isn’t one formal system whose proofs completely account for all sources of evidence about mathematical claims. But it seems to me that correctly characterizing how to handle all sources of evidence about mathematical truths in a way precise enough to be turned into an algorithm would be a quite shockingly huge advance in the philosophy of mathematics, and I don’t expect it to happen any time soon.

Fortunately, we might not need to come up with a better way of handling mathematical uncertainty. It seems plausible to me that only logical uncertainty, not mathematical uncertainty more broadly, has any practical use. For instance, in cryptography, an eavesdropper who is uncertain about what a given ciphertext decrypts to has a perfectly good exponential-time algorithm to determine the answer, but lacks the computational resources to run the algorithm. Someone wondering whether their cryptosystem is breakable might phrase their question in terms of existence of an efficient algorithm that accomplishes some task with probability that does not go to 0 as task size approaches infinity, and it might not be clear that the question can be answered by commonly-used powerful axiom systems; but it would be sufficient for practical purposes to know whether there is an efficient algorithm of at most some given size that will accomplish a given computational task with nontrivial probability for tasks of the size that are actually used (or will be used in the near future). This latter question there is a (very computationally expensive) algorithm to determine the answer to. More broadly, physics appears to be computable, so we should expect that in any situation in which we are uncertain about what will happen in the real world due to lack of mathematical knowledge rather than due to lack of physical knowledge, we will have a computation that would resolve our lack of mathematical knowledge if we were capable of running the computation.

Thus it might make sense to give up on assigning probabilities to all sentences, and just try to assign probabilities to sentences that are known to be resolvable by some proof system (for instance, claims about the outputs of programs that are known to halt). Logical induction can easily be modified to do this, by only opening the market for a sentence once the deductive process outputs a proof that the sentence is resolvable. But this alone doesn’t offer any improvement in logical induction. The hope is that by not even trying to assign probabilities to sentences that are not known to be resolvable, it might be possible to find ways of assigning probabilities to resolvable sentences in a way that avoids certain computational limitations that are otherwise inevitable. The properties of logical inductors are all asymptotic, with no computable bounds on rate of convergence, and it is impossible to do better than this in general. Asymptotic properties with no computable bounds on rate of convergence are useless for practical purposes, and I don’t see any reason we can’t do better than this when restricting only to sentences known in advance to be resolvable.

An edge case between logical and mathematical uncertainty is uncertainty about the truth of Σ01 sentences (or equivalently, of Π01 sentences). True Σ01 sentences are provable in Peano Arithmetic, so uncertainty about their truth can be expressed as uncertainty about the existence of a proof, so this could be described as a case of logical uncertainty. However, since false Σ01 sentences aren’t necessarily disprovable, handling uncertainty about Σ01 sentences run into similar difficulties as handling uncertainty about more general arithmetical sentences. And since the physical world doesn’t appear to be able to do hypercomputation, the same arguments that more general mathematical uncertainty is useless seem to apply to general Σ01 sentences. This is why I’ve been characterizing logical uncertainty as uncertainty about sentences known to be resolvable, even though it could be argued that the term should include uncertainty about sentences that are resolvable if true.

It seems to me that “high-level” sentences let us reason about “low-level” sentences faster, so dropping them isn’t a good idea. But maybe we could have a performance guarantee in terms of “low-level” sentences only, like “this program prints X in the first N steps”. That sounds interesting, do you have any thoughts?

Are you making the point that we often reason about how likely a sentence is to be true and then use our conclusion as evidence about how likely it is to have a proof of reasonable length? I think this is a good point. One possible response is that if we’re doing something like logical induction in that it listens to many heuristics and pays more attention to the ones that have been reliable, then some of those heuristics can involve performing computations that look like trying to estimate how likely a sentence is to be true, in the process of estimating how likely it is to have a short proof, and then we can just pay attention to the probabilities suggested for existence of a short proof. A possible counterobjection is that if you want to know how to be such a successful heuristic, rather than just how to aggregate successful heuristics, you might want to reason about probabilities of mathematical truth yourself. A possible response to this counterobjection is that yes, maybe you should think about how likely a mathematical claim is to be true, but it is not necessary for your beliefs about it to be expressed as a probability, since it is impossible to resolve bets made about the truth of a mathematical sentence, absent a procedure for checking it.

Hmm, I was thinking about something simpler. Take for example the fact that for any n there are n! permutations of n objects. That’s not in your class of sentences which come with an “obvious” decision procedure. Yet it’s pretty useful: if you have a program that counts permutations of 100 objects one by one, you can replace it with a program that calculates 100! which is much faster. If you had uncertainty about the first decimal digit of the result (a “low-level” fact), using the “high-level” fact has just helped you resolve it faster. So “high-level” facts are useful even if you’re only interested in “low-level” facts. You’re right that using probabilities for “high-level” facts might not work, but we need to reason about them somehow, or we’ll be outplayed by those who can.

Oh, I see. But for any particular value of n, the claim that there are n! permutations of n objects is something we can know in advance is resolvable (even if we haven’t noticed why this is always true), because we can count the permutations and check.

What is logical induction’s take on probabilistic algorithms? That should be the easiest test-case.

Say, before “PRIME is in P”, we had perfectly fine probabilistic algorithms for checking primality. A good theory of mathematical logic with uncertainty should permit us to use such an algorithm, without random oracle, for things you place as “logical uncertainty”. As far as I understood, the typical mathematician’s take is to just ignore this foundational issue and do what’s right (channeling Thurston: Mathematicians are in the business of producing human understanding, not formal proofs).

Logical induction does not take the outputs of randomized algorithms into account. But it does listen to deterministic algorithms that are defined by taking a randomized algorithm but making it branch pseudo-randomly instead of randomly. Because of this, I expect that modifying logical induction to include randomized algorithms would not lead to a significant gain in performance.