Approaching Logical Probability

Followup To: Logic as Probability

If we design a robot that acts as if it’s uncertain about mathematical statements, that violates some desiderata for probability. But realistic robots cannot prove all theorems; they have to be uncertain about hard math problems.

In the name of practicality, we want a foundation for decision-making that captures what it means to make a good decision, even with limited resources. “Good” means that even though our real-world robot can’t make decisions well enough to satisfy Savage’s theorem, we want to approximate that ideal, not throw it out. Although I don’t have the one best answer to give you, in this post we’ll take some steps forward.

The objects we call probabilities are specified by desiderata that tell us how they behave. Any uncertainty about math problems violates those desiderata, but we still want to be able to assign logical probabilities that behave a lot like probabilities. The basic principles—not making up or ignoring information, not throwing away money or being inconsistent—should be deviated from as little as possible even when computing power is scarce. We want to develop a foundation for logical probabilities by starting from the rules governing ordinary probability, and then minimally restricting the application of those rules.
As we do this, it’s important to keep track of what changes we make and why. Sometimes people just define logical probabilities without worrying about desiderata. This is fine when it works, and is often patchable when it doesn’t have the right properties. But if you use such a definition for something important and get a surprise failure, that’s really bad. My hope here is to construct logical probabilities that have the good properties, while keeping handwaving and mysterious assumptions to a minimum.
The perils of handwaving are more dire than they appear, and they are at their most dire in the hardest and most confusing reaches of physics. After better approaches fail, many theorists resort to just making up approximations and then trying to justify them. Doing this is known colloquially as a “1/ego expansion.” Simply put, it doesn’t work; there are too many vital little details. It’s why even condensed matter theorists tell you not to trust condensed matter theorists about high-temperature superconductivity.
We must abandon regular probabilities because our robot has limited time, but other parts of the decision-making process can also go over the time limit. If the robot’s resources are limited, expected utility maximization breaks down at many points: there might be too many strategies to search through, too many outcomes to foresee, probabilities that are too hard to find, or utilities of outcomes that are too complicated to evaluate.
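As a rough illustration, here is naive expected utility maximization in code, with comments marking where each of those resource limits bites. The strategies, outcomes, and payoffs are made up for the example; nothing here is part of the construction in this sequence.

```python
# A minimal sketch of naive expected utility maximization, with comments
# marking where limited resources break it. Everything here is illustrative.

def best_strategy(strategies, outcomes_of, probability, utility):
    """Return the strategy with the highest expected utility."""
    best, best_eu = None, float("-inf")
    for s in strategies:                 # too many strategies to search through
        eu = 0.0
        for o in outcomes_of(s):         # too many outcomes to foresee
            eu += probability(o, s) * utility(o)  # probabilities too hard to find,
                                                  # utilities too complicated
        if eu > best_eu:
            best, best_eu = s, eu
    return best

# Toy usage: a slightly favorable bet versus doing nothing.
strategies = ["hold", "bet"]
outcomes_of = lambda s: ["win", "lose"] if s == "bet" else ["status quo"]
probability = lambda o, s: {"win": 0.6, "lose": 0.4, "status quo": 1.0}[o]
utility = lambda o: {"win": 1.0, "lose": -1.0, "status quo": 0.0}[o]
print(best_strategy(strategies, outcomes_of, probability, utility))  # "bet" (EU 0.2 vs 0.0)
```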
The logical probabilities considered in this sequence will help approximate hard math problems, but they don’t seem to help much when there are too many outcomes to consider, or if you want to make the best use of limited computational resources. They are only a part of the full solution.
Time for a desideratum: we want our robot to assign a logical probability of 1 or 0 to a statement only after it has actually checked a proof or disproof of that statement.
We can think of this as limiting which statements our robot is allowed to be certain about: only statements with proofs short enough to check in the time available. However, this desideratum is not just about proof length, because a real robot won’t check every checkable proof. It will spend its time generating proofs, maybe trying to prove some specific statement, and will end up checking only some subset of the short proofs.
Logical probabilities, unlike ordinary probabilities, are not determined just by the starting information. If our real robot only verifies some small collection of proofs, its logical probabilities depend heavily on which proof-steps were checked. A proof-step is one truth-preserving step by our robot, like a single application of modus ponens: a little proof one step long. The point is that proof-steps are the atomic units of proofs, and once all the steps of a proof are checked, the proof is checked.
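For concreteness, here is one toy way to encode a single proof-step. The string-and-tuple representation is purely illustrative, not anything the sequence commits to.

```python
# A toy encoding of one proof-step: a single application of modus ponens.
# Statements are strings; the implication "A -> B" is the tuple ("->", "A", "B").

def modus_ponens(known, antecedent, consequent):
    """One truth-preserving step: from `antecedent` and `antecedent -> consequent`,
    conclude `consequent`. Returns the (possibly unchanged) set of known statements."""
    if antecedent in known and ("->", antecedent, consequent) in known:
        return known | {consequent}
    return known

known = {"A", ("->", "A", "B")}
print(modus_ponens(known, "A", "B"))  # now also contains "B"
```

A proof is then just a sequence of such steps, and checking the proof means checking each step in turn.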
If we condition on which proof-steps get checked, does that determine the logical probabilities?
For any statement our robot is going to prove or disprove, we can use the checked proof-steps to find whether its logical probability is 1 or 0. This gives the same answer as a real robot that checks steps according to some process and then returns 1 or 0 if it manages to prove or disprove the statement we give it. We just take the steps that the real robot ends up checking, and say that those are the checked proof-steps for our abstract mathematical definition.
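In code, that rule might look something like this. The proof representation is the same toy encoding as before, and the helper names are made up; the point is only that certainty requires a fully checked proof or disproof.

```python
# A statement is assigned certainty only when all steps of some proof of it
# (probability 1) or of its negation (probability 0) have been checked.

def certainty_from_checked_steps(statement, proofs_of, proofs_against, checked_steps):
    """Return 1, 0, or None if neither a proof nor a disproof is fully checked."""
    def fully_checked(proof):
        return all(step in checked_steps for step in proof)

    if any(fully_checked(p) for p in proofs_of.get(statement, [])):
        return 1
    if any(fully_checked(p) for p in proofs_against.get(statement, [])):
        return 0
    return None  # fall back on a logical probability strictly between 0 and 1

# Toy usage: one two-step proof of "A", only partially checked at first.
proofs_of = {"A": [["step1", "step2"]]}
print(certainty_from_checked_steps("A", proofs_of, {}, {"step1"}))           # None
print(certainty_from_checked_steps("A", proofs_of, {}, {"step1", "step2"}))  # 1
```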
There’s a problem, though. We haven’t changed the old axioms, so they’re still only satisfied if we get the right answer for everything. Meanwhile our new desideratum says we can’t get the right answer for everything—we’ve made our axioms internally contradictory. In order to talk about the logical probabilities of unproven statements, we’ll need to weaken the original axioms so that they no longer require certainty about everything. We’ll explore ways to do this next time. Then we can assign numbers to statements in the usual way, by using our weakened axioms to find constraints, then maximizing entropy subject to those constraints.
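As a preview of that last step, here is a toy maximum-entropy calculation of my own devising (it assumes scipy is available). The single constraint stands in for whatever the weakened axioms would give us: say the robot has checked a proof of “A implies B,” which rules out the world where A is true and B is false.

```python
# Toy maximum-entropy assignment over truth assignments to two statements A and B,
# subject to constraints extracted from what the robot has actually checked.
import itertools
import math
from scipy.optimize import minimize  # assumes scipy is installed

worlds = list(itertools.product([True, False], repeat=2))  # (A, B) truth values

def neg_entropy(p):
    return sum(pi * math.log(pi) for pi in p if pi > 0)

constraints = [
    {"type": "eq", "fun": lambda p: sum(p) - 1.0},                    # probabilities sum to 1
    {"type": "eq", "fun": lambda p: p[worlds.index((True, False))]},  # checked proof of A -> B
]
result = minimize(neg_entropy, x0=[0.25] * 4, bounds=[(0.0, 1.0)] * 4,
                  constraints=constraints)

for world, pi in zip(worlds, result.x):
    print(world, round(pi, 3))  # the three remaining worlds each end up near 1/3
```

In the full construction, the constraints would come from applying the weakened axioms to everything the robot has actually checked, not just one implication.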

Part of the sequence Logical Uncertainty

Previous post: Logic as Probability

Next post: Solutions and Open Problems