Logic as Probability

Followup To: Putting in the Numbers

Before talking about logical uncertainty, our final topic is the relationship between probabilistic logic and classical logic. A robot running on probabilistic logic stores probabilities of events, e.g. the probability that the grass outside is wet, P(wet), and when it collects new evidence it updates that probability to P(wet|evidence). Classical logic robots, on the other hand, deduce the truth of statements from axioms and observations. Maybe our robot starts out unable to deduce whether the grass is wet, but then it observes that it is raining, and so it uses an axiom about rain causing wetness to deduce that “the grass is wet” is true.

Classical logic relies on complete certainty in its axioms and observations, and makes completely certain deductions. This is unrealistic when applied to rain, but we’re going to apply this to (first order, for starters) math later, which is a better fit for classical logic.

The general pattern of the deduction “It’s raining, and when it rains the grass is wet, therefore the grass is wet” is modus ponens: if ‘U implies R’ is true, and U is true, then R must be true. There is also modus tollens: if ‘U implies R’ is true, and R is false, then U has to be false too. Third, there is the law of non-contradiction: “It’s simultaneously raining and not-raining outside” is always false.

We can imagine a robot that does classical logic as if it were writing in a notebook. Axioms are entered in the notebook at the start. Then our robot starts writing down statements that can be deduced by modus ponens or modus tollens. Eventually, the notebook is filled with statements deducible from the axioms. Modus tollens and modus ponens can be thought of as consistency conditions that apply to the contents of the notebook.
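The notebook picture can be sketched in a few lines of Python. This is only an illustration under assumed conventions that are not from the post: statements are strings, a negation is the tuple ("not", s), an implication is the tuple ("implies", premise, conclusion), and the names negate and fill_notebook are hypothetical.

```python
def negate(statement):
    """Return the negation of a statement, collapsing double negation."""
    if isinstance(statement, tuple) and statement[0] == "not":
        return statement[1]
    return ("not", statement)


def fill_notebook(axioms):
    """Apply modus ponens and modus tollens until nothing new can be written.

    Assumed encoding (hypothetical): ("implies", premise, conclusion).
    """
    notebook = set(axioms)
    changed = True
    while changed:
        changed = False
        for entry in list(notebook):
            if isinstance(entry, tuple) and entry[0] == "implies":
                _, premise, conclusion = entry
                # Modus ponens: premise is in the notebook, so write the conclusion.
                if premise in notebook and conclusion not in notebook:
                    notebook.add(conclusion)
                    changed = True
                # Modus tollens: conclusion is denied, so write the denied premise.
                if negate(conclusion) in notebook and negate(premise) not in notebook:
                    notebook.add(negate(premise))
                    changed = True
    return notebook


rain_notebook = fill_notebook({"raining", ("implies", "raining", "wet")})
dry_notebook = fill_notebook({("not", "wet"), ("implies", "raining", "wet")})
```

Here rain_notebook ends up containing "wet", and dry_notebook ends up containing ("not", "raining"). The loop stops because each pass can only add conclusions or negated premises of implications already in the notebook, and there are finitely many of those.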

Doing math is one important application of our classical-logic robot. The robot can read from its notebook “If variable A is a number, A=A+0” and “SS0 is a number,” and then write down “SS0=SS0+0.”

Note that this requires the robot to interpret variable A differently than symbol SS0. This is one of many upgrades we can make to the basic robot so that it can interpret math more easily. We also want to program in special responses to symbols like ‘and’, so that if A and B are in the notebook our robot will write ‘A and B’, and if ‘A and B’ is in the notebook it will add in A and B. In this light, modus ponens is just the robot having a programmed response to the ‘implies’ symbol.
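As a rough sketch of these upgrades, here is how variable substitution and the ‘and’ responses might look in Python. The function names and the tuple encoding ("and", a, b) are assumptions made for illustration, and the substitution is a naive string replacement that only works when the variable name does not appear inside any other symbol.

```python
def instantiate(template, variable, term):
    """Substitute a concrete term for every occurrence of a variable name.

    Naive on purpose: assumes the variable's name never occurs inside
    another symbol in the template.
    """
    return template.replace(variable, term)


def and_introduce(notebook, a, b):
    """If a and b are both in the notebook, also write ('and', a, b)."""
    result = set(notebook)
    if a in notebook and b in notebook:
        result.add(("and", a, b))
    return result


def and_eliminate(notebook):
    """For every ('and', a, b) in the notebook, also write a and b."""
    result = set(notebook)
    for entry in notebook:
        if isinstance(entry, tuple) and entry[0] == "and":
            result.update(entry[1:])
    return result


instance = instantiate("A=A+0", "A", "SS0")
```

With the template from the example above, instance is the string "SS0=SS0+0". A real implementation would parse statements into terms rather than replace substrings, but the point is the same: the variable A gets special treatment that the symbol SS0 does not.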

Certainty about our axioms is what lets us use classical logic, but you can represent complete certainty in probabilistic logic too, by the probabilities 1 and 0. These two methods of reasoning shouldn’t contradict each other—if a classical logic robot can deduce that it’s raining out, a probabilistic logic robot with the same information should assign P(rain)=1.

If it’s raining out, then my grass is wet. In the language of probabilities, this is P(wet|rain)=1. If I look outside and see rain, P(rain)=1, and then the product rule says that P(wet and rain) = P(rain)·P(wet|rain), and that’s equal to 1. Since P(wet) can’t be smaller than P(wet and rain), P(wet)=1, so my grass must be wet too. Hey, that’s modus ponens!

The rules of probability can also behave like modus tollens (if P(B)=0 and P(B|A)=1, then P(A)=0) and the law of non-contradiction (P(A|not-A)=0). Thus, when we’re completely certain, probabilistic logic and classical logic give the same answers.
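To make the correspondence concrete, here is a small numerical check in Python, using exact fractions so that certainty is not blurred by rounding. The function names and the choice to parametrize things by P(A), P(B|A) and P(B|not-A) are assumptions made for this sketch.

```python
from fractions import Fraction


def prob_b(p_a, p_b_given_a, p_b_given_not_a):
    """Sum and product rules: P(B) = P(A)P(B|A) + P(not-A)P(B|not-A)."""
    return p_a * p_b_given_a + (1 - p_a) * p_b_given_not_a


def prob_a_given_not_b(p_a, p_b_given_a, p_b_given_not_a):
    """Bayes' rule for P(A|not-B); only defined when P(not-B) > 0."""
    p_not_b = 1 - prob_b(p_a, p_b_given_a, p_b_given_not_a)
    if p_not_b == 0:
        raise ValueError("P(not-B) is zero, so P(A|not-B) is undefined")
    return p_a * (1 - p_b_given_a) / p_not_b


# Modus ponens: certain of A and of 'A implies B' gives certainty of B,
# whatever P(B|not-A) happens to be.
ponens = prob_b(Fraction(1), Fraction(1), Fraction(1, 3))

# Modus tollens as an update: with P(B|A)=1, learning not-B sends P(A) to 0,
# whatever the prior on A was.
tollens = prob_a_given_not_b(Fraction(1, 2), Fraction(1), Fraction(1, 3))
```

In this sketch ponens comes out as exactly 1 and tollens as exactly 0. The arbitrary values 1/3 and 1/2 drop out of the answer, which is what it means for the deduction to depend only on the parts we are certain about.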

There’s a very short way to prove this, which is that one of Cox’s desiderata for how probabilities must behave was “when you’re completely certain, your plausibilities should satisfy the rules of classical logic.”

In Foundations of Probability, I alluded to the idea that we should be able to apply probabilities to math. Dutch book arguments work because our robot must act as if it had probabilities in order to avoid losing money. Savage’s theorem applies because the results of our robot’s actions might depend on mathematical results. Cox’s theorem applies because beliefs about math behave like other beliefs.

This is completely correct. Math follows the rules of probability, and thus can be described with probabilities, because classical logic is the same as probabilistic logic when you’re certain.

We can even use this correspondence to figure out what numbers the probabilities take on:

1 for every statement that follows from the axioms, 0 for their negations.

This raises an issue: what about betting on the last digit of the 3^^^3'th prime? We dragged probability into this mess because it was supposed to help our robot stop trying to prove the answer and just bet as if P(last digit is 1)=1/4. But it turns out that there is one true probability distribution over mathematical statements, given the axioms. The right distribution is obtained by straightforward application of the product rule (never mind that it takes 4^^^3 steps), and if you deviate from the right distribution that means you violate the product rule at some point.
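The tension can be seen at a scale where the computation is actually feasible. The sketch below (hypothetical helper names, and a limit assumed to be well above 10) sieves the primes below some bound and tallies their last digits. Across many primes the digits 1, 3, 7 and 9 each turn up about a quarter of the time, which is where the 1/4 bet comes from, yet for any single prime in the list the last digit is simply known.

```python
from collections import Counter


def primes_below(limit):
    """Sieve of Eratosthenes: all primes strictly less than limit (limit >= 2)."""
    sieve = bytearray([1]) * limit
    sieve[0:2] = b"\x00\x00"  # 0 and 1 are not prime
    for n in range(2, int(limit ** 0.5) + 1):
        if sieve[n]:
            # Cross out multiples of n starting at n*n.
            sieve[n * n :: n] = bytearray(len(range(n * n, limit, n)))
    return [n for n in range(limit) if sieve[n]]


def last_digit_frequencies(limit):
    """Fraction of primes below limit (other than 2 and 5) ending in 1, 3, 7, 9."""
    digits = Counter(p % 10 for p in primes_below(limit) if p not in (2, 5))
    total = sum(digits.values())
    return {d: digits[d] / total for d in (1, 3, 7, 9)}


frequencies = last_digit_frequencies(100_000)
```

For the 3^^^3'th prime the robot cannot run anything like this sieve, so the even split is all it has to go on, even though the product rule says the only fully consistent assignment puts probability 1 on one digit and 0 on the rest.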

This is why logical uncertainty is different. Even though our robot doesn’t have enough resources to find the right answer, using logical uncertainty violates Savage’s theorem and Cox’s theorem. If we want our robot to act as if it has some “logical probability,” it’s going to need a stranger sort of foundation.

Part of the sequence Logical Uncertainty

Previous Post: Putting in the Numbers

Next Post: Approaching Logical Uncertainty