Proofs, Implications, and Models

Followup to: Causal Reference

From a math professor’s blog:

One thing I discussed with my students here at HCSSiM yesterday is the question of what is a proof.

They’re smart kids, but completely new to proofs, and they often have questions about whether what they’ve written down constitutes a proof. Here’s what I said to them.

A proof is a social construct – it is what we need it to be in order to be convinced something is true. If you write something down and you want it to count as a proof, the only real issue is whether you’re completely convincing.

This is not quite the definition I would give of what constitutes “proof” in mathematics—perhaps because I am so used to isolating arguments that are convincing, but ought not to be.

Or here again, from “An Introduction to Proof Theory” by Samuel R. Buss:

There are two distinct viewpoints of what a mathematical proof is. The first view is that proofs are social conventions by which mathematicians convince one another of the truth of theorems. That is to say, a proof is expressed in natural language plus possibly symbols and figures, and is sufficient to convince an expert of the correctness of a theorem. Examples of social proofs include the kinds of proofs that are presented in conversations or published in articles. Of course, it is impossible to precisely define what constitutes a valid proof in this social sense; and, the standards for valid proofs may vary with the audience and over time. The second view of proofs is more narrow in scope: in this view, a proof consists of a string of symbols which satisfy some precisely stated set of rules and which prove a theorem, which itself must also be expressed as a string of symbols. According to this view, mathematics can be regarded as a ‘game’ played with strings of symbols according to some precisely defined rules. Proofs of the latter kind are called “formal” proofs to distinguish them from “social” proofs.

In modern mathematics there is a much better answer that could be given to a student who asks, “What exactly is a proof?”, which does not match either of the above ideas. So:

Meditation: What distinguishes a correct mathematical proof from an incorrect mathematical proof—what does it mean for a mathematical proof to be good? And why, in the real world, would anyone ever be interested in a mathematical proof of this type, or obeying whatever goodness-rule you just set down? How could you use your notion of ‘proof’ to improve the real-world efficacy of an Artificial Intelligence?

...
...
...

Consider the following syllogism:

  1. All kittens are little;

  2. Everything little is innocent;

  3. Therefore all kittens are innocent.

Here are four mathematical universes, aka “models”, in which the objects collectively obey or disobey these three rules:

  Model A: little innocent kitten, little innocent kitten, big evil dog, little innocent dog

  Model B: little innocent kitten, big evil kitten, big evil dog, little innocent dog

  Model C: little innocent kitten, little evil kitten, big innocent dog, little innocent dog

  Model D: little innocent kitten, big innocent kitten, big evil dog, little evil dog

There are some models where not all kittens are little, like models B and D. And there are models where not everything little is innocent, like models C and D. But there are no models where all kittens are little, and everything little is innocent, and yet there exists a guilty kitten. Try as you will, you won’t be able to imagine a model like that. Any model containing a guilty kitten has at least one kitten that isn’t little, or at least one little entity that isn’t innocent—no way around it.
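For a logical world this tiny, the “try as you will” can even be checked by brute force. A minimal sketch in Python (the three-object universes and the boolean encoding are my own illustration, not part of the original argument):

```python
from itertools import product

# An object is a triple of booleans: (is_kitten, is_little, is_innocent).
objects = list(product([True, False], repeat=3))

# A "universe" here is any assignment of three object-slots (repetition allowed).
counterexamples = 0
for universe in product(objects, repeat=3):
    premise1 = all(little for (kitten, little, _) in universe if kitten)
    premise2 = all(innocent for (_, little, innocent) in universe if little)
    conclusion = all(innocent for (kitten, _, innocent) in universe if kitten)
    if premise1 and premise2 and not conclusion:
        counterexamples += 1

print(counterexamples)  # 0: no universe obeys (1) and (2) yet contains a guilty kitten
```

Larger universes change nothing: any kitten is little by premise (1), hence innocent by premise (2).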

Thus, the jump from 1 & 2, to 3, is truth-preserving: in any universe where premises (1) and (2) are true to start with, the conclusion (3) is true of the same universe at the end.

Which is what makes the following implication valid, or, as people would usually just say, “true”:

“If all kittens are little and everything little is innocent, then all kittens are innocent.”

The advanced mainstream view of logic and mathematics (i.e., the mainstream view among professional scholars of logic as such, not necessarily among all mathematicians in general) is that when we talk about math, we are talking about which conclusions follow from which premises. The “truth” of a mathematical theorem—or, to avoid overloading the word ‘true’, which we reserve for comparison-to-causal-reality, the validity of a mathematical theorem—has nothing to do with the physical truth or falsity of the conclusion in our world, and everything to do with the inevitability of the implication. From the standpoint of validity, it doesn’t matter a fig whether or not all kittens are innocent in our own universe, the connected causal fabric within which we are embedded. What matters is whether or not you can prove the implication, starting from the premises; whether or not, if all kittens were little and all little things were innocent, it would follow inevitably that all kittens were innocent.


To paraphrase Mayor Daley, logic is not there to create truth, logic is there to preserve truth. Let’s illustrate this point by assuming the following equation:

x = y = 1

...which is true in at least some cases. E.g. ‘x’ could be the number of thumbs on my right hand, and ‘y’ the number of thumbs on my left hand.

Now, starting from the above, we do a little algebra:

  1.  x = y = 1                      (starting premise)
  2.  x² = xy                        (multiply both sides by x)
  3.  x² − y² = xy − y²              (subtract y² from both sides)
  4.  (x + y)(x − y) = y(x − y)      (factor)
  5.  x + y = y                      (cancel)
  6.  2 = 1                          (substitute 1 for x and 1 for y)

We have reached the conclusion that in every case where x and y are equal to 1, 2 is equal to 1. This does not seem like it should follow inevitably.

You could try to find the flaw just by staring at the lines… maybe you’d suspect that the error was between line 3 and line 4, following the heuristic of first mistrusting what looks like the most complicated step… but another way of doing it would be to try evaluating each line to see what it said concretely, for example, multiplying out x² = xy in line 2 to get (1²) = (1 * 1) or 1 = 1. Let’s try doing this for each step, and then afterward mark whether each equation looks true or false:

  1.  x = y = 1                      →  1 = 1    true
  2.  x² = xy                        →  1 = 1    true
  3.  x² − y² = xy − y²              →  0 = 0    true
  4.  (x + y)(x − y) = y(x − y)      →  0 = 0    true
  5.  x + y = y                      →  2 = 1    false
  6.  2 = 1                          →  2 = 1    false
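The same line-by-line evaluation can be mechanized. A small sketch that plugs x = y = 1 into each step (with the cancellation in step 5 shown for the division it secretly is):

```python
x = y = 1

steps = [
    ("x = y",                        x == y),
    ("x^2 = xy",                     x**2 == x*y),
    ("x^2 - y^2 = xy - y^2",         x**2 - y**2 == x*y - y**2),
    ("(x+y)(x-y) = y(x-y)",          (x + y)*(x - y) == y*(x - y)),
    ("x + y = y",                    x + y == y),   # "cancel (x - y)" -- a division by zero
    ("2 = 1",                        2 == 1),
]

for claim, holds in steps:
    print(f"{claim:28s} {holds}")
# Steps 1-4 print True; steps 5 and 6 print False -- so the invalid
# move happened between step 4 and step 5.
```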

Logic is there to preserve truth, not to create truth. Whenever we take a logically valid step, we can’t guarantee that the premise is true to start with, but if the premise is true the conclusion should always be true. Since we went from a true equation to a false equation between step 4 and step 5, we must’ve done something that is in general invalid.

In particular, we divided both sides of the equation by (x − y).

Which is invalid, i.e. not universally truth-preserving, because (x − y) might be equal to 0.

And if you divide both sides by 0, you can get a false statement from a true statement. 3 * 0 = 2 * 0 is a true equation, but 3 = 2 is a false equation, so it is not allowable in general to cancel any factor if the factor might equal zero.

On the other hand, adding 1 to both sides of an equation is always truth-preserving. We can’t guarantee as a matter of logic that x = y to start with—for example, x might be my number of ears and y might be my number of fingers. But if x = y then x + 1 = y + 1, always. Logic is not there to create truth; logic is there to preserve truth. If a scale starts out balanced, then adding the same weight to both sides will result in a scale that is still balanced.

I will remark, in some horror and exasperation with the modern educational system, that I do not recall any math-book of my youth ever once explaining that the reason why you are always allowed to add 1 to both sides of an equation is that it is a kind of step which always produces true equations from true equations.

What is a valid proof in algebra? It’s a proof where, in each step, we do something that is universally allowed, something which can only produce true equations from true equations, and so the proof gradually transforms the starting equation into a final equation which must be true if the starting equation was true. Each step should also—this is part of what makes proofs useful in reasoning—be locally verifiable as allowed, by looking at only a small number of previous points, not the entire past history of the proof. If in some previous step I believed x² − y = 2, I only need to look at that single step to get the conclusion x² = 2 + y, because I am always allowed to add y to both sides of the equation; because I am always allowed to add any quantity to both sides of the equation; because if the two sides of an equation are in balance to start with, adding the same quantity to both sides of the balance will preserve that balance. I can know the inevitability of this implication without considering all the surrounding circumstances; it’s a step which is locally guaranteed to be valid. (Note the similarity—and the differences—to how we can compute a causal entity knowing only its immediate parents, and no other ancestors.)


You may have read—I’ve certainly read—some philosophy which endeavors to score points for counter-intuitive cynicism by saying that all mathematics is a mere game of tokens; that we start with a meaningless string of symbols like:

∀x: (K(x) → L(x)) ∧ (L(x) → I(x))

...and we follow some symbol-manipulation rules like “If you have the string ‘A ∧ (A → B)’ you are allowed to go to the string ‘B’”, and so finally end up with the string:

∀x: (K(x) → I(x))

...and this activity of string-manipulation is all there is to what mathematicians call “theorem-proving”—all there is to the glorious human endeavor of mathematics.

This, like a lot of other cynicism out there, is needlessly deflationary.

There’s a family of techniques in machine learning known as “Monte Carlo methods” or “Monte Carlo simulation”, one of which says, roughly, “To find the probability of a proposition Q given a set of premises P, simulate random models that obey P, and then count how often Q is true.” Stanislaw Ulam invented the idea after trying for a while to calculate the probability that a random Canfield solitaire layout would be solvable, and finally realizing that he could get better information by trying it a hundred times and counting the number of successful plays. This was during the era when computers were first becoming available, and the thought occurred to Ulam that the same technique might work on a current neutron diffusion problem as well.

Similarly, to answer a question like, “What is the probability that a random Canfield solitaire is solvable, given that the top card in the deck is a king?” you might imagine simulating many 52-card layouts, throwing away all the layouts where the top card in the deck was not a king, using a computer algorithm to solve the remaining layouts, and counting what percentage of those were solvable. (It would be more efficient, in this case, to start by directly placing a king on top and then randomly distributing the other 51 cards; but this is not always efficient in Monte Carlo simulations when the condition to be fulfilled is more complex.)
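A Canfield solver is beyond a short sketch, but the rejection-sampling pattern itself is simple. As an illustrative stand-in (my own toy example, not Ulam’s actual problem), here we estimate the probability that a shuffled deck’s second card is also a king, given that the top card is a king; the exact answer is 3/51 ≈ 0.0588:

```python
import random

RANKS = list(range(13)) * 4   # a 52-card deck of ranks; rank 12 stands for "king"
KING = 12

def estimate(trials=200_000):
    hits = kept = 0
    for _ in range(trials):
        deck = RANKS[:]
        random.shuffle(deck)
        if deck[0] != KING:   # rejection step: the layout doesn't meet the condition
            continue
        kept += 1
        hits += (deck[1] == KING)
    return hits / kept

print(estimate())   # ≈ 0.0588
```

The “throw the layout away” line is exactly the inefficiency mentioned above: roughly 12 of every 13 simulated decks are discarded.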

Okay, now for a harder problem. Suppose you’ve wandered through the world a bit, and you’ve observed the following:

(1) So far I’ve seen 20 objects which have been kittens, and on the 6 occasions I’ve paid a penny to observe the size of something that’s a kitten, all 6 kitten-objects have been little.

(2) So far I’ve seen 50 objects which have been little, and on the 30 occasions where I’ve paid a penny to observe the morality of something little, all 30 little objects have been innocent.

(3) This object happens to be a kitten. I want to know whether it’s innocent, but I don’t want to pay a cent to find out directly. (E.g., if it’s an innocent kitten, I can buy it for a cent, sell it for two cents, and make a one-cent profit. But if I pay a penny to observe directly whether the kitten is innocent, I won’t be able to make a profit, since gathering evidence is costly.)

Your previous experiences have led you to suspect the general rule “All kittens are little” and also the rule “All little things are innocent”, even though you’ve never before directly checked whether a kitten is innocent.

Furthermore...

You’ve never heard of logic, and you have no idea how to play that ‘game of symbols’ with K(x), I(x), and L(x) that we were talking about earlier.

But that’s all right. The problem is still solvable by Monte Carlo methods!

First we’ll generate a large set of random universes. Then, for each universe, we’ll check whether that universe obeys all the rules we currently suspect or believe to be true, like “All kittens are little” and “All little things are innocent” and “The force of gravity goes as the product of two objects’ masses and inversely as the square of the distance between them”. If a universe passes this test, we’ll check to see whether the inquiry of interest, “Is the kitten in front of me innocent?”, also happens to be true in that universe.

We shall repeat this test a large number of times, and at the end we shall have an approximate estimate of the probability that the kitten in front of you is innocent.

On this algorithm, you perform inference by visualizing many possible universes, throwing out universes that disagree with generalizations you already believe in, and then checking what’s true (probable) in the universes that remain. This algorithm doesn’t tell you the state of the real physical world with certainty. Rather, it gives you a measure of probability—i.e., the probability that the kitten is innocent—given everything else you already believe to be true.
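That inference-by-simulation can be sketched directly. In this toy version (the universe size and attribute encoding are assumptions for illustration), each random universe is a handful of objects with random kitten/little/innocent attributes; universes that violate the suspected generalizations are thrown out, and we count how often the survivors make the kitten in front of us innocent:

```python
import random

def random_universe(n_objects=4):
    # Each object: (is_kitten, is_little, is_innocent), each attribute a coin flip.
    return [tuple(random.random() < 0.5 for _ in range(3)) for _ in range(n_objects)]

def obeys_generalizations(universe):
    all_kittens_little = all(little for (kitten, little, _) in universe if kitten)
    all_little_innocent = all(innocent for (_, little, innocent) in universe if little)
    return all_kittens_little and all_little_innocent

def p_kitten_innocent(samples=100_000):
    hits = kept = 0
    for _ in range(samples):
        u = random_universe()
        if not obeys_generalizations(u):
            continue                  # disagrees with what we already believe
        kittens = [obj for obj in u if obj[0]]
        if not kittens:
            continue                  # no kitten in front of us in this universe
        kept += 1
        hits += kittens[0][2]         # is that kitten innocent?
    return hits / kept

print(p_kitten_innocent())   # 1.0 -- innocent in every surviving universe
```

That the ratio comes out at exactly 1.0 is the Monte Carlo shadow of the logical fact that the conclusion holds in every model of the premises.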

And if, instead of visualizing many imaginable universes, you checked all possible logical models—which would take something beyond magic, because that would include models containing uncountably large numbers of objects—and the inquiry-of-interest was true in every model matching your previous beliefs, you would have found that the conclusion followed inevitably if the generalizations you already believed were true.

This might take a whole lot of reasoning, but at least you wouldn’t have to pay a cent to observe the kitten’s innocence directly.

But it would also save you some computation if you could play that game of symbols we talked about earlier—a game which does not create truth, but preserves truth. In this game, the steps can be locally pronounced valid by a mere ‘syntactic’ check that doesn’t require us to visualize all possible models. Instead, if the mere syntax of the proof checks out, we know that the conclusion is always true in a model whenever the premises are true in that model.

And that’s a mathematical proof: a conclusion which is true in any model where the axioms are true, which we know because we went through a series of transformations-of-belief, each step being licensed by some rule which guarantees that such steps never generate a false statement from a true statement.

The way we would say it in standard mathematical logic is as follows:

A collection of axioms X semantically implies a theorem Y, if Y is true in all models where X are true. We write this as X ⊨ Y.

A collection of axioms X syntactically implies a theorem Y within a system S, if we can get to Y from X using transformation steps allowed within S. We write this as X ⊢ Y.

The point of the system S known as “classical logic” is that its syntactic transformations preserve semantic implication, so that any syntactically allowed proof is semantically valid:

If X ⊢ Y, then X ⊨ Y.
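A “mere syntactic check” really can be this mechanical. Here is a toy proof checker for a single rule, modus ponens, operating on plain strings; the representation and rule names are invented for illustration, not any standard system:

```python
def check_proof(premises, steps):
    """Each step is (statement, justification). A step is locally valid if it is
    a premise, or follows by modus ponens from two already-known statements."""
    known = list(premises)
    for statement, why in steps:
        if why == "premise":
            ok = statement in premises
        elif why == "modus ponens":
            # valid iff some known 'A' and known 'A -> statement' both exist
            ok = any(f"{a} -> {statement}" in known for a in known)
        else:
            ok = False                 # unrecognized rule: reject
        if not ok:
            return False
        known.append(statement)
    return True

premises = ["K -> L", "L -> I", "K"]
proof = [("L", "modus ponens"), ("I", "modus ponens")]
print(check_proof(premises, proof))   # True
```

Note that the checker never visualizes a model; each step is verified locally, by looking only at strings already on the list.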

If you make this idea be about proof steps in algebra doing things that always preserve the balance of a previously balanced scale, I see no reason why this idea couldn’t be presented in eighth grade or earlier.

I can attest by spot-checking for small N that even most mathematicians have not been exposed to this idea. It’s the standard concept in mathematical logic, but for some odd reason, the knowledge seems constrained to the study of “mathematical logic” as a separate field, which not all mathematicians are interested in (many just want to do Diophantine analysis or whatever).


So far as real life is concerned, mathematical logic only tells us the implications of what we already believe or suspect, but this is a computational problem of supreme difficulty and importance. After the first thousand times we observe that objects in Earth gravity accelerate downward at 9.8 m/s², we can suspect that this will be true on the next occasion—which is a matter of probabilistic induction, not valid logic. But then to go from that suspicion, plus the observation that a building is 45 meters tall, to a specific prediction of how long it takes a dropped object to hit the ground, is a matter of logic—what will happen if everything else we already believe is actually true. It requires computation to make this conclusion transparent. We are not ‘logically omniscient’—the technical term for the impossible dreamlike ability of knowing all the implications of everything you believe.
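As a concrete instance of that computation (a sketch assuming a drop from rest with no air resistance): from h = ½gt² it follows that t = √(2h/g):

```python
import math

g = 9.8    # suspected acceleration, m/s^2 (probabilistic induction from past drops)
h = 45.0   # observed building height, m

# IF everything we already believe is true, then h = (1/2) g t^2, so:
t = math.sqrt(2 * h / g)
print(round(t, 2))   # 3.03 seconds
```

The conclusion was implicit in the beliefs all along; the square root is just the computation that makes it transparent.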

The great virtue of logic in argument is not that you can prove things by logic that are absolutely certain. Since logical implications are valid in every possible world, “observing” them never tells us anything about which possible world we live in. Logic can’t tell you that you won’t suddenly float up into the atmosphere. (What if we’re in the Matrix, and the Matrix Lords decide to do that to you on a whim as soon as you finish reading this sentence?) Logic can only tell you that, if that does happen, you were wrong in your extremely strong suspicion about gravitation being always and everywhere true in our universe.

The great virtue of valid logic in argument, rather, is that logical argument exposes premises, so that anyone who disagrees with your conclusion has to (a) point out a premise they disagree with or (b) point out an invalid step in reasoning which is strongly liable to generate false statements from true statements.

For example: Nick Bostrom put forth the Simulation Argument, which is that you must disagree with either statement (1) or (2) or else agree with statement (3):

(1) Earth-originating intelligent life will, in the future, acquire vastly greater computing resources.

(2) Some of these computing resources will be used to run many simulations of ancient Earth, aka “ancestor simulations”.

(3) We are almost certainly living in a computer simulation.

...but unfortunately it appears that not only do most respondents decline to say why they disbelieve in (3), most are unable to understand the distinction between the Simulation Hypothesis that we are living in a computer simulation, versus Nick Bostrom’s actual support for the Simulation Argument that “You must either disagree with (1) or (2) or agree with (3)”. They just treat Nick Bostrom as having claimed that we’re all living in the Matrix. Really. Look at the media coverage.

I would seriously generalize that the mainstream media only understands the “and” connective, not the “or” or “implies” connective. I.e., it is impossible for the media to report on a discovery that one of two things must be true, or a discovery that if X is true then Y must be true (when it’s not known that X is true). Also, the media only understands the “not” prefix when applied to atomic facts; it should go without saying that “not (A and B)” cannot be reported on.

Robin Hanson sometimes complains that when he tries to argue that conclusion X follows from reasonable-sounding premises Y, his colleagues disagree with X without saying which premise Y they think is false, or which step of the reasoning seems like an invalid implication. Such behavior is not only annoying, but logically rude, because someone else went out of their way and put in extra effort to make it as easy as possible for you to explain why you disagreed, and you couldn’t be bothered to pick one item off a multiple-choice menu.

The inspiration of logic for argument is to lay out a modular debate, one which conveniently breaks into smaller pieces that can be examined with smaller conversations. At least when it comes to trying to have a real conversation with a respected partner—I wouldn’t necessarily advise a teenager to try it on their oblivious parents—that is the great inspiration we can take from the study of mathematical logic: an argument is a succession of statements, each allegedly following with high probability from previous statements or shared background knowledge. Rather than, say, snowing someone under with as much fury and as many demands for applause as you can fit into sixty seconds.

Michael Vassar is fond of claiming that most people don’t have the concept of an argument, and that it’s pointless to try to teach them anything else until you can convey an intuitive sense for what it means to argue. I think that’s what he’s talking about.


Meditation: It has been claimed that logic and mathematics is the study of which conclusions follow from which premises. But when we say that 2 + 2 = 4, are we really just assuming that? It seems like 2 + 2 = 4 was true well before anyone was around to assume it, that two apples equaled two apples before there was anyone to count them, and that we couldn’t make it 5 just by assuming differently.


Mainstream status.

Part of the sequence Highly Advanced Epistemology 101 for Beginners

Next post: “Logical Pinpointing”

Previous post: “Causal Reference”