[Question] What’s going on with “provability”?

Every so of­ten I hear seem­ingly math­e­mat­i­cal state­ments in­volv­ing the con­cept of be­ing prov­able. For ex­am­ple:

• I’ve seen Gödel’s In­com­plete­ness The­o­rem stated as “if a math­e­mat­i­cal sys­tem is pow­er­ful enough to ex­press ar­ith­metic, then ei­ther it con­tains a con­tra­dic­tion or there are true state­ments that it can­not prove.”

• On the AI al­ign­ment fo­rum, one of the pinned se­quences de­scribes Löb’s The­o­rem as “If Peano Arith­metic can prove that a proof of P would im­ply the truth of P, then it can also prove P is true”.

I find each of these state­ments baf­fling for a differ­ent rea­son:

• Gödel: What could it mean for a state­ment to be “true but not prov­able”? Is this just be­cause there are some state­ments such that nei­ther P nor not-P can be proven, yet one of them must be true? If so, I would (stub­bornly) con­test that per­haps P and not-P re­ally are both non-true.

• Löb: How can a sys­tem of ar­ith­metic prove any­thing? Much less prove things about proofs?

And I also have one more gen­eral con­fu­sion. What sys­tems of rea­son­ing could these kinds of the­o­rems be set in? For ex­am­ple, I’ve heard that there are proofs that PA is con­sis­tent. Let’s say one of those proofs is set in Proof Sys­tem X. Now how do we know that Proof Sys­tem X is con­sis­tent? Per­haps it can be proven con­sis­tent by us­ing Proof Sys­tem Y? Do we just end up mak­ing an in­finite chain of ap­peals up along a tower of proof sys­tems? Or do we even­tu­ally drive our­selves into the ground by reach­ing sys­tem that no­body could deny is valid? If so, why don’t we just stop and PA or ZFC?

Oh, speak­ing of ZFC. There seems to be a de­bate about whether we should ac­cept the Ax­iom of Choice. Isn’t it...ob­vi­ously true? I don’t re­ally un­der­stand this topic well enough to have any par­tic­u­lar ques­tion about the AC de­bate, but my con­fu­sion definitely ex­tends to that re­gion of con­cept space.

So here’s my ques­tion: Where can I learn about “prov­abil­ity” and/​or what clar­ify­ing in­sights could you share about it?

• First, the ques­tion of “prov­abil­ity” is whether some­thing can be proved us­ing a given col­lec­tion of ax­ioms and rules.

That is, we are not con­cerned about whether a smart math­e­mat­i­cian with great com­mu­ni­ca­tion skills could con­vince us about the ve­rac­ity of a state­ment. In­stead, we are con­cerned about whether we can con­struct a se­quence of state­ments cul­mi­nat­ing with the given state­ment, such that each step can be ver­ified by a rather sim­ple com­puter pro­gram, by match­ing it to the few pre­de­ter­mined pat­terns.

So it is perfectly okay to have a state­ment that is ob­vi­ously true, but still can­not be proved us­ing some set of ax­ioms and rules. Think about it as a weak­ness of the given set of ax­ioms and rules, rather than a prop­erty of the state­ment it­self. In other words, we never talk about “prov­abil­ity of X” (ex­cept as a short­cut), but always about “prov­abil­ity of X in sys­tem S”. Prov­abil­ity is a re­la­tion be­tween X and S.

There­fore, Gödel’s The­o­rem is not about state­ments that are un­prov­able per se (i.e. some more com­pli­cated ver­sion of “math will never be able to prove that 2 + 2 equals 4”), but rather it means “if you give me a non-con­tra­dic­tory sys­tem S of ax­ioms and rules, I can cre­ate a state­ment X such that its ve­rac­ity will be be­yond the reach of S”. Plus there is an al­gorithm for cre­at­ing a spe­cific X for given S. The gen­er­ated X will be tai­lored to the given S. (For differ­ent sys­tems S1, S2, S3 you would get differ­ent X1, X2, X3; and it’s always S1 hav­ing a prob­lem with X1, S2 with X2, and S3 with X3; but maybe X1 and X2 are easy to prove or dis­prove in S3.)

More speci­fi­cally, the state­ment X gen­er­ated by Gödel for sys­tem S is clev­erly de­signed to be equiv­a­lent to “X can­not be proved by S” with­out be­ing self-refer­en­tial. To dis­cuss the clever con­struc­tion is be­yond the scope of this com­ment.

.

Se­cond, once we start talk­ing about things like sets, we are now far out­side the realm of re­al­ity. It doesn’t make sense to ask whether “sets re­ally have a prop­erty P”, if the fact is that “in the first place, set’s don’t re­ally ex­ist”.

It would be like try­ing to use ex­per­i­men­tal physics to ver­ify whether fairies are heav­ier than 10 grams. You sim­ply can’t, be­cause there are no fairies to mea­sure. So if you have two com­pet­ing the­o­ries of physics-in­clud­ing-su­per­nat­u­ral, one say­ing that fairies are heav­ier than 10 grams, and the other say­ing they are not, well, you have a prob­lem that can­not be re­solved ex­per­i­men­tally. A set the­ory with the Ax­iom of Choice, and a set the­ory with some in­com­pat­i­ble ax­iom, are two such the­o­ries: they agree about ev­ery­thing that re­ally ex­ists, and they dis­agree about the prop­er­ties of fairies… ahem, in­finite sets. You can’t re­solve this con­flict by talk­ing about what re­ally ex­ists; and the ar­gu­ments about what doesn’t ex­ist are by their na­ture ar­bi­trary. (You can’t prove in­ter­nal in­con­sis­tency, be­cause both the­o­ries are in­ter­nally con­sis­tent. They just dis­agree with each other.)

So how can we study sets if they are not real? By defin­ing ax­ioms and rules, and ex­am­in­ing which state­ments can be proved us­ing given ax­ioms and rules. “ZF” is such a set of ax­ioms; there are state­ments you can prove us­ing it, there are state­ments you can dis­prove, and there are state­ments where the sys­tem pro­vides no an­swer.

The un­der­ly­ing rea­son is that if you imag­ine a Pla­tonic realm where all ab­strac­tions allegedly ex­ist, the prob­lem is that there are ac­tu­ally mul­ti­ple ab­strac­tions [“mod­els”] com­pat­i­ble with ZF, but differ­ent from each other in many im­por­tant ways. In some of them, Ax­iom of Choice is true. In oth­ers, it is false. This is what it means that Ax­iom of Choice can be nei­ther proved nor dis­proved in ZF. The prob­lem is that “sets” have never been un­am­bigu­ously defined in the first place!

How­ever, adopt­ing the Ax­iom of Choice (or any of its com­peti­tors) won’t ac­tu­ally solve the prob­lem. There will still re­main many ab­strac­tions com­pat­i­ble with ZFC (or what­ever), but differ­ent from each other. So you will sooner or later find other ex­cit­ing prop­er­ties of the fairies… ahem, in­finite sets, that can be nei­ther proved nor dis­proved by ZFC (or what­ever). The prob­lem, again, is that we still don’t have an un­am­bigu­ous defi­ni­tion of “sets”.

And… but I am way out of my depth here… maybe we can’t have an un­am­bigu­ous defi­ni­tion of “sets”, ever, be­cause of the Gödel The­o­rem. So we will have to keep adding new ax­ioms, but there is no ter­ri­tory to guide us in their choice, so differ­ent peo­ple will pre­fer differ­ent choices, mu­tu­ally con­tra­dic­tory.

At the end the set the­ory re­search will be frac­tured into hun­dreds of com­pet­ing defi­ni­tions, all un­der­speci­fied. The state­ments will have to be pref­aced by ever longer “ac­cord­ing to this col­lec­tion of ax­ioms” which will make things difficult and er­ror-prone. (Things you will learn un­der one col­lec­tion of ax­ioms may be false or even non­sen­si­cal un­der other col­lec­tions. So you will have to re-learn ev­ery­thing over and over again if you switch to a differ­ent sys­tem.) And the preferred col­lec­tion of ax­ioms, which will most likely provide you op­por­tu­nity in the peer-re­viewed jour­nals and re­search grants, will be de­cided “poli­ti­cally” (as the most pop­u­lar among the cur­rently es­tab­lished re­searchers); which ac­cord­ing to some peo­ple has already hap­pened with ZF(C).

• Proofs, Im­pli­ca­tions, and Models in­tro­duces some of these ideas more slowly. Other stuff from the Highly Ad­vanced Episte­mol­ogy 101 for Begin­ners is rele­vant too, and in­cludes more re­al­ism-fla­vored con­cerns about choos­ing be­tween sys­tems.

• Thanks for this great an­swer. I have a cou­ple of fol­low-up ques­tions that any­body should feel free to jump in and an­swer.

The un­der­ly­ing rea­son is that if you imag­ine a Pla­tonic realm where all ab­strac­tions allegedly ex­ist, the prob­lem is that there are ac­tu­ally mul­ti­ple ab­strac­tions [“mod­els”] com­pat­i­ble with ZF, but differ­ent from each other in many im­por­tant ways. In some of them, Ax­iom of Choice is true. In oth­ers, it is false. This is what it means that Ax­iom of Choice can be nei­ther proved nor dis­proved in ZF.

There are mod­els where the AC is speci­fi­cally false? I’ve been told that AC can be for­mu­lated as “the carte­sian product of any col­lec­tion of sets (even an in­finite col­lec­tion) is non-empty”, but I’m hav­ing trou­ble pic­tur­ing some­thing a col­lec­tion of things I call “sets” failing to satisfy this prop­erty. Are the mod­els referred to here ones “sets” are re­placed by to­tally un­re­lated math­e­mat­i­cal ob­jects that just hap­pen to satisfy the ZF ax­ioms?

• If I un­der­stand it cor­rectly, (you be­lieve) you have an in­tu­ition of “sets”, and you are go­ing to judge any ax­iom de­pend­ing on whether it is com­pat­i­ble with your in­tu­ition or not. ZFC is com­pat­i­ble with your in­tu­ition, nega­tion of AC is not.

And maybe your in­tu­ition is also un­der­speci­fied—that is, there can be mul­ti­ple mod­els that are differ­ent only in things too ab­stract for you to have an opinion on them—but still, AC has to be true in any of those mod­els that are ac­cept­able to you.

Fair enough. I be­lieve this is not differ­ent in prin­ci­ple from what pro­fes­sional math­e­mat­i­ci­ans do. Ex­cept they care­fully sep­a­rate what they be­lieve in­tu­itively to be true, from what they can prove from the ax­ioms.

The ques­tion is, whether your in­tu­ition is con­sis­tent. That is, do you be­lieve in AC only be­cause the pro­po­nents of AC got to you first, and gave you the best ar­gu­ments in their fa­vor? (Like the one about the Carte­sian product of in­finitely many nonempty sets.) What would have hap­pened if you heard about Banach-Tarski para­dox first? Would you be like “yeah, that’s cool for me; you take an or­ange, cut it and ro­tate the pieces, and you get two or­anges of the same size; my in­tu­ition is okay with that”? What is your in­tu­itive opinion on whether the set of real num­bers can be well-or­dered?

If your in­tu­ition is con­sis­tent with these things, good for you! Then yes, the other mod­els are the ones where “sets” refers to math­e­mat­i­cal ob­jects that are not sets ac­cord­ing to your in­tu­ition, and only hap­pen to satisfy the ZF ax­ioms. (It still might be use­ful to talk about such ob­jects, similarly how talk­ing about ge­ome­tries that do not satisfy Eu­clid’s fifth pos­tu­late got us some in­ter­est­ing re­sults.) But some other peo­ple’s in­tu­itions ob­ject against things that are equiv­a­lent to the ax­iom of choice, there­fore they find it eas­ier to re­ject it. And it’s not like you can prove them wrong.

• If I un­der­stand it cor­rectly, (you be­lieve) you have an in­tu­ition of “sets”, and you are go­ing to judge any ax­iom de­pend­ing on whether it is com­pat­i­ble with your in­tu­ition or not. ZFC is com­pat­i­ble with your in­tu­ition, nega­tion of AC is not.
And maybe your in­tu­ition is also un­der­speci­fied—that is, there can be mul­ti­ple mod­els that are differ­ent only in things too ab­stract for you to have an opinion on them—but still, AC has to be true in any of those mod­els that are ac­cept­able to you.

I think this is a good as­sess­ment of my cur­rent thoughts. It’s funny you ask about the Banach-Tarski para­dox, be­cause I think I ac­tu­ally did hear about that first, al­though the place I heard it (VSauce) didn’t men­tioned that it was a con­se­quence of the AC.

I think I’m okay with the BT para­dox, be­cause the so-called “pieces” end up be­ing in­sane col­lec­tions of in­di­vi­d­ual points. If it were doable with some­thing more akin to “solid chunks”, the way an ac­tual or­ange would be cut up, I might feel differ­ently.

The well-or­der­ing of the re­als is less in­tu­itive to me, and I think I re­mem­ber be­ing very sur­prised when I first read that it was true. This was in a con­text that linked it to the AC, and I think that was the first I heard of the con­tro­versy.

I think you’ve given me a good foun­da­tion to work on as I think this topic through even fur­ther. Thanks again!

• What kind of refer­ence you are us­ing for your refer­ence to sets if not the ax­ioms? That reads to me as if “Are they just to­tally un­re­lated ob­jects to red busses that just hap­pen to be a bus and red?”

Some times I have seen peo­ple ar­gue for ex­am­ple that the word “yel­low” is grounded by the set of all yel­low things. But usu­ally that kind of defi­ni­tion suffers from the list be­ing am­bi­gious/​in­suffi­cient. Like if a give a thing it ei­ther is or is not a mem­ber of that set. But list­ing all the mem­bers or oth­er­wise giv­ing some pro­ce­dure to give out all the mem­bers seems like is not the most nat­u­ral thing to do. Thus if you tried to take the carte­sian product of yel­low things and red things be­cause you can’t ex­em­plify a sam­ple just from the con­cept you can’t build up the product from mem­bers. The col­lec­tion of yel­low things prop­bably is not a set but it has many set-like prop­er­ties. By hav­ing a close in­ven­tory of sets prop­er­ties they can be dis­t­in­guished from con­fused or nearby no­tions.

Another pos­si­ble imag­i­na­tion prompt would be a per­son faced with co­or­di­nates. Is there a real num­ber that you can spe­sify that the hu­man can point out on the x axis? No, they are always go­ing to be off. In the same way if you pre­sent the axis and ask the hu­man to point out their “favourite num­ber” (that is sup­posed to keep sta­ble) they will point out a slightly differ­ent real num­ber each time they sup­pos­edly point that point out. Such a per­son can’t provide a choice func­tion. It might still make sense to treat the per­son as be­ing able to spec­ify in­ter­vals, or re­fer to all of the points or be­ing able to refer­ence cross­ing points and oth­ers that have ge­o­met­ri­cal spe­sifi­ca­tion. But in gen­eral a line is not guaran­teed to have any refer­an­cale points fal­ling within it.

• So it is perfectly okay to have a state­ment that is ob­vi­ously true, but still can­not be proved us­ing some set of ax­ioms and rules.

The un­der­ly­ing rea­son is that if you imag­ine a Pla­tonic realm where all ab­strac­tions allegedly ex­ist, the prob­lem is that there are ac­tu­ally mul­ti­ple ab­strac­tions [“mod­els”] com­pat­i­ble with ZF, but differ­ent from each other in many im­por­tant ways.

So, when you say Godel’s sen­tence is ob­vi­ously true, in which “ab­strac­tion” is it true?

• Ar­guably, the num­bers we care about. Set the­ory helpfully adds that sec­ond-or­der ar­ith­metic (ar­ith­metic us­ing the lan­guage of sets) has only a sin­gle model (up to what is called ‘iso­mor­phism’, mean­ing a pre­cise anal­ogy) and that Godel’s origi­nal sen­tence is ‘true’ within this ab­strac­tion.

• I know enough about this to not have these ques­tions, but not enough to ex­plain the an­swers to any­one else. So I’ll recom­mend a book by Torkel Franzén, who was definitely able to both un­der­stand and teach this, “Gödel’s The­o­rem: An In­com­plete Guide to Its Use and Abuse”. The book costs money, but as a pre­view, here’s a re­view of it.

Dou­glas Hofs­tadter has writ­ten a lot on the sub­ject for a pop­u­lar au­di­ence, but is bet­ter avoided un­til you un­der­stand the sub­ject your­self well enough to recog­nise the un­stated tech­ni­cal un­der­pin­nings of his ex­po­si­tion, and to see where he glosses over things a step too far. But when you are at that point, there is no need to read him.

• One source that can help you learn more about the con­cept of prov­abil­ity is Gödel, Escher, Bach: An Eter­nal Golden Braid by Dou­glas Hofs­tadter. It talks about Gödel’s The­o­rem and helps to get a deeper un­der­stand­ing of it. Plus, it’s just a very in­ter­est­ing book.

• In an­swer to the ques­tion of how can some­thing be true, but not prov­able I want to point to the Gold­bach con­jec­ture which says “ev­ery even num­ber > 2 is the sum of two primes”. If the Gold­bach con­jec­ture is false then there’s a coun­terex­am­ple which can be checked in finite time (eg. just try adding all pairs of primes less than that num­ber al­though there are faster ways). If there isn’t a coun­terex­am­ple then the Gold­bach con­jec­ture is true. To be prov­able how­ever, there would have to ex­ist a proof of the Gold­bach con­jec­ture. No such proof is known to ex­ist. Here “truth” is ex­actly what it in­tu­itively means. Either a coun­terex­am­ple ex­ists or it doesn’t. The “truth” of the Gold­bach con­jec­ture won’t de­pend on your choice of ax­ioms. All you need are the prim­i­tives nec­es­sary to define the nat­u­ral num­bers, pri­mal­ity, and ad­di­tion (in par­tic­u­lar, it won’t de­pend on AC).

• Note that you can prove “P or not P”.

• In what sense? In Clas­si­cal Logic it’s an ax­iom, and in the Calcu­lus of In­du­tive Con­struc­tions it’s un­prov­able.

(In­ter­est­ingly, you can prove in Coq that the nega­tion of “forall P, P or not P” is un­prov­able. So it’s safe to as­sume it: https://​​stack­overflow.com/​​ques­tions/​​32812834/​​how-to-prove-ex­cluded-mid­dle-is-ir­refutable-in-coq )

• Ike is re­spond­ing to this:

Gödel: What could it mean for a state­ment to be “true but not prov­able”? Is this just be­cause there are some state­ments such that nei­ther P nor not-P can be proven, yet one of them must be true? If so, I would (stub­bornly) con­test that per­haps P and not-P re­ally are both non-true.

“P and not-P re­ally are both non-true” is clas­si­cally false, and Gödel holds in clas­si­cal math­e­mat­ics, so Sunny’s re­sponse isn’t available in that case.

Sunny’s sense that “per­haps P and not-P re­ally are both non-true” might be a rea­son for him to en­dorse in­tu­ition­ism as “more cor­rect” than clas­si­cal math in some sense.

• All prov­able state­ments fol­low from the ax­ioms, in­clud­ing “P or not P” for any par­tic­u­lar P. It’s the same sense as any other state­ment can be prov­able.

• All prov­able state­ments fol­low from the axioms

Yes, in any for­mal sys­tem, all prov­able state­ments fol­low from the ax­ioms. How­ever, there are many for­mal sys­tems. Two of the most com­monly used ones are Clas­si­cal Logic and the Calcu­lus of In­duc­tive Con­struc­tions.

In Clas­si­cal Logic, “forall P, P or not P” is an ax­iom. So, it’s tech­ni­cally prov­able, but it would be mis­lead­ing to say that you can prove it with­out fur­ther com­ment.

In the Calcu­lus of In­duc­tive Con­struc­tions (which is an ex­ten­sion of In­tu­ition­is­tic Logic, if I un­der­stand cor­rectly), “forall P, P or not P” is not prov­able.

So if there’s a non-trival proof of “forall P, P or not P”, it isn’t in ei­ther of these for­mal sys­tems. If you do have one in mind, what for­mal sys­tem (logic) is it in, and what does the proof look like?

• There’s a non-triv­ial proof of “P or not P” for a spe­cific P.

• Is this just be­cause there are some state­ments such that nei­ther P nor not-P can be proven, yet one of them must be true? If so, I would (stub­bornly) con­test that per­haps P and not-P re­ally are both non-true.

Sure, truth is del­i­cate. To move for­ward in study­ing this topic, just go with the in­ter­pre­ta­tion that “nei­ther P nor not-P can be proven”.

Löb: How can a sys­tem of ar­ith­metic prove any­thing? Much less prove things about proofs?

Arith­me­ti­za­tion. If you have a bunch of ax­ioms (which are finite strings in a finite alpha­bet) and a bunch of rules of in­fer­ence (which are mechanis­tic rules for de­riv­ing new strings from old ones) then both can be en­coded as in­te­gers and ar­ith­metic. Then some­thing like “there ex­ists a se­quence of de­duc­tions lead­ing to such-and-such the­o­rem” can be en­coded as “there ex­ists an in­te­ger such that...” and then a bunch of ar­ith­metic.

For ex­am­ple, I’ve heard that there are proofs that PA is con­sis­tent. Let’s say one of those proofs is set in Proof Sys­tem X. Now how do we know that Proof Sys­tem X is con­sis­tent? Per­haps it can be proven con­sis­tent by us­ing Proof Sys­tem Y? Do we just end up mak­ing an in­finite chain of ap­peals up along a tower of proof sys­tems?

Yeah, these proofs just ap­peal to some­thing stronger and are pretty pointless.

Oh, speak­ing of ZFC. There seems to be a de­bate about whether we should ac­cept the Ax­iom of Choice. Isn’t it...ob­vi­ously true?

Well, it doesn’t fol­low from the other ax­ioms, and it has some coun­ter­in­tu­itive con­se­quences. That’s enough to make it de­bat­able :-)

• What could it mean for a state­ment to be “true but not prov­able”? Is this just be­cause there are some state­ments such that nei­ther P nor not-P can be proven, yet one of them must be true? If so, I would (stub­bornly) con­test that per­haps P and not-P re­ally are both non-true.

Can you give an ex­am­ple where both P and not-P are both non-true?

• Sure. For ex­am­ple, if P is the Con­tinuum Hy­poth­e­sis, eg “There ex­ists a set with a car­di­nal­ity greater than that of the in­te­gers but less than that of the real num­bers”.

In this case we know that ZF can­not prove this state­ment true and also can­not prove it false. In this case, I’d be tempted to say that it re­ally is not true and not false.

(I’m told that this state­ment would have to be ei­ther true or false in any “model of ZF”, but I don’t un­der­stand model the­ory well enough to have a good grasp of what that means. It might be the case that what I’m re­ally say­ing is that I’m un­will­ing to grant “truth sta­tus” to model A but not model B if both mod­els satisfy the ax­ioms of ZF. I’ll prob­a­bly have to to both more learn­ing and more think­ing be­fore I can clar­ify my thoughts on this any fur­ther, but any­one read­ing this should feel free to prod me with thought ex­per­i­ments, cut­ting ques­tions, etc.)

• Let me men­tion my fa­vorite in­tu­ition pump against the ax­iom of choice—the pris­on­ers with in­finite hats. For any finite num­ber of pris­on­ers, if they can’t com­mu­ni­cate they can’t even do bet­ter than chance, let alone sav­ing all but a tiny frac­tion. But as soon as there are in­finitely many, there’s some strange rit­ual they can do that lets them save all but an in­finitely small frac­tion. This is un­rea­son­able.

The is­sue is that once you have in­finite pris­on­ers you can con­struct these janky non-mea­surable sets that aren’t sub­ject to the laws of prob­a­bil­ity the­ory. There’s an ar­gu­ment to be made that these are a big­ger prob­lem than the ax­iom of choice—the ax­iom of choice is just what lets you take the ex­is­tence of these janky, non-con­struc­tive sets and de­clare that they give you a recipe for sav­ing pris­on­ers.