# Standard and Nonstandard Numbers

Fol­lowup to: Log­i­cal Pinpointing

“Oh! Hello. Back again?”

Yes, I’ve got an­other ques­tion. Ear­lier you said that you had to use sec­ond-or­der logic to define the num­bers. But I’m pretty sure I’ve heard about some­thing called ‘first-or­der Peano ar­ith­metic’ which is also sup­posed to define the nat­u­ral num­bers. Go­ing by the name, I doubt it has any ‘sec­ond-or­der’ ax­ioms. Hon­estly, I’m not sure I un­der­stand this sec­ond-or­der busi­ness at all.

“Well, let’s start by ex­am­in­ing the fol­low­ing model:”

“This model has three prop­er­ties that we would ex­pect to be true of the stan­dard num­bers - ‘Every num­ber has a suc­ces­sor’, ‘If two num­bers have the same suc­ces­sor they are the same num­ber’, and ‘0 is the only num­ber which is not the suc­ces­sor of any num­ber’. All three of these state­ments are true in this model, so in that sense it’s quite num­ber­like—”

And yet this model clearly is not the num­bers we are look­ing for, be­cause it’s got all these mys­te­ri­ous ex­tra num­bers like C and −2*. That C thing even loops around, which I cer­tainly wouldn’t ex­pect any num­ber to do. And then there’s that in­finite-in-both-di­rec­tions chain which isn’t cor­rected to any­thing else.

“Right, so, the differ­ence be­tween first-or­der logic and sec­ond-or­der logic is this: In first-or­der logic, we can get rid of the ABC—make a state­ment which rules out any model that has a loop of num­bers like that. But we can’t get rid of the in­finite chain un­der­neath it. In sec­ond-or­der logic we can get rid of the ex­tra chain.”

I would ask you to ex­plain why that was true, but at this point I don’t even know what sec­ond-or­der logic is.

“Bear with me. First, con­sider that the fol­low­ing for­mula de­tects 2-ness:”

x + 2 = x * 2

In other words, that’s a for­mula which is true when x is equal to 2, and false ev­ery­where else, so it sin­gles out 2?

“Ex­actly. And this is a for­mula which de­tects odd num­bers:”

∃y: x=(2*y)+1

Um… okay. That for­mula says, ‘There ex­ists a y, such that x equals 2 times y plus one.’ And that’s true when x is 1, be­cause 0 is a num­ber, and 1=(2*0)+1. And it’s true when x is 9, be­cause there ex­ists a num­ber 4 such that 9=(2*4)+1… right. The for­mula is true at all odd num­bers, and only odd num­bers.

“In­deed. Now sup­pose we had some way to de­tect the ex­is­tence of that ABC-loop in the model—a for­mula which was true at the ABC-loop and false ev­ery­where else. Then I could adapt the nega­tion of this state­ment to say ‘No ob­jects like this are al­lowed to ex­ist’, and add that as an ax­iom alongside ‘Every num­ber has a suc­ces­sor’ and so on. Then I’d have nar­rowed down the pos­si­ble set of mod­els to get rid of mod­els that have an ex­tra ABC-loop in them.”

Um… can I rule out the ABC-loop by say­ing ¬∃x:(x=A)?

“Er, only if you’ve told me what A is in the first place, and in a logic which has ruled out all mod­els with loops in them, you shouldn’t be able to point to a spe­cific ob­ject that doesn’t ex­ist—”

Right. Okay… so the idea is to rule out loops of suc­ces­sors… hm. In the num­bers 0, 1, 2, 3..., the num­ber 0 isn’t the suc­ces­sor of any num­ber. If I just took a group of num­bers start­ing at 1, like {1, 2, 3, …}, then 1 wouldn’t be the suc­ces­sor of any num­ber in­side that group. But in A, B, C, the num­ber A is the suc­ces­sor of C, which is the suc­ces­sor of B, which is the suc­ces­sor of A. So how about if I say: ‘There’s no group of num­bers G such that for any num­ber x in G, x is the suc­ces­sor of some other num­ber y in G.’

“Ah! Very clever. But it so hap­pens that you just used sec­ond-or­der logic, be­cause you talked about groups or col­lec­tions of en­tities, whereas first-or­der logic only talks about in­di­vi­d­ual en­tities. Like, sup­pose we had a logic talk­ing about kit­tens and whether they’re in­no­cent. Here’s a model of a uni­verse con­tain­ing ex­actly three dis­tinct kit­tens who are all in­no­cent:”

Er, what are those ‘prop­erty’ thin­gies?

“They’re all pos­si­ble col­lec­tions of kit­tens. They’re la­beled prop­er­ties be­cause ev­ery col­lec­tion of kit­tens cor­re­sponds to a prop­erty that some kit­tens have and some kit­tens don’t. For ex­am­ple, the col­lec­tion on the top right, which con­tains only the grey kit­ten, cor­re­sponds to a pred­i­cate which is true at the grey kit­ten and false ev­ery­where else, or to a prop­erty which the grey kit­ten has which no other kit­ten has. Ac­tu­ally, for now let’s just pre­tend that ‘prop­erty’ just says ‘col­lec­tion’.”

Okay. I un­der­stand the con­cept of a col­lec­tion of kit­tens.

“In first-or­der logic, we can talk about in­di­vi­d­ual kit­tens, and how they re­late to other in­di­vi­d­ual kit­tens, and whether or not any kit­ten bear­ing a cer­tain re­la­tion ex­ists or doesn’t ex­ist. For ex­am­ple, we can talk about how the grey kit­ten adores the brown kit­ten. In sec­ond-or­der logic, we can talk about col­lec­tions of kit­tens, and whether or not those col­lec­tions ex­ist. So in first-or­der logic, I can say, ‘There ex­ists a kit­ten which is in­no­cent’, or ‘For ev­ery in­di­vi­d­ual kit­ten, that kit­ten is in­no­cent’, or ‘For ev­ery in­di­vi­d­ual kit­ten, there ex­ists an­other in­di­vi­d­ual kit­ten which adores the first kit­ten.’ But it re­quires sec­ond-or­der logic to make state­ments about col­lec­tions of kit­tens, like, ‘There ex­ists no col­lec­tion of kit­tens such that ev­ery kit­ten in it is adored by some other kit­ten in­side the col­lec­tion.’”

I see. So when I tried to say that you couldn’t have any group of num­bers, such that ev­ery num­ber in the group was a suc­ces­sor of some other num­ber in the group...

″...you quan­tified over the ex­is­tence or nonex­is­tence of col­lec­tions of num­bers, which means you were us­ing sec­ond-or­der logic. How­ever, in this par­tic­u­lar case, it’s eas­ily pos­si­ble to rule out the ABC-loop of num­bers us­ing only first-or­der logic. Con­sider the for­mula:”

x=SSSx

x plus 3 is equal to it­self?

“Right. That’s a first-or­der for­mula, since it doesn’t talk about col­lec­tions. And that for­mula is false at 0, 1, 2, 3… but true at A, B, and C.”

What does the ‘+’ mean?

“Er, by ‘+’ I was try­ing to say, ‘this for­mula works out to True’ and similarly ‘¬’ was sup­posed to mean the for­mula works out to False. The gen­eral idea is that we now have a for­mula for de­tect­ing 3-loops, and dis­t­in­guish­ing them from stan­dard num­bers like 0, 1, 2 and so on.”

I see. So by adding the new ax­iom, ¬∃x:x=SSSx, we could rule out all the mod­els con­tain­ing A, B, and C or any other 3-loop of non­stan­dard num­bers.

“Right.”

But this seems like a rather ar­bi­trary sort of ax­iom to add to a fun­da­men­tal the­ory of ar­ith­metic. I mean, I’ve never seen any at­tempt to de­scribe the num­bers which says, ‘No num­ber is equal to it­self plus 3’ as a ba­sic premise. It seems like it should be a the­o­rem, not an ax­iom.

“That’s be­cause it’s brought in us­ing a more gen­eral rule. In par­tic­u­lar, first-or­der ar­ith­metic has an in­finite ax­iom schema—an in­finite but com­putable scheme of ax­ioms. Each ax­iom in the schema says, for a differ­ent first-or­der for­mula Φ(x) - pro­nounced ‘phi of x’ - that:”

1. If Φ is true at 0, i.e: Φ(0)

2. And if Φ is true of the suc­ces­sor of any num­ber where it’s true, i.e: ∀x: Φ(x)→Φ(Sx)

3. Then Φ is true of all num­bers: ∀n: Φ(n)

(Φ(0) ∧ (∀x: Φ(x) → Φ(Sx))) → (∀n: Φ(n))

“In other words, ev­ery for­mula which is true at 0, and which is true of the suc­ces­sor of any num­ber of which it is true, is true ev­ery­where. This is the in­duc­tion schema of first-or­der ar­ith­metic. As a spe­cial case we have the par­tic­u­lar in­duc­tive ax­iom:”

(0≠SSS0 ∧ (∀x: (x≠SSSx) → (Sx≠SSSSx)) → (∀n: n≠SSSn)

But that doesn’t say that for all n, n≠n+3. It gives some premises from which that con­clu­sion would fol­low, but we don’t know the premises.

“Ah, how­ever, we can prove those premises us­ing the other ax­ioms of ar­ith­metic, and hence prove the con­clu­sion. The for­mula (SSSx=x) is false at 0, be­cause 0 is not the suc­ces­sor of any num­ber, in­clud­ing SS0. Similarly, con­sider the for­mula SSSSx=Sx, which we can re­ar­range as S(SSSx)=S(x). If two num­bers have the same suc­ces­sor they are the same num­ber, so SSSx=x. If truth at Sx proves truth at x, then falsity at x proves falsity at Sx, modus po­nens to modus tol­lens. Thus the for­mula is false at zero, false of the suc­ces­sor of any num­ber where it’s false, and so must be false ev­ery­where un­der the in­duc­tion ax­iom schema of first-or­der ar­ith­metic. And so first-or­der ar­ith­metic can rule out mod­els like this:”

...er, I think I see? Be­cause if this model obeys all the other ax­ioms which which we already speci­fied, that didn’t filter it out ear­lier—ax­ioms like ‘zero is not the suc­ces­sor of any num­ber’ and ‘if two num­bers have the same suc­ces­sor they are the same num­ber’ - then we can prove that the for­mula x≠SSSx is true at 0, and prove that if the for­mula true at x it must be true at x+1. So once we then add the fur­ther ax­iom that if x≠SSSx is true at 0, and if x≠SSSx is true at Sy when it’s true at y, then x≠SSSx is true at all x...

“We already have the premises, so we get the con­clu­sion. ∀x: x≠SSSx, and thus we filter out all the 3-loops. Similar logic rules out N-loops for all N.”

So then did we get rid of all the non­stan­dard num­bers, and leave only the stan­dard model?

“No. Be­cause there was also that prob­lem with the in­finite chain … −2*, −1*, 0*, 1* and so on.”

Here’s one idea for get­ting rid of the model with an in­finite chain. All the non­stan­dard num­bers in the chain are “greater” than all the stan­dard num­bers, right? Like, if w is a non­stan­dard num­ber, then w > 3, w > 4, and so on?

“Well, we can prove by in­duc­tion that no num­ber is less than 0, and w isn’t equal to 0 or 1 or 2 or 3, so I’d have to agree with that.”

Okay. We should also be able to prove that if x > y then x + z > y + z. So if we take non­stan­dard w and ask about w + w, then w + w must be greater than w + 3, w + 4, and so on. So w + w can’t be part of the in­finite chain at all, and yet adding any two num­bers ought to yield a third num­ber.

“In­deed, that does prove that if there’s one in­finite chain, there must be two in­finite chains. In other words, that origi­nal, ex­act model in the pic­ture, can’t all by it­self be a model of first-or­der ar­ith­metic. But show­ing that the chain im­plies the ex­is­tence of yet other el­e­ments, isn’t the same as prov­ing that the chain doesn’t ex­ist. Similarly, since all num­bers are even or odd, we must be able to find v with v + v = w, or find v with v + v + 1 = w. Then v must be part of an­other non­stan­dard chain that comes be­fore the chain con­tain­ing w.”

But then that re­quires an in­finite num­ber of in­finite chains of non­stan­dard num­bers which are all greater than any stan­dard num­ber. Maybe we can ex­tend this logic to even­tu­ally reach a con­tra­dic­tion and rule out the ex­is­tence of an in­finite chain in the first place—like, we’d show that any com­plete col­lec­tion of non­stan­dard num­bers has to be larger than it­self -

“Good idea, but no. You end up with the con­clu­sion that if a sin­gle non­stan­dard num­ber ex­ists, it must be part of a chain that’s in­finite in both di­rec­tions, i.e., a chain that looks like an or­dered copy of the nega­tive and pos­i­tive in­te­gers. And that if an in­finite chain ex­ists, there must be in­finite chains cor­re­spond­ing to all ra­tio­nal num­bers. So some­thing that could ac­tu­ally be a non­stan­dard model of first-or­der ar­ith­metic, has to con­tain at least the stan­dard num­bers fol­lowed by a copy of the ra­tio­nal num­bers with each ra­tio­nal num­ber re­placed by a copy of the in­te­gers. But then that setup works just fine with both ad­di­tion and mul­ti­pli­ca­tion—we can’t prove that it has to be any larger than what we’ve already said.”

Okay, so how do we get rid of an in­finite num­ber of in­finite chains of non­stan­dard num­bers, and leave just the stan­dard num­bers at the be­gin­ning? What kind of state­ment would they vi­o­late—what sort of ax­iom would rule out all those ex­tra num­bers?

“We have to use sec­ond-or­der logic for that one.”

Hon­estly I’m still not 100% clear on the differ­ence.

“Okay… ear­lier you gave me a for­mula which de­tected odd num­bers.”

Right. ∃y: x=(2*y)+1, which was true at x=1, x=9 and so on, but not at x=0, x=4 and so on.

“When you think in terms of col­lec­tions of num­bers, well, there’s some col­lec­tions which can be defined by for­mu­las. For ex­am­ple, the col­lec­tion of odd num­bers {1, 3, 5, 7, 9, …} can be defined by the for­mula, with x free, ∃y: x=(2*y)+1. But you could also try to talk about just the col­lec­tion {1, 3, 5, 7, 9, …} as a col­lec­tion, a set of num­bers, whether or not there hap­pened to be any for­mula that defined it—”

Hold on, how can you talk about a set if you can’t define a for­mula that makes some­thing a mem­ber or a non-mem­ber? I mean, that seems a bit smelly from a ra­tio­nal­ist per­spec­tive -

“Er… re­mem­ber the ear­lier con­ver­sa­tion about kit­tens?”

“Sup­pose you say some­thing like, ‘There ex­ists a col­lec­tion of kit­tens, such that ev­ery kit­ten adores only other kit­tens in the col­lec­tion’. Give me a room full of kit­tens, and I can count through all pos­si­ble col­lec­tions, check your state­ment for each col­lec­tion, and see whether or not there’s a col­lec­tion which is ac­tu­ally like that. So the state­ment is mean­ingful—it can be falsified or ver­ified, and it con­strains the state of re­al­ity. But you didn’t give me a lo­cal for­mula for pick­ing up a sin­gle kit­ten and de­cid­ing whether or not it ought to be in this mys­te­ri­ous col­lec­tion. I had to iter­ate through all the col­lec­tions of kit­tens, find the col­lec­tions that matched your state­ment, and only then could I de­cide whether any in­di­vi­d­ual kit­ten had the prop­erty of be­ing in a col­lec­tion like that. But the state­ment was still falsifi­able, even though it was, in math­e­mat­i­cal par­lance, im­pred­ica­tive—that’s what we call it when you make a state­ment that can only be ver­ified by look­ing at many pos­si­ble col­lec­tions, and doesn’t start from any par­tic­u­lar col­lec­tion that you tell me how to con­struct.”

Ah… hm. What about in­finite uni­verses of kit­tens, so you can’t iter­ate through all pos­si­ble col­lec­tions in finite time?

“If you say, ‘There ex­ists a col­lec­tion of kit­tens which all adore each other’, I could ex­hibit a group of three kit­tens which adored each other, and so prove the state­ment true. If you say ‘There’s a col­lec­tion of four kit­tens who adore only each other’, I might come up with a con­struc­tive proof, given the other known prop­er­ties of kit­tens, that your state­ment was false; and any time you tried giv­ing me a group of four kit­tens, I could find a fifth kit­ten, adored by some kit­ten in your group, that falsified your at­tempt. But this is get­ting us into some rather deep parts of math we should prob­a­bly stay out of for now. The point is that even in in­finite uni­verses, there are sec­ond-or­der state­ments that you can prove or falsify in finite amounts of time. And once you ad­mit those par­tic­u­lar sec­ond-or­der state­ments are talk­ing about some­thing mean­ingful, well, you might as well just ad­mit that sec­ond-or­der state­ments in gen­eral are mean­ingful.”

...that sounds a lit­tle iffy to me, like we might get in trou­ble later on.

“You’re not the only math­e­mat­i­cian who wor­ries about that.”

But let’s get back to num­bers. You say that we can use sec­ond-or­der logic to rule out any in­finite chain.

“In­deed. In sec­ond-or­der logic, in­stead of us­ing an in­finite ax­iom schema over all for­mu­las Φ, we quan­tify over pos­si­ble col­lec­tions di­rectly, and say, in a sin­gle state­ment:”

∀P: P(0) ∧ (∀x: P(x) → P(Sx)) → (∀n: P(n))

“Here P is any pred­i­cate true or false of in­di­vi­d­ual num­bers. Any col­lec­tion of num­bers cor­re­sponds to a pred­i­cate that is true of num­bers in­side the col­lec­tion and false of num­bers out­side of it.”

Okay… and how did that rule out in­finite chains again?

“Be­cause in prin­ci­ple, whether or not there’s any first-or­der for­mula that picks them out, there’s the­o­ret­i­cally a col­lec­tion that con­tains the stan­dard num­bers {0, 1, 2, …} and only the stan­dard num­bers. And if you treat that col­lec­tion as a pred­i­cate P, then P is true at 0 - that is, 0 is in the stan­dard num­bers. And if 200 is a stan­dard num­ber then so is 201, and so on; if P is true at x, it’s true at x+1. On the other hand, if you treat the col­lec­tion ‘just the stan­dard num­bers’ as a pred­i­cate, it’s false at −2*, false at −1*, false at 0* and so on—those num­bers aren’t in this the­o­ret­i­cal col­lec­tion. So it’s vac­u­ously true that this pred­i­cate is true at 1* if it’s true at 0*, be­cause it’s not true at 0*. And so we end up with:”

“And so the sin­gle sec­ond-or­der ax­iom...”

∀P: P0 ∧ (∀x: Px → P(Sx)) → (∀n: Pn)

″...rules out any dis­con­nected chains, finite loops, and in­deed ev­ery non­stan­dard num­ber, in one swell foop.”

But what did that ax­iom mean, ex­actly? I mean, taboo the phrase ‘stan­dard num­bers’ for a mo­ment, pre­tend I’ve got no idea what those are, just ex­plain to me what the ax­iom ac­tu­ally says.

“It says that the model be­ing dis­cussed—the model which fits this ax­iom—makes it im­pos­si­ble to form any col­lec­tion closed un­der suc­ces­sion which in­cludes 0 and doesn’t in­clude ev­ery­thing. It’s im­pos­si­ble to have any col­lec­tion of ob­jects in this uni­verse such that 0 is in the col­lec­tion, and the suc­ces­sor of ev­ery­thing in the col­lec­tion is in the col­lec­tion, and yet this col­lec­tion doesn’t con­tain ev­ery­thing. So you can’t have a dis­con­nected in­finite chain—there would then ex­ist at least one col­lec­tion over ob­jects in this uni­verse that con­tained 0 and all its suc­ces­sor-de­scen­dants, yet didn’t con­tain the chain; and we have a shiny new ax­iom which says that can’t hap­pen.”

Can you per­haps op­er­a­tional­ize that in a more sen­so­ry­mo­tory sort of way? Like, if this is what I be­lieve about the uni­verse, then what do I ex­pect to see?

“If this is what you be­lieve about the math­e­mat­i­cal model that you live in… then you be­lieve that nei­ther you, nor any ad­ver­sary, nor yet a su­per­in­tel­li­gence, nor yet God, can con­sis­tently say ‘Yea’ or ‘Nay’ to ob­jects in such fash­ion that when you pre­sent them with 0, they say ‘Yea’, and when you pre­sent them with any other ob­ject, if they say ‘Yea’, they also say ‘Yea’ for the suc­ces­sor of that ob­ject; and yet there is some ob­ject for which they say ‘Nay’. You be­lieve this can never hap­pen, no mat­ter what. The way in which the ob­jects in the uni­verse are ar­ranged by suc­ces­sion, just doesn’t let that hap­pen, ever.”

Ah. So if, say, they said ‘Nay’ for 42, I’d go back and ask about 41, and then 40, and by the time I reached 0, I’d find ei­ther that they said ‘Nay’ about 0, or that they said ‘Nay’ for 41 and yet ‘Yea’ for 40. And what do I ex­pect to see if I be­lieve in first-or­der ar­ith­metic, with the in­finite ax­iom schema?

“In that case, you be­lieve there’s no neatly speci­fi­able, com­pactly de­scrib­able rule which be­haves like that. But if you be­lieve the sec­ond-or­der ver­sion, you be­lieve no­body can pos­si­bly be­have like that even if they’re an­swer­ing ran­domly, or branch­ing the uni­verse to an­swer differ­ent ways in differ­ent al­ter­nate uni­verses, and so on. And note, by the way, that if we have a finite uni­verse—i.e., we throw out the rule that ev­ery num­ber has a suc­ces­sor, and say in­stead that 256 is the only num­ber which has no suc­ces­sor—then we can ver­ify this ax­iom in finite time.”

I see. Still, is there any way to rule out in­finite chains us­ing first-or­der logic? I might find that eas­ier to deal with, even if it looks more com­pli­cated at first.

“I’m afraid not. One way I like to look at it is that first-or­der logic can talk about con­straints on how the model looks from any lo­cal point, while only sec­ond-or­der logic can talk about global qual­ities of chains, col­lec­tions, and the model as a whole. Whether ev­ery num­ber has a suc­ces­sor is a lo­cal prop­erty—a ques­tion of how the model looks from the van­tage point of any one num­ber. Whether a num­ber plus three, can be equal to it­self, is a ques­tion you could eval­u­ate at the lo­cal van­tage point of any one num­ber. Whether a num­ber is even, is a ques­tion you can an­swer by look­ing around for a sin­gle, in­di­vi­d­ual num­ber x with the prop­erty that x+x equals the first num­ber. But when you try to say that there’s only one con­nected chain start­ing at 0, by in­vok­ing the idea of con­nect­ed­ness and chains you’re try­ing to de­scribe non-lo­cal prop­er­ties that re­quire a logic-of-pos­si­ble-col­lec­tions to spec­ify.”

Huh. But if all the ‘lo­cal’ prop­er­ties are the same re­gard­less, why worry about global prop­er­ties? In first-or­der ar­ith­metic, any ‘lo­cal’ for­mula that’s true at zero and all of its ‘nat­u­ral’ suc­ces­sors would also have to be true of all the dis­con­nected in­finite chains… right? Or did I make an er­ror there? All the other in­finite chains be­sides the 0-chain—all ‘non­stan­dard num­bers’ - would have just the same prop­er­ties as the ‘nat­u­ral’ num­bers, right?

“I’m afraid not. The first-or­der ax­ioms of ar­ith­metic may fail to pin down whether or not a Tur­ing ma­chine halts—whether there ex­ists a time at which a Tur­ing ma­chine halts. Let’s say that from our per­spec­tive in­side the stan­dard num­bers, the Tur­ing ma­chine ‘re­ally doesn’t’ halt—it doesn’t halt on clock tick 0, doesn’t halt on clock tick 1, doesn’t halt on tick 2, and so on through all the stan­dard suc­ces­sors of the 0-chain. In non­stan­dard mod­els of the in­te­gers—mod­els with other in­finite chains—there might be some­where in­side a non­stan­dard chain where the Tur­ing ma­chine goes from run­ning to halted and stays halted there­after.”

“In this new model—which is fully com­pat­i­ble with the first-or­der ax­ioms, and can’t be ruled out by them—it’s not true that ‘for ev­ery num­ber t at which the Tur­ing ma­chine is run­ning, it will still be run­ning at t+1’. Even though if we could some­how re­strict our at­ten­tion to the ‘nat­u­ral’ num­bers, we would see that the Tur­ing ma­chine was run­ning at 0, 1, 2, and ev­ery time in the suc­ces­sor-chain of 0.”

Okay… I’m not quite sure what the prac­ti­cal im­pli­ca­tion of that is?

“It means that many Tur­ing ma­chines which in fact never halt at any stan­dard time, can’t be proven not to halt us­ing first-or­der rea­son­ing, be­cause their non-halt­ing-ness does not ac­tu­ally fol­low log­i­cally from the first-or­der ax­ioms. Logic is about which con­clu­sions fol­low from which premises, re­mem­ber? If there are mod­els which are com­pat­i­ble with all the first-or­der premises, but still falsify the state­ment ‘X runs for­ever’, then the state­ment ‘X runs for­ever’ can’t log­i­cally fol­low from those premises. This means you won’t be able to prove—shouldn’t be able to prove—that this Tur­ing ma­chine halts, us­ing only first-or­der logic.”

How ex­actly would this fail in prac­tice? I mean, where does the proof go bad?

“You wouldn’t get the sec­ond step of the in­duc­tion, ‘for ev­ery num­ber t at which the Tur­ing ma­chine is run­ning, it will still be run­ning at t+1’. There’d be non­stan­dard mod­els with some non­stan­dard t that falsifies the premise—a non­stan­dard time where the Tur­ing ma­chine goes from run­ning to halted. Even though if we could some­how re­strict our at­ten­tion to only the stan­dard num­bers, we would see that the Tur­ing ma­chine was run­ning at 0, 1, 2, and so on.”

But if a Tur­ing ma­chine re­ally ac­tu­ally halts, there’s got to be some par­tic­u­lar time when it halts, like on step 97 -

“In­deed. But 97 ex­ists in all non­stan­dard mod­els of ar­ith­metic, so we can prove its ex­is­tence in first-or­der logic. Any time 0 is a num­ber, ev­ery num­ber has a suc­ces­sor, num­bers don’t loop, and so on, there’ll ex­ist 97. Every non­stan­dard model has at least the stan­dard num­bers. So when­ever a Tur­ing ma­chine does halt, you can prove in first-or­der ar­ith­metic that it halts—it does in­deed fol­low from the premises. That’s kinda what you’d ex­pect, given that you can just watch the Tur­ing ma­chine for 97 steps. When some­thing ac­tu­ally does halt, you should be able to prove it halts with­out wor­ry­ing about un­bounded fu­ture times! It’s when some­thing doesn’t ac­tu­ally halt—in the stan­dard num­bers, that is—that the ex­is­tence of ‘non­stan­dard halt­ing times’ be­comes a prob­lem. Then, the con­clu­sion that the Tur­ing ma­chine runs for­ever may not ac­tu­ally fol­low from first-or­der ar­ith­metic, be­cause you can obey all the premises of first-or­der ar­ith­metic, and yet still be in­side a non­stan­dard model where this Tur­ing ma­chine halts at a non­stan­dard time.”

So sec­ond-or­der ar­ith­metic is more pow­er­ful than first-or­der ar­ith­metic in terms of what fol­lows from the premises?

“That fol­lows in­evitably from the abil­ity to talk about fewer pos­si­ble mod­els. As it is writ­ten, ‘What is true of one ap­ple may not be true of an­other ap­ple; thus more can be said about a sin­gle ap­ple than about all the ap­ples in the world.’ If you can re­strict your dis­course to a nar­rower col­lec­tion of mod­els, there are more facts that fol­low in­evitably, be­cause the more mod­els you might be talk­ing about, the fewer facts can pos­si­bly be true about all of them. And it’s also definitely true that sec­ond-or­der ar­ith­metic proves more the­o­rems than first-or­der ar­ith­metic—for ex­am­ple, it can prove that a Tur­ing ma­chine which com­putes Good­stein se­quences always reaches 0 and halts, or that Her­cules always wins the hy­dra game. But there’s a bit of con­tro­versy we’ll get into later about whether sec­ond-or­der logic is ac­tu­ally more pow­er­ful than first-or­der logic in gen­eral.”

Well, sure. After all, just be­cause no­body has ever yet in­vented a first-or­der for­mula to filter out all the non­stan­dard num­bers, doesn’t mean it can never, ever be done. To­mor­row some brilli­ant math­e­mat­i­cian might figure out a way to take an in­di­vi­d­ual num­ber x, and do lo­cal things to it us­ing ad­di­tion and mul­ti­pli­ca­tion and the ex­is­tence or nonex­is­tence of other in­di­vi­d­ual num­bers, which can tell us whether that num­ber is part of the 0-chain or some other in­finite-in-both-di­rec­tions chain. It’ll be as easy as (a=b*c) -

“Nope. Ain’t never gonna hap­pen.”

But maybe you could find some en­tirely differ­ent cre­ative way of first-or­der ax­io­m­a­tiz­ing the num­bers which has only the stan­dard model -

“Nope.”

Er… how do you know that, ex­actly? I mean, part of the Player Char­ac­ter Code is that you don’t give up when some­thing seems im­pos­si­ble. I can’t quite see yet how to de­tect in­finite chains us­ing a first-or­der for­mula. But then ear­lier I didn’t re­al­ize you could rule out finite loops, which turned out to be quite sim­ple once you ex­plained. After all, there’s two dis­tinct uses of the word ‘im­pos­si­ble’, one which in­di­cates pos­i­tive knowl­edge that some­thing can never be done, that no pos­si­ble chain of ac­tions can ever reach a goal, even if you’re a su­per­in­tel­li­gence. This kind of knowl­edge re­quires a strong, definite grasp on the sub­ject mat­ter, so that you can rule out ev­ery pos­si­ble av­enue of suc­cess. And then there’s an­other, much more com­mon use of the word ‘im­pos­si­ble’, which means that you thought about it for five sec­onds but didn’t see any way to do it, usu­ally used in the pres­ence of weak grasps on a sub­ject, sub­jects that seem sa­credly mys­te­ri­ous -

“Right. Rul­ing out an in­finite-in-both-di­rec­tions chain, us­ing a first-or­der for­mula, is the first kind of im­pos­si­bil­ity. We know that it can never be done.”

I see. Well then, what do you think you know, and how do you think you know it? How is this definite, pos­i­tive knowl­edge of im­pos­si­bil­ity ob­tained, us­ing your strong grasp on the non-mys­te­ri­ous sub­ject mat­ter?

“We’ll take that up next time.”

Part of the se­quence Highly Ad­vanced Episte­mol­ogy 101 for Beginners

Pre­vi­ous post: “By Which It May Be Judged

• Now how do we get rid of the chain?

“We have to use sec­ond-or­der logic for that one.”

No we don’t.

The model with the nat­u­ral num­bers and a sin­gle “in­te­ger line” is not a first or­der model of ar­ith­metic. The rea­son is this. For a non-stan­dard num­ber “a” large enough there is a (non-stan­dard) nat­u­ral num­ber that’s ap­prox­i­mately some ra­tio­nal frac­tion of “a.” This num­ber then has suc­ces­sors and pre­de­ces­sors, so it has an “in­te­ger line” around it. But be­cause we can play this game for any frac­tion, we need lots of in­te­ger lines (or­dered ac­cord­ing to the to­tal or­der­ing on the ra­tio­nals).

See this for de­tails:

http://​​web.mit.edu/​​24.242/​​www/​​Non­stan­dard­Models.pdf

• Yes we do.

The prob­lem of a chain isn’t in­tended to be limited to the prob­lem of ex­actly one chain, and I didn’t want to com­pli­cate the di­a­gram or con­fuse my read­ers by show­ing them a copy of the ra­tio­nals with each ra­tio­nal re­placed by a copy of the in­te­gers. If you can’t get rid of a larger struc­ture that has a chain in it, you can’t get rid of the chain. To put it an­other way, show­ing that the chain de­picted im­plies fur­ther ex­tra el­e­ments isn’t the same as rul­ing out the ex­is­tence of that chain.

Hence the word­ing, “How do we get rid of the chain?” not “How do we get rid of this par­tic­u­lar ex­act model here?”

A very quick way to see that there must be more than one chain is to note that if x > y, then x + z > y + z. An el­e­ment of the non­stan­dard chain is greater than any nat­u­ral num­ber, so if we add two non­stan­dard num­bers to­gether, the re­sult must be greater than the non­stan­dard start­ing point plus any nat­u­ral num­ber. There­fore there must be an­other chain which comes af­ter the first one. For more on this see the linked pa­per.

EDIT: Sev­eral oth­ers re­ported mis­in­ter­pret­ing what I had in the origi­nal, so I’ve ed­ited the post ac­cord­ingly. Thanks for rais­ing the is­sue, Ilya!

• It’s prob­a­bly worth ex­plic­itly men­tion­ing that the struc­ture that you de­scribed isn’t ac­tu­ally a model of PA. I’d imag­ine that could oth­er­wise be con­fus­ing for read­ers who have never seen this stuff be­fore and are clever enough to no­tice the is­sue.

• Edited.

• Thanks, it makes much more sense now.

• Thanks for edit­ing!

• After I had read your post but be­fore I had read IlyaSphitser’s com­ment I thought that the par­tic­u­lar model with a sin­gle in­te­ger chain was in fact a model of first-or­der ar­ith­metic, so the post was definitely mis­lead­ing to me in that re­spect.

• An el­e­ment of the non­stan­dard chain is greater than any nat­u­ral num­ber...

Can some­one ex­plain this? I don’t un­der­stand how > can be a valid op­er­a­tion on two dis­con­nected chains of num­ber-thin­gies.

• < is defined in terms of plus by say­ing x<y iff there ex­ists a such that y=z+x. + is sup­posed to be pro­vided as a prim­i­tive op­er­a­tion as part of the data con­sist­ing of a model of PA. It’s not ac­tu­ally pos­si­ble to give a con­crete de­scrip­tion of what + looks like in gen­eral for non-stan­dard mod­els be­cause of Te­nen­baums’s The­o­rem, but at least when one of x or y (say x) is a stan­dard num­ber it’s ex­actly what you’d ex­pect: x+y is what you get by start­ing at y and go­ing x steps to the right.

To see that x<y when­ever x is a stan­dard num­ber and y isn’t, you need to be a lit­tle tricky. You ac­tu­ally prove an in­finite fam­ily of state­ments. The first one is “for all x, ei­ther x=0 or else x>0”. The sec­ond is “for all x, ei­ther x=0 or x=1 or x>1″, and in gen­eral it’s “for all x, ei­ther x=0,1,..., or n, or else x>n”. Each of these can be proven by in­duc­tion, and the en­tire in­finite fam­ily to­gether im­plies that a non-stan­dard num­ber is big­ger than ev­ery stan­dard num­ber.

• Thanks!

I sup­pose you can’t prove a state­ment like “No mat­ter how many times you ex­pand this in­finite fam­ily of ax­ioms, you’ll never en­counter a non-stan­dard num­ber” in first-or­der logic? Should I not think of num­bers and non-stan­dard num­bers as hav­ing differ­ent types? Or should I think of > as ac­cept­ing differ­ently typed things? (where I’m us­ing the defi­ni­tion of “type” from com­puter sci­ence, e.g. “strongly-typed lan­guage”)

• Sorry I didn’t an­swer this be­fore; I didn’t see it. To the ex­tent that the anal­ogy ap­plies, you should think of non-stan­dard num­bers and stan­dard num­bers as hav­ing the same type. Speci­fi­cally, the type of things that are be­ing quan­tified over in what­ever first or­der logic you are us­ing. And you’re right that you can’t prove that state­ment in first or­der logic; Worse, you can’t even say it in first or­der logic (see the next post, on Godel’s the­o­rems and Com­pact­ness/​Lowen­heim Skolem for why).

• Thanks. Hm. I think I see why that can’t be said in first or­der logic.

...my brain is shout­ing “if I start at 0 and count up I’ll never reach a non­stan­dard num­ber, there­fore they don’t ex­ist” at me so loudly that it’s very difficult to re­strict my thoughts to only first-or­der ones.

• This is largely a mat­ter of keep­ing track of the dis­tinc­tion be­tween “first or­der logic: the math­e­mat­i­cal con­struct” and “first or­der logic: the form of rea­son­ing I some­times use when think­ing about math”. The former is an ideal­ized model of the lat­ter, but they are dis­tinct and be­long in dis­tinct men­tal buck­ets.

It may help to write a proof checker for first or­der logic. Or al­ter­na­tively, if you are able to read higher math, study some math­e­mat­i­cal logic/​model the­ory.

• In math­e­mat­ics, a [bi­nary] re­la­tion (like >, since it con­sid­ers two nat­u­ral num­bers and then is ei­ther true or false, based on which num­bers are con­sid­ered) is just a set of or­dered pairs. Within the stan­dard model of the nat­u­ral num­bers, > is just the [in­finite] col­lec­tion of or­dered pairs { (2,1) , (3,1) , (3,2) , (4,1) , (4,2) , (4,3) , … }. So, sup­pose we have two chains of num­ber-thin­gies...1, 2, 3,… and 1^, 2^, 3^, …. We can make the ‘>’ rule as fol­lows: ” ‘x > y’ if and only if ‘x has a caret and y does not, or (in this case, both num­bers must be in the same chain) x is greater than y within its own chain’ ”. This [in­finite] col­lec­tion of or­dered pairs would be { (2,1) , (2^,1^) , (1^,1) , (3^,1^) , (3,2) , (3^,2^) , (3,1) , (4^,1^) , (1^,2) , (4^,2^) , (4,3) , (4^,3^) , … }.

So ‘>’ is a valid re­la­tion on two dis­con­nected chains of num­ber-thin­gies, be­cause we define it to be so by fiat. The num­bers we’re work­ing with are non­stan­dard...so there is no rea­son to ex­pect that there should be some stan­dard, nat­u­ral mean­ing for ‘>’.

Im­por­tant Note: This ex­pla­na­tion of ‘>’ does not cor­re­spond to a non­stan­dard model of first-or­der Peano ar­ith­metic (and, clearly, not the stan­dard model, ei­ther). If you want to know more about that, look to earth­worm­chuck163′s com­ment. I thought it might be helpful to you to un­der­stand it in a case that’s eas­ier to pic­ture, be­fore jump­ing to the case of a non­stan­dard model of first-or­der Peano ar­ith­metic. That case is even more com­plex than Eliezer re­vealed within his post. It would prob­a­bly be ex­tremely helpful to you to learn about well-or­ders, or­der types, and the or­di­nal num­bers to get a han­dle on this stuff. You are more tal­ented than I if you are able to un­der­stand it with­out that back­ground knowl­edge.

Hope this helps.

Edit: An­noy­ingly (in this case), the as­ter­isk causes ital­i­ciza­tion. Changed as­ter­isks to carets.

Edit 2: Changed “op­er­a­tion” to “re­la­tion” ev­ery­where, as per pa­per-ma­chine’s cor­rect com­ment.

• In math­e­mat­ics, a [bi­nary] op­er­a­tion (like >, since it con­sid­ers two nat­u­ral num­bers and then is ei­ther true or false, based on which num­bers are con­sid­ered) is just a set of or­dered pairs.

Not to nit­pick, but “>” is a bi­nary re­la­tion, not a bi­nary op­er­a­tion.

• Ha, thanks. I don’t mind nit­pick­ing. I’ll edit the com­ment.

• Ac­tu­ally, a bi­nary re­la­tion is a bi­nary op­er­a­tion (it re­turns 1 if true and 0 if false). You passed up a chance to counter-nit­pick the nit­picker.

• Yes, if you want a two-sorted the­ory, then you can make a boolean type and lift all re­la­tions to op­er­a­tions.

That’s not the typ­i­cal use of the word “op­er­a­tion” in model the­ory, how­ever.

• Wikipe­dia con­curs:

Any countable non­stan­dard model of ar­ith­metic has or­der type ω + (ω + ω) · η, where ω is the or­der type of the stan­dard nat­u­ral num­bers, ω is the dual or­der (an in­finite de­creas­ing se­quence) and η is the or­der type of the ra­tio­nal num­bers. In other words, a countable non­stan­dard model be­gins with an in­finite in­creas­ing se­quence (the stan­dard el­e­ments of the model). This is fol­lowed by a col­lec­tion of “blocks,” each of or­der type ω* + ω, the or­der type of the in­te­gers. Th­ese blocks are in turn densely or­dered with the or­der type of the ra­tio­nals. The re­sult fol­lows fairly eas­ily be­cause it is easy to see that the non stan­dard num­bers have to be dense and lin­early or­dered with­out end­points, and the ra­tio­nals are the only countable dense lin­ear or­der with­out end­points.

Edit: Eliezer seems to have been aware of this, and gave a valid re­ply to your com­ment, so I won’t call it a “mis­take” any­more. I do think some re­word­ing or a clar­ify­ing an­no­ta­tion within the OP would be helpful, though.

• Very nice. Th­ese notes say that ev­ery countable non­stan­dard model of Peano ar­ith­metic is iso­mor­phic, as an or­dered set, to the nat­u­ral num­bers fol­lowed by lex­i­co­graph­i­cally or­dered pairs (r, z) for r a pos­i­tive ra­tio­nal and z an in­te­ger. If I re­mem­ber rightly, the or­der­ing can be defined in terms of ad­di­tion: x ⇐ y iff ex­ists z. x+z ⇐ y. So if we want to have a countable non­stan­dard model of Peano ar­ith­metic with suc­ces­sor func­tion and ad­di­tion we need all these non­stan­dard num­bers.

It seems that if we only care about Peano ar­ith­metic with the suc­ces­sor func­tion, then the nat­u­rals plus a sin­gle copy of the in­te­gers is a model. If I was try­ing to prove this, I’d think that just look­ing at the suc­ces­sor func­tion, to any first-or­der pred­i­cate an el­e­ment of the copy of the in­te­gers would be in­dis­t­in­guish­able from a very large stan­dard nat­u­ral num­ber, by stan­dard FO lo­cal­ity re­sults.

• I think whether nat­u­rals plus one non-stan­dard in­te­ger line is a model of Peano’s ax­ioms for the suc­ces­sor only (no ad­di­tion/​mul­ti­pli­ca­tion) de­pends on whether we use sec­ond or­der or first or­der logic to ex­press in­duc­tion. (No in sec­ond or­der for­mu­la­tion due to Dedekind’s re­sult, yes for any first or­der for­mu­la­tion).

• Eliezer, I just want to say thanks. This con­ver­sa­tional method of teach­ing logic/​math is very ap­proach­able and en­gag­ing to me. Much ap­pre­ci­ated!

• If you en­joyed this, you should try Gödel, Escher, Bach. The style and sub­ject mat­ter are very similar.

• I sup­pose it is not sim­ply co­in­ci­dence that I am read­ing it right now. Thanks for the sug­ges­tion!

• Con­sider us­ing “☑” or similar) to mean “true”, rather than over­load­ing “+”?

• The dou­ble-turn­stile ⊨ is the usual sym­bol for say­ing that a sen­tence is true (in a given model).

• Or T and F for ‘is true’ and ‘is false’, or the T and up­side-down T of­ten used for (tau­tolog­i­cal) truth and (tau­tolog­i­cal) falsity for true and false.

• Can you (or some­one else too, I guess) give an ex­am­ple of a Tur­ing ma­chine with a non­stan­dard halt­ing time? It’s not clear to me what you mean by run­ning a Tur­ing ma­chine for a non­stan­dard num­ber of steps. (I think I can make this mean­ingful in my fa­vorite non­stan­dard model of Peano ar­ith­metic, namely an ul­tra­power of the stan­dard model, but I don’t see how to make it mean­ingful in gen­eral.)

• Yeah, there’s a lit­tle non-ob­vi­ous trick to talk­ing about prop­er­ties of Tur­ing ma­chines in the lan­guage of ar­ith­metic, which is es­sen­tial to un­der­stand­ing this.

The first thing you do is to use a lit­tle num­ber the­ory to define a bi­jec­tion be­tween nat­u­ral num­bers and finite lists of nat­u­ral num­bers. Next, you define a way to en­code the sta­tus of a Tur­ing ma­chine at one point in time as a list of num­bers (giv­ing the cur­rent state and the con­tents of the tapes); with your bi­jec­tion, you can en­code the sta­tus at one point in time as a sin­gle num­ber. Now, you en­code ex­e­cu­tion his­to­ries as finite lists of sta­tus num­bers, which your bi­jec­tion maps to a sin­gle num­ber. You can write “n de­notes a valid ex­e­cu­tion his­tory that ends in a halt­ing state” (i.e., n is a list of valid sta­tuses, with the first one be­ing a start sta­tus, the last one be­ing a halt­ing sta­tus, and each in­ter­me­di­ate one be­ing the uniquely de­ter­mined cor­rect suc­ces­sor to the pre­vi­ous one). After do­ing all this work, you can write a for­mula in the lan­guage of ar­ith­metic say­ing “the Tur­ing ma­chine m halts on in­put i”, by sim­ply say­ing “there is an n which de­notes a valid ex­e­cu­tion his­tory of ma­chine m, start­ing at in­put i and end­ing in a halt­ing state”.

Now con­sider an ex­e­cu­tion his­tory con­sist­ing of a “finite” list of non­stan­dard length.

• Okay, this is what I sus­pected af­ter think­ing about it for a bit, but like earth­worm­chuck it is not clear to me in what sense we are “re­ally” talk­ing about run­ning Tur­ing ma­chines for a non­stan­dard num­ber of steps here… the in­ter­pre­ta­tion I had in mind in the case of an ul­tra­power of the stan­dard model is more di­rect: namely, run­ning a Tur­ing ma­chine for the non­stan­dard num­ber of steps (a1, a2, a3, …) ought to mean con­sid­er­ing the se­quence of states of the Tur­ing ma­chine af­ter steps a1, a2, a3, … as an el­e­ment of the ul­tra­power of the set of pos­si­ble states of the Tur­ing ma­chine (in other words, af­ter non­stan­dard times, the Tur­ing ma­chine may be in non­stan­dard states). It is not clear to me whether we have such an in­ter­pre­ta­tion in gen­eral.

• Ok—as I replied to earth­worm­chuck, I think Eliezer isn’t say­ing at all that there is a use­ful way in which these non­stan­dard ex­e­cu­tion his­to­ries are “re­ally” talk­ing about Tur­ing ma­chines, he’s say­ing the ex­act op­po­site: they aren’t talk­ing about Tur­ing ma­chines, which is bad if you want to talk about Tur­ing ma­chines, since it means that first-or­der logic doesn’t suffice for ex­press­ing ex­actly what it is you do want to talk about.

• WIth that in­ter­pre­ta­tion, you couldn’t have a halt at a non­stan­dard time with­out halt­ing at some stan­dard time, right? If it were halted at some non­stan­dard time, it would be halted at al­most all the stan­dard times in that non­stan­dard time (here “al­most all” is with re­spect to the cho­sen ul­tra­filter), and hence in par­tic­u­lar at some stan­dard time.

(Add here stan­dard note for read­ers un­used to in­finity that it can be made perfectly sen­si­ble to talk about Tur­ing ma­chines run­ning in­finitely long and be­yond but this has noth­ing to do with what’s be­ing talked about here.)

• Ah. Right. Some­how I to­tally for­got about Łoś′s the­o­rem.

• you can write a for­mula in the lan­guage of ar­ith­metic say­ing “the Tur­ing ma­chine m halts on in­put i”

You get a for­mula which is true of the stan­dard num­bers m and i if and only if the m’th Tur­ing ma­chine halts on in­put i. Is there re­ally any mean­ingful sense in which this for­mula is still talk­ing about Tur­ing ma­chines when you sub­sti­tute el­e­ments of some non-stan­dard model?

• You get a for­mula which is true of the stan­dard num­bers m and i if and only if the m’th Tur­ing ma­chine halts on in­put i. Is there re­ally any mean­ingful sense in which this for­mula is still talk­ing about Tur­ing ma­chines when you sub­sti­tute el­e­ments of some non-stan­dard model?

In a sense, no. Eliezer’s point is this: Given the ac­tual Tur­ing ma­chine with num­ber m = 4 = SSSS0 and in­put i = 2 = SS0, you can sub­sti­tute these in to get a closed for­mula φ whose mean­ing is “the Tur­ing ma­chine SSSS0 halts on in­put SS0”. The ac­tual for­mula is some­thing like, “There is a num­ber e such that e de­notes a valid ex­e­cu­tion his­tory for ma­chine SSSS0 on in­put SS0 that ends in a halt­ing state.” In the stan­dard model, talk­ing about the stan­dard num­bers, this for­mula is true iff the ma­chine ac­tu­ally halts on that in­put. But in first-or­der logic, you can­not pin­point the stan­dard model, and so it can hap­pen that for­mula φ is false in the stan­dard model, but true in some non­stan­dard model. If you use sec­ond-or­der logic (and be­lieve its stan­dard se­man­tics, not its Henkin se­man­tics), for­mula φ is valid, i.e. true in ev­ery model, if and only if ma­chine 4 re­ally halts on in­put 2.

• Okay. This is ex­actly what I thought it should be, but the way Eliezer phrased things made me won­der if I was miss­ing some­thing. Thanks for clar­ify­ing.

• Dis­claimer: I am not fa­mil­iar with the for­mal­ities of Tur­ing ma­chines, and am quite pos­si­bly talk­ing out of my ass, and prob­a­bly not think­ing along the same lines as Eliezer here. But it might be pos­si­ble to sal­vage the ideas into some­thing more for­mal/​cor­rect.

Con­sider a model con­tain­ing ex­actly the nat­u­ral num­bers and the starred chain. Then we might have a Tur­ing ma­chine which starts at 0 and 0 , halts if it is fed 0, and con­tinues to the suc­ces­sor oth­er­wise. Then it never halts on the nat­u­ral chain, but halts im­me­di­ately on the starred chain. Here, a Tur­ing ma­chine pre­sum­ably op­er­ates on ev­ery chain in a model meet­ing the first-or­der Peano ax­ioms.

So in gen­eral, it might be mean­ingful to talk of a Tur­ing ma­chine act­ing within a model con­tain­ing chains, which is closed on ev­ery given chain (e.g. it can’t jump from 0 to 0 ), and which could there­fore be said to be as­so­ci­ated by a ‘halt time’ func­tion, h, which maps each chain (or each chain’s zero, if you like) to a non­nega­tive num­ber in that chain which is the halt­ing time on that chain. So in my above ex­am­ple, we might leave h(0) un­defined, be­cause the ma­chine never halts on the nat­u­rals, and h(0 )=0*, be­cause it halts im­me­di­ately on that chain. This would then com­pletely define the halt­ing time over chains. (In fact, we could prob­a­bly drop clo­sure if we wanted to.)

• (Edited:) I think you’re con­flat­ing the nat­u­ral num­bers and the tape that the Tur­ing ma­chine runs on. In­ter­pret­ing “non­stan­dard halt­ing time,” the way I think Eliezer is us­ing the term, doesn’t re­quire chang­ing our no­tion of what a tape is; it just re­quires trans­lat­ing the state­ment “this Tur­ing ma­chine is in state s at time t” into a state­ment in Peano ar­ith­metic (where t is a nat­u­ral num­ber) and then in­ter­pret­ing it in a non­stan­dard model.

• 20 Dec 2012 6:49 UTC
−4 points
Parent

I think that refers to tur­ing ma­chines that never halt at stan­dard num­bers of steps (i.e. it would halt at in­finity, or more for­mally ω, which is a non­stan­dard num­ber). It might also rep­re­sent halt­ing at a nega­tive time (i.e. if you ran the tur­ing ma­chine back­wards for N steps, then for­ward again for less than N steps, it would halt, but oth­er­wise doesn’t halt). Any­thing that fails to halt in a stan­dard num­ber of steps can be con­sid­ered to halt in a non­stan­dard num­ber of steps, if you in­clude the re­straint that there has to be a value X such that it halts in X steps. By that defi­ni­tion, a tur­ing ma­chine halts if and only if X is a stan­dard num­ber. I could be wrong though.

• Eliezer isn’t ask­ing about how long a par­tic­u­lar Tur­ing ma­chine takes to halt—he’s ask­ing the bi­nary ques­tion, “Will it halt or not?” As far as I could tell, Eliezer was claiming that there ex­ist Tur­ing ma­chines that don’t halt, but that we can’t prove don’t halt us­ing first-or­der Peano ar­ith­metic. The par­tic­u­lar ex­am­ple was to show how this claim was plau­si­ble (and, in fact, true).

If you ran the Tur­ing ma­chine back­wards for N steps...

In some cases, this isn’t even a well-defined op­er­a­tion.

Any­thing that fails to halt in a stan­dard num­ber of steps...

Fails to halt. The stan­dard num­bers are the ones we care about. It’s the proof that this is the case that is non­triv­ial, and in some cases re­quires sec­ond-or­der logic (or at least, that’s what I think Eliezer is claiming). But you don’t always need sec­ond-or­der logic, so what you said (”...can be con­sid­ered to halt in a non­stan­dard num­ber of steps”, and re­ally, this should be, “on a step cor­re­spond­ing to a non­stan­dard num­ber”) was wrong.

By the way, ω isn’t a non­stan­dard num­ber in countable non­stan­dard mod­els of Peano ar­ith­metic. It’s an or­di­nal num­ber, not a car­di­nal num­ber, so I’m not even ex­actly sure what you mean...but a Tur­ing ma­chine can’t halt at time in­finity, be­cause there’s no such thing as “time in­finity”.

I re­ally, hon­estly, don’t mean this re­ply to come off as con­de­scend­ing. I think it would help you to read through the Wikipe­dia ar­ti­cle on Tur­ing ma­chines.

• It should re­fer to a Tur­ing ma­chine that never halts but can­not be proven in Peano ar­ith­metic not to halt. The sec­ond con­di­tion is im­por­tant (oth­er­wise it would just be a Tur­ing ma­chine that never halts, pe­riod). I know how to write down such a Tur­ing ma­chine (edit: for an ex­plicit ex­am­ple, con­sider a Tur­ing ma­chine which is search­ing for a con­tra­dic­tion in PA); what I don’t know is how this defi­ni­tion can be re­lated to a defi­ni­tion in terms of defin­ing what it means to run a Tur­ing ma­chine for a non­stan­dard num­ber of steps.

It doesn’t nec­es­sar­ily make sense to talk about run­ning a Tur­ing ma­chine back­wards. Also, mod­els of first-or­der Peano ar­ith­metic do not con­tain nega­tive num­bers; this is ruled out by the ax­iom that 0 is not a suc­ces­sor.

• I don’t think it could halt at a nega­tive time. If it did, it would have to stay halted, which would mean that it would still be halted at zero, so the pro­gram halts in the nat­u­ral num­bers.

• You should be care­ful with ad­di­tion and mul­ti­pli­ca­tion—to use them, you would have to define them first, and this is not triv­ial if you have the nat­u­ral num­bers plus A->B->C->A, in­finite chains and so on.

In ad­di­tion, “group” has a spe­cific math­e­mat­i­cal mean­ing, if you use it for ar­bi­trary sets this is quite con­fus­ing.

• You don’t have to define ad­di­tion and mul­ti­pli­ca­tion—you can just make them be a part of your lan­guage. In fact, in first or­der the­o­ries of ar­ith­metic, you have to do so be­cause you can­not define ad­di­tion and mul­ti­pli­ca­tion from suc­ces­sor in first or­der logic.

In other words, the difficulty is with the lan­guage not with whether you hap­pen to be us­ing a stan­dard or a non-stan­dard model. This is a gen­eral rule in model the­ory (and for that mat­ter ev­ery­where else): what you can ex­press has to do with the lan­guage not the sub­ject.

• ...now I find my­self wish­ing philo­soph­i­cal whimsy reached more of­ten for things like kit­ten in­no­cence than con­trived tor­ture sce­nar­ios.

• We can’t have both?

• Not at the same time, I hope.

• Every time you take box A and box B from Omega, Omicron tor­tures a kit­ten.

• It was once thought that ador­ing cats caused one to get tor­tured. How­ever, a re­cent med­i­cal study has come out, show­ing that most cat ador­ers have a cer­tain gene, ACGT, and that whether some­one has the gene or not, their chances of get­ting tor­tured go down if they adore cats. The strong cor­re­la­tion be­tween ador­ing cats and get­ting tor­tured is be­cause of a third fac­tor, ACGT, that leads to both.

Hav­ing learned of this new study, would you choose to adore cats?

• Or, in­deed, should I choose not to adore cats be­cause it might be ev­i­dence I had tox­o­plas­mo­sis?

• Ac­tu­ally, that tox­o­plas­mo­sis thing is the only hap­piness-cre­at­ing-prefer­ence-in­duc­ing, nega­tive-side-effect dis­ease I ac­tu­ally know that re­ally works for Solomon’s Prob­lem. You can ei­ther pet cute kit­tens already tested and guaran­teed not to have tox­o­plas­mo­sis, or re­frain. This ought to be our go-to real-life ex­am­ple against EDT!

• You guys de­liber­ately chose ex­am­ples so that acronyms are en­tirely made up of let­ters also used for nu­cleotides, didn’t you?

• I don’t think I have a choice re­ally, I already do adore cats. Is that an ac­tual study btw? If it is, I’m gonna cheat by get­ting a sci­en­tist to scan my dna and tell me if I have it.

• Is that an ac­tual study btw?

Prob­a­bly not.

• I don’t think I have a choice re­ally, I already do adore cats.

Then you are mis­taken about hu­man psy­chol­ogy. You definitely have a choice about whether you will adore cats, it is sim­ply one that re­quires ac­tion.

• I hope you don’t imag­ine it’s an in­no­cent kit­ten, though. Be­cause even hy­po­thet­i­cally speak­ing, it’s not. Eet ees an evil keeten.

• Ac­tu­ally, when­ever you two-box, my friend Clive puts his cat Omicron into Box A.

Yes, he has a cat called Omicron, and he does nor­mally go around with ax­ioms of set the­ory on his t-shirt.

• Does it mat­ter if you don’t have for­mal rules for what you’re do­ing with mod­els?

Do you ex­pect what you’re do­ing with mod­els to be for­mal­iz­able in ZFC?

Does it mat­ter if ZFC is a first-or­der the­ory?

• “Does it mat­ter if X” is not a ques­tion; “mat­ter” is a two-place pred­i­cate (X mat­ters to Y). What you seem to be wor­ried about is the fol­low­ing: you need some set the­ory to talk about mod­els of first-or­der logic. ZFC is a com­mon ax­iom­a­ti­za­tion of set the­ory. But ZFC is it­self a first-or­der the­ory, so it seems cir­cu­lar to use ZFC to talk about mod­els of first-or­der logic. But if this is what you’re wor­ried about, you should just say so di­rectly.

• If you taboo one-pred­i­cate ‘mat­ter’, please spe­cial­ize the two-place pred­i­cate (X mat­ters to Y) to Y = “the OP’s sub­se­quent use of this ar­ti­cle”, and use the re­sult­ing one-place pred­i­cate.

I am not wor­ried about ap­par­ent cir­cu­lar­ity. Once I in­ter­nal­ized the Lowen­heim-Skolem ar­gu­ment that first-or­der the­o­ries have countable “non-stan­dard” mod­els, then model the­ory dis­solved for me. The syn­tac­ti­cal /​ for­mal­ist view of se­man­tics, that what math­e­mat­i­ci­ans are do­ing is ma­nipu­lat­ing finite strings of sym­bols, is always a perfectly good model, in the model the­o­retic sense. If you want to un­der­stand what the math­e­mat­i­cian is do­ing, you may look at what they’re do­ing, rather than tak­ing them at their word and try­ing to bog­gle your imag­i­na­tion with images of big­ness. Does dis­solv­ing model the­ory mat­ter?

There’s plenty of en­cod­ings in math­e­mat­ics—for ex­am­ple, us­ing first-or­der pred­i­cate logic and the ZFC ax­ioms to talk about sec­ond-or­der logic, or putting clas­si­cal logic in­side of in­tu­ition­is­tic logic with the dou­ble nega­tion trans­la­tion. Does the prevalence of en­cod­ings (analo­gous to the Tur­ing Tarpit) mat­ter?

For­mal ar­gu­ments, to be used in the real world, oc­cur as the mid­dle of an in­for­mal sand­wich—first there’s an in­for­mal ar­gu­ment that the premises are ap­pro­pri­ate or rea­son­able, and third there’s an in­for­mal ar­gu­ment in­ter­pret­ing the con­clu­sion. I un­der­stand the for­mal part of this post, but I don’t un­der­stand the in­for­mal parts at all. Non­stan­dard (par­tic­u­larly countable) mod­els are ev­ery­where and un­avoid­able (analo­gously Godel showed that true but un­prov­able state­ments are ev­ery­where and un­avoid­able). Against that back­ground, the for­mal suc­cess of sec­ond-or­der logic in ex­iling non­stan­dard mod­els of ar­ith­metic doesn’t seem (to me) a good start­ing point for any third ar­gu­ment.

• Nice post, but I think you got some­thing wrong. Your struc­ture with a sin­gle two-sided in­finite chain isn’t ac­tu­ally a model of first or­der PA. If x is an el­e­ment of the two-sided chain, then y=2x=x+x is an­other non-stan­dard num­ber, and y nec­es­sar­ily lies in a differ­ent chain since y-x=x is a non-stan­dard num­ber. Of course, you need to be a lit­tle bit care­ful to be sure that this ar­gu­ment can be ex­pressed in first or­der lan­guage, but I’m pretty sure it can. So, as soon as there is one chain of non-stan­dard num­bers, that forces the ex­is­tence of in­finitely many.

• And then there’s that in­finite-in-both-di­rec­tions chain which isn’t cor­rected to any­thing else.

Did you mean to say “con­nected”, or did I miss some­thing?

• “First, con­sider that the fol­low­ing for­mula de­tects 2-ness” Con­sider chang­ing this to some­thing like “First, con­sider that the fol­low­ing for­mula de­tects 2-ness among the num­bers as we want them to be”? It wasn’t im­me­di­ately ob­vi­ous to me that the starred chain’s ‘-1*’ didn’t satisfy the equa­tion.

Er, also, you might want to have only one of the in­ter­locu­tors be­gin­ning sen­tences with “Er” lest we lose track of which is sup­posed to be cur­rent-you. ;)

But yeah, a nice ex­po­si­tion!

• Fas­ci­nat­ing, I thought Ten­nan­baum’s the­o­rem im­plied non-stan­dard mod­els were rather im­pos­si­ble to vi­su­al­ize. The non-stan­dard model of Peano ar­ith­metic illus­trated in the di­a­gram only gives the suc­ces­sor re­la­tion, there’s no defi­ni­tion of ad­di­tion and mul­ti­pli­ca­tion. Ten­nen­baum’s the­o­rem im­plies there’s no com­putable way to do this, but is there a proof that they can be defined at all for this par­tic­u­lar model?

• This is so much clearer than my col­lege class.

• I’m go­ing to have to read the proof of the hy­dra game, be­cause I pretty quickly got over 2.8k nodes and still in in­creas­ing...

• It’s even worse than that, de­pend­ing on how you start, you can eas­ily get 100s of thou­sands of nodes...

• It’s even worse than that, the func­tion for the max­i­mum num­ber of nodes you end up be­fore they start go­ing down, if you play us­ing the worst pos­si­ble strat­egy, in­creases faster than any func­tion which Peano ar­ith­metic can prove to be to­tal (i.e., it grows faster than any Tur­ing ma­chine run on var­i­ous in­puts, which Peano ar­ith­metic can prove to halt for any in­put). To say that this grows faster than the Ack­er­mann func­tion is putting it very mildly.

• Well, it doesn’t say you have to win quickly.

I was skep­ti­cal at first, but con­sider it this way: At each step you make a sub­tree sim­pler, and then in­sert an ar­bi­trary num­ber of copies of the sim­pler sub­tree. Even­tu­ally you must end up with a large num­ber of copies of the sim­plest pos­si­ble sub­tree, a sin­gle node off the root. Those don’t grow the hy­dra when re­moved, so you you chop them all off and then win.

I found I could see this in­tu­itively if I chopped the top-most head of the most-com­plex tree for the first sev­eral rounds, in most con­figu­ra­tions; you’ll see what­ever tree you’re work­ing on get wider, but shorter. It helps to lower the start­ing num­ber of nodes to 7 or so, as well.

• Yes, while it was clear on a sec­ond read­ing this was also clear, thanks.

• Well, this was very in­ter­est­ing, for­mal logic is a very fun topic. I just spent ~10 min­utes try­ing to find a way in first or­der logic to write that ax­iom, as it in­tu­itively feels (to some­one who has stud­ied for­mal logic at least) that there should be a way… Of course I failed, all the ax­ioms I at­tempted turned out to be no more pow­er­ful then “0 is not the suc­ces­sor of any num­ber”. I am deeply in­trigued by this prob­lem, and I am look­ing for­ward to your next post where you ex­plain ex­actly why it’s im­pos­si­ble.

• If you like spoilers, google “Lowen­heim-Skoler”—the same tech­nique as the proof for the “up­wards” part al­lows you to gen­er­ate non-stan­dard mod­els for the First-or­der logic ver­sion of Peano ax­ioms in a fairly straight-for­ward man­ner.

• Okay, my brain isn’t wrap­ping around this quite prop­erly (though the ex­pla­na­tion has already helped me to un­der­stand the con­cepts far bet­ter than my col­lege ed­u­ca­tion on the sub­ject has!).

Con­sider the state­ment: “There ex­ists no x for which, for any num­ber k, x af­ter k suc­ces­sions is equal to zero.” (¬∃x: ∃k: xS-k-times = 0, k>0 is the clos­est I can figure to de­pict it for­mally). Why doesn’t that ax­iom elimi­nate the pos­si­bil­ity of any in­finite or finite chain that in­volves a num­ber be­low zero, and thus elimi­nate the pos­si­bil­ity of the two-sided in­finite chain?

Or… is that state­ment a sec­ond-or­der one, some­how, in which case how so?

Edit: Okay, the gears hav­ing turned a bit fur­ther, I’d like to add: “For all x, there ex­ists a num­ber k such that 0 af­ter k suc­ces­sions is equal to x.”

That should deal with an­other pos­si­ble un­der­stand­ing of that in­finite chain. Or is defin­ing k in those ax­ioms the prob­lem?

• “For all x, there ex­ists a num­ber k such that 0 af­ter k suc­ces­sions is equal to x” That should deal with an­other pos­si­ble un­der­stand­ing of that in­finite chain. Or is defin­ing k in those ax­ioms the prob­lem?

I made roughly a similar com­ment in the Log­i­cal Pin­point­ing post, and Kindly offered a re­sponse there.

If I un­der­stood him cor­rectly ba­si­cally it meant “you can’t use num­bers to count stuff yet, un­til you’ve first pin­pointed what a num­ber is...”. And rep­e­ti­tion isn’t defined in first or­der logic ei­ther.

• Ah, so the state­ment is sec­ond-or­der.

And while I’m pretty sure you could re­place the state­ment with an in­finite num­ber of first-or­der state­ments that pre­cisely de­scribe ev­ery mem­ber of the set (0S = 1, 0SS = 2, 0SSS = 3, etc), you couldn’t say “Th­ese are the only mem­bers of the set”, thus ex­clud­ing other chains, with­out talk­ing about the set—so it’d still be sec­ond-or­der.

Thanks!

• It’s a bit worse than that. Even if we defined the “k-suc­ces­sions” op­er­a­tor (which is ba­si­cally ad­di­tion), it doesn’t ac­tu­ally let us do what we want. “For all x, there ex­ists a num­ber k such that 0 af­ter k suc­ces­sions is equal to x” is always satis­fied by set­ting k=x, even if x is some weird al­ter­nate-uni­verse num­ber like 2*. Granted, I have no clue what “tak­ing 2* suc­ces­sions of 0” means, but...

• 20 Dec 2012 17:01 UTC
0 points

Nice post! Are you go­ing to get into Dedekind’s proof that all mod­els of the sec­ond-or­der Peano ax­ioms are iso­mor­phic?

Edit: Fixed typo.

• In first-or­der logic, we can get rid of the ABC—make a state­ment which rules out any model that has a loop of num­bers like that. But we can’t get rid of the in­finite chain un­der­neath it. In sec­ond-or­der logic we can get rid of the ex­tra chain.

(ex­ists x) (forall y) not (S(y) = x)

This state­ment is a first or­der state­ment which is true of stan­dard nat­u­ral num­bers, but is not true in your model.

???

• I don’t see yet how this con­nects to the other posts from the episte­mol­ogy se­quence, but it’s definitely nice. I’ve wanted to learn more math­e­mat­i­cal logic for some time. I didn’t quite un­der­stand why ex­actly us­ing an ax­iom schema isn’t as good as us­ing sec­ond or­der logic, be­fore I read this post.

‘There’s no group of num­bers G such that for any num­ber x in G, x is the suc­ces­sor of some other num­ber y in G.’

I read that “any” as “for at least one”, rather than as “for ev­ery”. That con­fused me quite a bit. Maybe na­tive speak­ers won’t have a prob­lem with that, but to me the con­nec­tion be­tween “any” and “some” is too close.

It’s also not clear to my where the or­der re­la­tion comes from.

• I think the point of this post is to demon­strate that log­i­cal pin­point­ing is hard. You might think that the first-or­der Peano ar­ith­metic ax­ioms log­i­cally pin­point the nat­u­ral num­bers, and what this dis­cus­sion will end up show­ing is that they just don’t be­cause of gen­eral prop­er­ties of first-or­der logic (speci­fi­cally the Löwen­heim–Skolem the­o­rem).

If log­i­cally pin­point­ing some­thing as seem­ingly sim­ple as the nat­u­ral num­bers de­pends on some­thing as seem­ingly non­triv­ial as un­der­stand­ing the dis­tinc­tion be­tween first-or­der and sec­ond-or­der logic, then (or so I imag­ine the ar­gu­ment will con­tinue) we shouldn’t ex­pect log­i­cally pin­point­ing some­thing like moral­ity to be any eas­ier. In fact we have ev­ery rea­son to ex­pect it to be sub­stan­tially harder.

The defi­ni­tion of the or­der re­la­tion is non­triv­ial. In sec­ond-or­der Peano ar­ith­metic you can define ad­di­tion from the suc­ces­sor op­er­a­tion by in­duc­tion, and then you can define a to be less than b if there is a pos­i­tive in­te­ger n such that a + n = b. My un­der­stand­ing is that you can­not define ad­di­tion this way in first-or­der Peano ar­ith­metic. In­stead it is nec­es­sary to ex­plic­itly talk about ad­di­tion in the ax­ioms. From here one could also go on to ex­plic­itly talk about the or­der re­la­tion in the ax­ioms.

• I read that “any” as “for at least one”, rather than as “for ev­ery”.

Prob­a­bly it’s be­cause of the “no group” be­fore it; cf “I can do any­thing” and “I can’t do any­thing”. Ne­ga­tions and quan­tifiers in English some­times in­ter­act in weird ways, mak­ing it non-triv­ial to get the se­man­tics from the syn­tax.

• Wik­tionary gives the mean­ings “at least one” and “no mat­ter what kind”. The first likely doesn’t ap­ply here, as it’s not used in a nega­tion or ques­tion. To in­ter­pret “no mat­ter what kind” to mean “ev­ery” seems like a stretch to me. I re­ally do think the mean­ing of “any” is am­bigu­ous here. “any” just speci­fies that we don’t have any fur­ther con­straints on x. You could re­place it with “ev­ery” or “at least one”, but not with “ev­ery even” or “at least one even”, as that would in­tro­duce a new con­straint.

• The first likely doesn’t ap­ply here, as it’s not used in a nega­tion or ques­tion.

It doesn’t, but I was hy­poth­e­siz­ing that the rea­son why on the first read it sounded to you as though it did was the nega­tion (“no group”) be­fore it.