# Second-Order Logic: The Controversy

“So the question you asked me last time was, ‘Why does anyone bother with first-order logic at all, if second-order logic is so much more powerful?’”

Right. If first-order logic can’t talk about finiteness, or distinguish the size of the integers from the size of the reals, why even bother?

“The first thing to realize is that first-order theories can still have a lot of power. First-order arithmetic does narrow down the possible models by a lot, even if it doesn’t narrow them down to a single model. You can prove things like the existence of an infinite number of primes, because every model of the first-order axioms has an infinite number of primes. First-order arithmetic is never going to prove anything that’s wrong about the standard numbers. Anything that’s true in all models of first-order arithmetic will also be true in the particular model we call the standard numbers.”

Even so, if first-order theory is strictly weaker, why bother? Unless second-order logic is just as incomplete relative to third-order logic, which is weaker than fourth-order logic, which is weaker than omega-order logic -

“No, surprisingly enough—there’s tricks for making second-order logic encode any proposition in third-order logic and so on. If there’s a collection of third-order axioms that characterizes a model, there’s a collection of second-order axioms that characterizes the same model. Once you make the jump to second-order logic, you’re done—so far as anyone knows (so far as I know) there’s nothing more powerful than second-order logic in terms of which models it can characterize.”

Then if there’s one spoon which can eat anything, why not just use the spoon?

“Well… this gets into complex issues. There are mathematicians who don’t believe there is a spoon when it comes to second-order logic.”

Like there are mathematicians who don’t believe in infinity?

“Kind of. Look, suppose you couldn’t use second-order logic—you belonged to a species that doesn’t have second-order logic, or anything like it. Your species doesn’t have any native mental intuition you could use to construct the notion of ‘all properties’. And then suppose that, after somebody used first-order set theory to prove that first-order arithmetic had many possible models, you stood around shouting that you believed in only one model, what you called the standard model, but you couldn’t explain what made this model different from any other model—”

Well… a lot of times, even in math, we make statements that genuinely mean something, but take a while to figure out how to define. I think somebody who talked about ‘the numbers’ would mean something even before second-order logic was invented.

“But here the hypothesis is that you belong to a species that can’t invent second-order logic, or think in second-order logic, or anything like it.”

Then I suppose you want me to draw the conclusion that this hypothetical alien is just standing there shouting about standardness, but its words don’t mean anything because they have no way to pin down one model as opposed to another one. And I expect this species is also magically forbidden from talking about all possible subsets of a set?

“Yeah. They can’t talk about the largest powerset, just like they can’t talk about the smallest model of Peano arithmetic.”

Then you could arguably deny that shouting about the ‘standard’ numbers would mean anything, to the members of this particular species. You might as well shout about the ‘fleem’ numbers, I guess.

“Right. Even if all the members of this species did have a built-in sense that there was a special model of first-order arithmetic that was fleemer than any other model, if that fleem-ness wasn’t bound to anything else, it would be meaningless. Just a floating word. Or if all you could do was define fleemness as floobness and floobness as fleemness, you’d have a loop of floating words; and that might give you the impression that each particular word had a meaning, but the loop as a whole wouldn’t be connected to reality. That’s why it doesn’t help to say that the standard model of numbers is the smallest among all possible models of Peano arithmetic, if you can’t define ‘smallest possible’ any more than you can define ‘connected chain’ or ‘finite number of predecessors’.”

But second-order logic does seem to have consequences first-order logic doesn’t. Like, what about all that Gödelian stuff? Doesn’t second-order arithmetic semantically imply… its own Gödel statement? Because the unique model of second-order arithmetic doesn’t contain any number encoding a proof of a contradiction from second-order arithmetic? Wait, now I’m confused.

“No, that’s correct. It’s not paradoxical, because there’s no effective way of finding all the semantic implications of a collection of second-order axioms. There’s no analogue of Gödel’s Completeness Theorem for second-order logic—no syntactic system for deriving all the semantic consequences. Second-order logic is sound, in the sense that anything syntactically provable from a set of premises is true in any model obeying those premises. But second-order logic isn’t complete; there are semantic consequences you can’t derive. If you take second-order logic at face value, there’s no effectively computable way of deriving all the consequences of what you say you ‘believe’… which is a major reason some mathematicians are suspicious of second-order logic. What does it mean to believe something whose consequences you can’t derive?”

But second-order logic clearly has some experiential consequences first-order logic doesn’t. Suppose I build a Turing machine that looks for proofs of a contradiction from first-order Peano arithmetic. If PA’s consistency isn’t provable in PA, then by the Completeness Theorem there must exist nonstandard models of PA where this machine halts after finding a proof of a contradiction. So first-order logic doesn’t tell me that this machine runs forever—maybe it has nonstandard halting times, i.e., it runs at all standard N, but halts at −2* somewhere along a disconnected chain. Only second-order logic tells me there’s no proof of PA’s inconsistency and therefore this machine runs forever. Only second-order logic tells me I should expect to see this machine keep running, and not expect—note falsifiability—that the machine ever halts.
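The machine in question is just an enumerate-and-check loop. Here is a toy sketch of its structure—`proves_contradiction` is a hypothetical placeholder that always answers “no”, standing in for a real PA proof checker (which would be far too long to include); the real machine has no budget and halts if and only if PA is inconsistent:

```python
from itertools import count, product

ALPHABET = "01"  # candidate proofs encoded as binary strings

def candidate_proofs():
    # Enumerate every finite string over the alphabet, shortest first,
    # so every possible encoded proof is eventually reached.
    for n in count(1):
        for chars in product(ALPHABET, repeat=n):
            yield "".join(chars)

def proves_contradiction(candidate: str) -> bool:
    # Hypothetical stand-in for "candidate is a valid PA proof of 0 = 1".
    # A real checker would parse the string and verify each inference
    # step; this toy always answers False, modeling a consistent theory.
    return False

def search_for_inconsistency(budget: int):
    # Check the first `budget` candidates; return the first proof of a
    # contradiction found, or None if the budget runs out.
    for i, cand in enumerate(candidate_proofs()):
        if i >= budget:
            return None
        if proves_contradiction(cand):
            return cand
    return None
```

Watching this loop run forever (under the real checker) is the experiential prediction being discussed: second-order arithmetic says the `proves_contradiction` branch is never taken.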

“Sure, you just used a second-order theory to derive a consequence you didn’t get from a first-order theory. But that’s not the same as saying that you can only get that consequence using second-order logic. Maybe another first-order theory would give you the same prediction.”

Like what?

“Well, for one thing, first-order set theory can prove that first-order arithmetic has a model. Zermelo-Fraenkel set theory can prove the existence of a set such that all the first-order Peano axioms are true about that set. It follows within ZF that sound reasoning on first-order arithmetic will never prove a contradiction. And since ZF can prove that the syntax of classical logic is sound—”

What does it mean for set theory to prove that logic is sound!?

“ZF can quote formulas as structured objects, and encode models as sets, and then represent a finite ZF-formula which says whether or not a set of quoted formulas is true about a model. ZF can then prove that no step of classical logic goes from premises that are true inside a set-model to conclusions that are false inside a set-model. In other words, ZF can represent the ideas ‘formula X is semantically true in model Y’ and ‘these syntactic rules never produce semantically false statements from semantically true statements’.”

Wait, then why can’t ZF prove itself consistent? If ZF believes in all the axioms of ZF, and it believes that logic never produces false statements from true statements -

“Ah, but ZF can’t prove that there exists any set which is a model of ZF, so it can’t prove the ZF-axioms are consistent. ZF shouldn’t be able to prove that some set is a model of ZF, because that’s not true in all models. Many models of ZF don’t contain any individual set well-populated enough for that one set to be a model of ZF all by itself.”

I’m kind of surprised in a Gödelian sense that ZF can contain sets as large as the universe of ZF. Doesn’t any given set have to be smaller than the whole universe?

“Inside any particular model of ZF, every set within that model is smaller than that model. But not all models of ZF are the same size; in fact, models of ZF of every infinite size exist, by the Löwenheim-Skolem theorem. So you can have some models of ZF—some universes in which all the elements collectively obey the ZF-relations—containing individual sets which are larger than other entire models of ZF. A set that large is called a Grothendieck universe, and assuming it exists is equivalent to assuming the existence of ‘strongly inaccessible cardinals’, sizes so large that you provably can’t prove inside set theory that anything that large exists.”

Whoa.

(Pause.)

But...

“But?”

I agree you’ve shown that one experiential consequence of second-order arithmetic—namely that a machine looking for proofs of inconsistency from PA won’t be seen to halt—can be derived from first-order set theory. Can you get all the consequences of second-order arithmetic in some particular first-order theory?

“You can’t get all the semantic consequences of second-order logic, taken at face value, in any theory or any computable reasoning. What about the halting problem? Taken at face value, it’s a semantic consequence of second-order logic that any given Turing machine either halts or doesn’t halt—”

Personally I find it rather intuitive to imagine that a Turing machine either halts or doesn’t halt. I mean, if I’m walking up to a machine and watching it run, telling me that its halting or not-halting ‘isn’t entailed by my axioms’ strikes me as not describing any actual experience I can have with the machine. Either I’ll see it halt eventually, or I’ll see it keep running forever.

“My point is that the statements we can derive from the syntax of current second-order logic are limited by that syntax. And by the halting problem, we shouldn’t ever expect a computable syntax that gives us all the semantic consequences. There’s no possible theory you can actually use to get a correct advance prediction about whether an arbitrary Turing machine halts.”

Okay. I agree that no computable reasoning, on second-order logic or anything else, should be able to solve the halting problem. Unless time travel is possible, but even then, you shouldn’t be able to solve the expanded halting problem for machines that use time travel.

“Right, so the syntax of second-order logic can’t prove everything. And in fact, it turns out that, in terms of what you can prove syntactically using the standard syntax, second-order logic is identical to a many-sorted first-order logic.”

Huh?

“Suppose you have a first-order logic—one that doesn’t claim to be able to quantify over all possible predicates—which does allow the universe to contain two different sorts of things. Say, the logic uses lower-case letters for all type-1 objects and upper-case letters for all type-2 objects. Like, ‘∀x: x = x’ is a statement over all type-1 objects, and ‘∀Y: Y = Y’ is a statement over all type-2 objects. But aside from that, you use the same syntax and proof rules as before.”

Okay...

“Now add an element-relation x∈Y, saying that x is an element of Y, and add some first-order axioms for making the type-2 objects behave like collections of type-1 objects, including axioms for making sure that most describable type-2 collections exist—i.e., the collection X containing just x is guaranteed to exist, and so on. What you can prove syntactically in this theory will be identical to what you can prove using the standard syntax of second-order logic—even though the theory doesn’t claim that all possible collections of type-1s are type-2s, and the theory will have models where many ‘possible’ collections are missing from the type-2s.”
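The standard way to write down “most describable type-2 collections exist” is a comprehension axiom schema; a minimal sketch in the two-sorted notation above (one axiom per formula, which is an assumption about how the axioms would be set up, not a quote from the dialogue):

```latex
% Comprehension schema: for each formula \varphi(x) of the language
% in which Y does not occur free, one axiom asserting that the
% collection of type-1 objects satisfying \varphi exists as a type-2 object:
\exists Y \, \forall x \, \bigl( x \in Y \leftrightarrow \varphi(x) \bigr)
% E.g., taking \varphi(x) \equiv (x = c) yields the collection
% containing just the single object c.
```

Since there is one axiom per describable formula, only describable collections are guaranteed to exist—which is exactly why some models can be missing ‘possible’ collections.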

Wait, now you’re saying that second-order logic is no more powerful than first-order logic?

“I’m saying that the supposed power of second-order logic derives from interpreting it a particular way, and taking on faith that when you quantify over all properties, you’re actually talking about all properties.”

But then second-order arithmetic is no more powerful than first-order arithmetic in terms of what it can actually prove?

“Second-order arithmetic is way more powerful than first-order arithmetic. But that’s because first-order set theory is more powerful than arithmetic, and adding the syntax of second-order logic corresponds to adding axioms with set-theoretic properties. In terms of which consequences can be syntactically proven, second-order arithmetic is more powerful than first-order arithmetic, but weaker than first-order set theory. First-order set theory can prove the existence of a model of second-order arithmetic—ZF can prove there’s a collection of numbers and sets of numbers which models a many-sorted logic with syntax corresponding to second-order logic—and so ZF can prove second-order arithmetic consistent.”

But first-order logic, including first-order set theory, can’t even talk about the standard numbers!

“Right, but first-order set theory can syntactically prove more statements about ‘numbers’ than second-order arithmetic can prove. And when you combine that with the semantic implications of second-order arithmetic not being computable, and with any second-order logic being syntactically identical to a many-sorted first-order logic, and first-order logic having neat properties like the Completeness Theorem… well, you can see why some mathematicians would want to give up entirely on this whole second-order business.”

But if you deny second-order logic you can’t even say the word ‘finite’. You would have to believe the word ‘finite’ was a meaningless noise.

“You’d define finiteness relative to whatever first-order model you were working in. Like, a set might be ‘finite’ only on account of the model not containing any one-to-one mapping from that set onto a smaller subset of itself—”

But that set wouldn’t actually be finite. There wouldn’t actually be, like, only a billion objects in there. It’s just that all the mappings which could prove the set was infinite would be mysteriously missing.

“According to some other model, maybe. But since there is no one true model—”

How is this not crazy talk along the lines of ‘there is no one true reality’? Are you saying there’s no really smallest set of numbers closed under succession, without all the extra infinite chains? Doesn’t talking about how these theories have multiple possible models imply that those possible models are logical thingies, and one of them actually does contain the largest powerset and the smallest integers?

“The mathematicians who deny second-order logic would see that reasoning as invoking an implicit background universe of set theory. Everything you’re saying makes sense relative to some particular model of set theory, which would contain possible models of Peano arithmetic as sets, and could look over those sets and pick out the smallest in that model. Similarly, that set theory could look over a proposed model for a many-sorted logic, and say whether there were any subsets within the larger universe which were missing from the many-sorted model. Basically, your brain is insisting that it lives inside some particular model of set theory. And then, from that standpoint, you could look over some other set theory and see how it was missing subsets that your theory had.”

Argh! No, damn it, I live in the set theory that really does have all the subsets, with no mysteriously missing subsets or mysterious extra numbers, or denumerable collections of all possible reals that could like totally map onto the integers if the mapping that did it hadn’t gone missing in the Australian outback -

“But everybody says that.”

Okay...

“Yeah?”

Screw set theory. I live in the physical universe where when you run a Turing machine, and keep watching forever in the physical universe, you never experience a time where that Turing machine outputs a proof of the inconsistency of Peano Arithmetic. Furthermore, I live in a universe where space is actually composed of a real field and space is actually infinitely divisible and contains all the points between A and B, rather than space only containing a denumerable number of points whose existence can be proven from the first-order axiomatization of the real numbers. So to talk about physics—forget about mathematics—I’ve got to use second-order logic.

“Ah. You know, that particular response is not one I have seen in the previous literature.”

Yes, well, I’m not a pure mathematician. When I ask whether I want an Artificial Intelligence to think in second-order logic or first-order logic, I wonder how that affects what the AI does in the actual physical universe. Here in the actual physical universe where times are followed by successor times, I strongly suspect that we should only expect to experience standard times, and not experience any nonstandard times. I think time is connected, and global connectedness is a property I can only talk about using second-order logic. I think that every particular time is finite, and yet time has no upper bound—that there are all finite times, but only finite times—and that’s a property I can only characterize using second-order logic.

“But if you can’t ever tell the difference between standard and nonstandard times? I mean, local properties of time can be described using first-order logic, and you can’t directly see global properties like ‘connectedness’ -”

But I can tell the difference. There are only nonstandard times where a proof-checking machine, running forever, outputs a proof of inconsistency from the Peano axioms. So I don’t expect to experience seeing a machine do that, since I expect to experience only standard times.

“Set theory can also prove PA consistent. If you use set theory to define time, you similarly won’t expect to see a time where PA is proven inconsistent—those nonstandard integers don’t exist in any model of ZF.”

Why should I anticipate that my physical universe is restricted to having only the nonstandard times allowed by a more powerful set theory, instead of nonstandard times allowed by first-order arithmetic? If I then talk about a nonstandard time where a proof-enumerating machine proves ZF inconsistent, will you say that only nonstandard times allowed by some still more powerful theory can exist? I think it’s clear that the way you’re deciding which experimental outcomes you’ll have to excuse is by secretly assuming that only standard times exist, regardless of which theory is required to narrow it down.

“Ah… hm. Doesn’t physics say this universe is going to run out of negentropy before you can do an infinite amount of computation? Maybe there’s only a bounded amount of time, like it stops before googolplex or something. That can be characterized by first-order theories.”

We don’t know that for certain, and I wouldn’t want to build an AI that just assumed lifespans had to be finite, in case it was wrong. Besides, should I use a different logic than if I’d found myself in Conway’s Game of Life, or something else really infinite? Wouldn’t the same sort of creatures evolve in that universe, having the same sort of math debates?

“Perhaps no universe like that can exist; perhaps only finite universes can exist, because only finite universes can be uniquely characterized by first-order logic.”

You just used the word ‘finite’! Furthermore, taken at face value, our own universe doesn’t look like it has a finite collection of entities related by first-order logical rules. Space and time both look like infinite collections of points—continuous collections, which is a second-order concept—and then to characterize the size of that infinity we’d need second-order logic. I mean, by the Löwenheim-Skolem theorem, there aren’t just denumerable models of first-order axiomatizations of the reals, there are also unimaginably large cardinal infinities which obey the same premises, and that’s a possibility straight out of H. P. Lovecraft. Especially if there are any things hiding in the invisible cracks of space.

“How could you tell if there were inaccessible cardinal quantities of points hidden inside a straight line? And anything that locally looks continuous each time you try to split it at a describable point can be axiomatized by a first-order schema for continuity.”

That brings up another point: Who’d really believe that the reason Peano arithmetic works on physical times is because that whole infinite axiom schema of induction, containing for every Φ a separate rule saying...

(Φ(0) ∧ (∀x: Φ(x) → Φ(Sx))) → (∀n: Φ(n))

...was used to specify our universe? How improbable would it be for an infinitely long list of rules to be true, if there wasn’t a unified reason for all of them? It seems much more likely that the real reason first-order PA works to describe time is that all properties which are true at a starting time, and true of the successor of any time where they’re true, are true of all later times; and this generalization over properties makes induction hold for first-order formulas as a special case. If my native thought process were first-order logic, I wouldn’t see the connection between each individual formula in the axiom schema—it would take separate evidence to convince me of each one—they would feel like independent mathematical facts. But after doing scientific induction over the fact that many properties true at zero, with succession preserving truth, seem to be true everywhere—after generalizing the simple, compact second-order theory of numbers and times—then you could invent an infinite first-order theory to approximate it.
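The ‘unified reason’ here is the single second-order induction axiom, which quantifies over all properties at once instead of listing one rule per formula:

```latex
% Second-order induction: one finite axiom quantifying over all properties P
\forall P:\; \bigl( P(0) \wedge (\forall x:\; P(x) \rightarrow P(Sx)) \bigr)
  \rightarrow (\forall n:\; P(n))
% Each instance of the first-order schema above is recovered by
% substituting a particular formula \Phi for the property variable P.
```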

“Maybe that just says you need to adjust whatever theory of scientific induction you’re using, so that it can more easily induct infinite axiom schemas.”

But why the heck would you need to induct infinite axiom schemas in the first place, if Reality natively ran on first-order logic? Isn’t it far more likely that the way we ended up with these infinite lists of axioms was that Reality was manufactured—forgive the anthropomorphism—that Reality was manufactured using an underlying schema in which time is a connected series of events, and space is a continuous field, and these are properties which happen to require second-order logic for humans to describe? I mean, if you picked out first-order theories at random, what’s the chance we’d end up inside an infinitely long axiom schema that just happened to look like the projection of a compact second-order theory? Aren’t we ignoring a sort of hint?

“A hint to what?”

Well, I’m not that sure myself, at this depth of philosophy. But I would currently say that finding ourselves in a physical universe where times have successor times, and space looks continuous, seems like a possible hint that the Tegmark Level IV multiverse—or the way Reality was manufactured, or whatever—might look more like causal universes characterizable by compact second-order theories than causal universes characterizable by first-order theories.

“But since any second-order theory can just as easily be interpreted as a many-sorted first-order theory with quantifiers that can range over either elements or sets of elements, how could using second-order syntax actually improve an Artificial Intelligence’s ability to handle a reality like that?”

Good question. One obvious answer is that the AI would be able to induct what you would call an infinite axiom schema as a single axiom—a simple, finite hypothesis.

“There’s all sorts of obvious hacks to scientific induction of first-order axioms which would let you assign nonzero probability to computable infinite sequences of axioms—”

Sure. So beyond that… I would currently guess that the basic assumption behind ‘behaving as if’ second-order logic is true says that the AI should act as if only the ‘actually smallest’ numbers will ever appear in physics, relative to some ‘true’ mathematical universe that it thinks it lives in, but knows it can’t fully characterize. Which is roughly what I’d say human mathematicians do when they take second-order logic at face value; they assume that there’s some true mathematical universe in the background, and that second-order logic lets them talk about it.

“And what behaviorally, experimentally distinguishes the hypothesis ‘I live in the true ultimate math with fully populated powersets’ from the hypothesis ‘There’s some random model of first-order set-theory axioms I happen to be living in’?”

Well… one behavioral consequence is suspecting that your time obeys an infinitely long list of first-order axioms with induction schemas for every formula. And then moreover believing that you’ll never experience a time when a proof-checking machine outputs a proof that Zermelo-Fraenkel set theory is inconsistent—even though there are provably some models with times like that, which fit the axiom schema you just inducted. That sounds like secretly believing that there’s a background ‘true’ set of numbers that you think characterizes physical time, and that it’s the smallest such set. Another suspicious behavior is that as soon as you suspect Zermelo-Fraenkel set theory is consistent, you suddenly expect not to experience any physical time which ZF proves isn’t a standard number. You don’t think you’re in the nonstandard time of some weaker theory like Peano arithmetic. You think you’re in the minimal time expressible by any and all theories, so as soon as ZF can prove some number doesn’t exist in the minimal set, you think that ‘real time’ lacks such a number. All of these sound like behaviors you’d carry out if you thought there was a single ‘true’ mathematical universe that provided the best model for describing all physical phenomena, like time and space, which you encounter—and believed that this ‘true’ backdrop used the largest powersets and smallest numbers.

“How exactly do you formalize all that reasoning there? I mean, how would you actually make an AI reason that way?”

Er… I’m still working on that part.

“That makes your theory a bit hard to criticize, don’t you think? Personally, I wouldn’t be surprised if any such formalized reasoning looked just like believing that you live inside a first-order set theory.”

I suppose I wouldn’t be too surprised either—it’s hard to argue with the results on many-sorted logics. But if comprehending the physical universe is best done by assuming that real phenomena are characterized by a ‘true’ mathematics containing the powersets and the natural numbers—and thus expecting that no mathematical model we can formulate will ever contain any larger powersets or smaller numbers than those of the ‘true’ backdrop to physics—then I’d call that a moral victory for second-order logic. In first-order logic we aren’t even supposed to be able to talk about such things.

“Really? To me that sounds like believing you live inside a model of first-order set theory, and believing that all models of any theories you can invent must also be sets in the larger model. You can prove the Completeness Theorem inside ZF plus the Axiom of Choice, so ZFC already proves that all consistent theories have models which are sets, although of course it can’t prove that ZFC itself is such a theory. So—anthropomorphically speaking—no model of ZFC expects to encounter a theory that has fewer numbers or larger powersets than itself. No model of ZFC expects to encounter any quoted-model, any set that a quoted theory entails, which contains larger powersets than the ones in its own Powerset Axiom. A first-order set theory can even make the leap from the finite statement ‘P is true of all my subsets of X’ to believing ‘P is true of all my subsets of X that can be described by this denumerable collection of formulas’—it can encompass the jump from a single axiom over ‘all my subsets’ to a quoted axiom schema over formulas. I’d sum all that up by saying, ‘second-order logic is how first-order set theory feels from the inside’.”

Maybe. Even in the event that neither human nor superintelligent cognition will ever be able to ‘properly talk about’ unbounded finite times, global connectedness, particular infinite cardinalities, or true spatial continuity, it doesn’t follow that Reality is similarly limited in which physics it can privilege.

Part of the sequence Highly Advanced Epistemology 101 for Beginners

Previous post: “Gödel’s Completeness and Incompleteness Theorems”

• I kept expecting someone to object that “this Turing machine never halts” doesn’t count as a prediction, since you can never have observed it to run forever.

• Then the statement “this Turing machine halts for every input” doesn’t count as a prediction either, because you can never have observed it for every input, even if the machine is just “halt”. And the statement “this Turing machine eventually halts” is borderline, because its negation doesn’t count as a prediction. What does this give us?

• Perhaps there is a type theory for predictions, with concrete predictions like “The bus will come at 3 o’clock”, and functions that output concrete predictions like “Every Monday, Wednesday and Friday, the bus will come at 3 o’clock” (consider the statement as a function taking a time and returning a concrete prediction, yes or no).

An ultrafinitist would probably not argue with the existence of such a function, even though to someone other than an ultrafinitist, the function looks like it is quantifying over all time. From the ultrafinitist’s point of view, you’re going to apply it to some concrete time, and at that point it’s going to output some concrete prediction.

• Ultrafinitism doesn’t seem relevant here. The “type theory” you’re thinking of seems to be just the arithmetical hierarchy. A “concrete prediction” is a statement with only bounded quantifiers, “this Turing machine eventually halts” is a statement with one existential quantifier, “this Turing machine halts for every input” is a statement with one universal and one existential quantifier, and so on. Or am I missing something?

• The arithmetical hierarchy is presuming a background of predicate logic; I was not presuming that. Yes, the type theory that I was gesturing towards would have some similarity to the arithmetical hierarchy.

I was trying to suggest that the answer to “what is a prediction” might look like a type theory of different variants of a prediction. Perhaps a linear hierarchy like the arithmetical hierarchy, yes, perhaps something more complicated. There could be a single starting type “concrete prediction” and a type constructor that, given a source type and a destination type, gives the type of statements defining functions that take arguments of the source type and return values of the destination type.

The intuitive bridge for me from ultrafinitism to the question “what counts as a prediction?” is that ultrafinitists do happily work with entities like 2^1000 considered as a structure like (“^” “2” “1000”), even if they deny that those structures refer to things that are definitely numbers (of course, they can agree that they are definitely “numbers”, given an appropriately finitist definition of “number”). Maybe extreme teetotalers regarding what counts as a prediction would happily work with things such as computable functions returning predictions, even if they don’t consider them predictions.

• Actually, I think “This Turing machine halts” is the more obviously troublesome one. It gives no computable expectation of when we might observe a halt. (Any computable probability distribution we would assign to halting time would put too much mass at small times as compared to the true distribution of halting times.)

• If you take this objection seriously, then you should also take issue with predictions like “nobody will ever transmit information faster than the speed of light”, or things like it. After all, you can never actually observe the laws of physics to have been stable and universal for all time.

If nothing else, you can consider each as being a compact specification of an infinite sequence of testable predictions: “doesn’t halt after one step”, “doesn’t halt after two steps”, … “doesn’t halt after n steps”.
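That unrolling is easy to make concrete. In the sketch below, a “machine” is just a Python generator that yields once per step (an illustrative stand-in, not a real Turing machine encoding; all names here are invented for the example): each call to halts_within is a finite, falsifiable check, and “it never halts” is the infinite conjunction of their negations.

```python
def runs_forever():
    """A toy 'machine' that loops forever, yielding once per step."""
    n = 0
    while True:
        yield n
        n += 1

def halts_within(machine, steps):
    """Finite, testable check: does the machine halt within `steps` steps?"""
    it = machine()
    for _ in range(steps):
        try:
            next(it)
        except StopIteration:
            return True  # the machine halted within the step budget
    return False  # no halt observed yet: the bounded prediction held

# Each entry is one observable, falsifiable prediction; the universal
# claim "it never halts" is the infinite conjunction of all of them.
predictions = [not halts_within(runs_forever, n) for n in (1, 10, 100)]
```

Each bounded check is decidable by plain simulation; only the full infinite conjunction is out of reach.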

• Technically speaking, you can observe the loop encoded in the Turing machine’s code somewhere—every nonhalting Turing machine has some kind of loop. The Halting theorems say that you cannot write down any finite program which will recognize and identify any infinite loop, or deductively prove the absence thereof, in bounded time. However, human beings don’t have finite programs, and don’t work by deduction, so I suspect, with a sketch of mathematical grounding, that these problems simply don’t apply to us in the same way they apply to regular Turing machines.

EDIT: To clarify, human minds aren’t “magic” or anything: the analogy between us and regular Turing machines with finite input and program tape just isn’t accurate. We’re a lot closer to inductive Turing machines or generalized Turing machines. We exhibit nonhalting behavior by design and have more-or-less infinite input tapes.

• The Halting theorems say that you cannot write down any finite program which will recognize and identify any infinite loop

I think I would use “every” where you use “any.”

• Let’s just use quantifiers.

~exist P, forall M i, P (M, i) \/ ~P (M, i).


Where P is a finite program running in finite time, M is a Turing machine, and i is an input string.

• I don’t understand your notation. You have a disjunction of a formula and its negation, which is tautologically true, hence the whole statement is tautologically false.

P is a program; what meaning is intended by the predicate P(M,i)? In fact, what work is the object P doing in that formula? A Turing machine M is given an input string i; there is no other “program” involved.

• No. These aren’t “loops” in any meaningful sense of the word. Failing to halt and going into an infinite loop are not the same thing.

• Failing to halt and going into an infinite loop are not the same thing.

I’d appreciate some explanation on that, to see if you’re saying something I haven’t heard before or if we’re talking past each other. I don’t just include while(true); under “infinite loop”, I also include infinitely-expanding recursions that cannot be characterized as coinductive stepwise computations. Basically, anything that would evaluate to the ⊥ type in type theory counts for “infinite loop” here, just plain computational divergence.

• I think that what Joshua was talking about by ‘infinite loop’ is ‘passing through the same state an infinite number of times.’ That is, a loop, rather than just a line with no endpoint. Although this would rule out (some arbitrary-size int type) x = 0; while(true){ x++; } on a machine with infinite memory, as it would never pass through the same state twice. So maybe I’m still misunderstanding.

• Ok. By that definition, yes these are the same thing. I don’t think this is a standard notion of what people mean by infinite loop though.

• every nonhalting Turing machine has some kind of loop

Formal proof needed; I in fact expect there to be something analogous to a Penrose tiling.

Moreover, to adapt Keynes’ apocryphal quote, a Turing machine can defer its loop for longer than you can ponder it.

And finally, as a general note, if you find that your proof that human beings can solve the halting problem can’t be made formal and concise, you might consider the possibility that your intuition is just plain wrong. It is in fact relevant that theoretical computer scientists seem to agree that the halting problem is not solvable by physical processes in the universe, including human beings.

• And finally, as a general note, if you find that your proof that human beings can solve the halting problem can’t be made formal and concise, you might consider the possibility that your intuition is just plain wrong.

I didn’t say that it can’t be made formal. I said that the formalism isn’t concise enough for one blog comment. Most normal journal/conference papers are about ten pages long, so that’s the standard of concision you should use for a claimed formalism in theoretical CS.

And indeed, I think I can fit my formalism, when it’s done, into ten pages.

• If anyone wants a go, here’s a Turing machine to try on. Does it halt?

Written in Python for the convenience of coders.

• To clarify, human minds aren’t “magic” or anything: the analogy between us and regular Turing machines with finite input and program tape just isn’t accurate.

Actually, the standard Turing machine has an infinite tape. And, the role of the tape in a Turing machine is equivalent to the input, output and working memory of a computer; the Turing machine’s “program” is the finite state machine.

However, human beings don’t have finite programs, and don’t work by induction, so I suspect, with a sketch of mathematical grounding, that these problems simply don’t apply to us in the same way they apply to regular Turing machines.

It seems to me that the above suggests that humans are capable of cognitive tasks that Turing machines are not. But if this is true, it follows that human-level AGI is not achievable via a computer, since computers are thought to be equivalent in power to Turing machines.

• First of all, swapped “induction” to “deduction”, because I’m a moron.

And, the role of the tape in a Turing machine is equivalent to the input, output and working memory of a computer; the Turing machine’s “program” is the finite state machine.

There’s no real semantic distinction between the original contents of the One Tape, or the finite contents of the Input Tape, or an arbitrarily complicated state-machine program, actually. You can build tape data for a Universal Turing Machine to simulate any other Turing machine.

It seems to me that the above suggests that humans are capable of cognitive tasks that Turing machines are not. But if this is true, it follows that human-level AGI is not achievable via a computer, since computers are thought to be equivalent in power to Turing machines.

No. You’re just making the analogy between Turing machines and real, physical computing devices overly strict. A simple Turing machine takes a finite input, has a finite program, and either processes for finite time before accepting (possibly with something on an output tape or something like that), or runs forever.

Real, physical computing devices, both biological and silicon, run coinductive (infinite-loop-until-it-isn’t) programs all the time. Every operating system kernel, or message loop, or game loop is a coinductive program: its chief job is to iterate forever, taking a finite time to process I/O in each step. Each step performs some definite amount of semantic work, but there is an indefinite number of steps (generally: until a special “STOP NOW” input is given). To reiterate: because these programs both loop infinitely and perform I/O (with the I/O data not being computable from anything the program has when it begins running, not being “predictable” in any sense by the program), they are physically-realizable programs that simply don’t match the neat analogy of normal Turing machines.

Likewise, human beings loop infinitely and perform I/O. Which more-or-less gives away half my mathematical sketch of how we solve the problem!

• To reiterate: because these programs both loop infinitely and perform I/O (with the I/O data not being computable from anything the program has when it begins running, not being “predictable” in any sense by the program), they are physically-realizable programs that simply don’t match the neat analogy of normal Turing machines.

They match the neat analogy of Turing machines that start with possibly infinitely many non-blank squares on their tapes. All the obvious things you might like to know are still undecidable. Is this machine guaranteed to eventually read every item of its input? Is there any input that will drive the machine into a certain one of its states? Will the machine ever write a given finite string? Will it eventually have written every finite prefix of a given infinite string? Will it ever halt? These are all straightforwardly reducible to the standard halting problem.

Or perhaps you would add to the model an environment that responds to what the machine does. In that case, represent the environment by an oracle (not necessarily a computable one), and define some turn-taking mechanism for the machine to ask questions and receive answers. All of the interesting questions remain undecidable.

• I agree with this:

There’s no real semantic distinction between the original contents of the One Tape, or the finite contents of the Input Tape, or an arbitrarily complicated state-machine program, actually. You can build tape data for a Universal Turing Machine to simulate any other Turing machine.

and with this:

Real, physical computing devices, both biological and silicon, run coinductive (infinite-loop-until-it-isn’t) programs all the time. Every operating system kernel, or message loop, or game loop is a coinductive program: its chief job is to iterate forever, taking a finite time to process I/O in each step. Each step performs some definite amount of semantic work, but there is an indefinite number of steps (generally: until a special “STOP NOW” input is given).

However, I don’t see how you are going to be able to use these facts to solve the halting problem. I’m guessing that, rather than statically examining the Turing machine, you will execute it for some period of time and study the execution, and after some finite amount of time, you expect to be able to correctly state whether or not the machine will halt. Is that the basic idea?

• I’m guessing that, rather than statically examining the Turing machine, you will execute it for some period of time and study the execution, and after some finite amount of time, you expect to be able to correctly state whether or not the machine will halt. Is that the basic idea?

Basically, yes. “Halting Empiricism”, you could call it. The issue is precisely that you can’t do empirical reasoning via deduction from a fixed axiom set (i.e., a fixed, finite program halts x : program → boolean). You need to do it by inductive reasoning, instead.

• You need to do it by inductive reasoning, instead.

What is inductive reasoning? Bayesian updating? Or something else?

• My reply to you is the same as my reply to g_pepper: it’s easier for me to just do my background research, double-check everything, and eventually publish the full formalism than it is to explain all the details in a blog comment.

You are also correct to note that whatever combination of machine, person, input tape, and empirical data I provide, the X Machine can never solve the Halting Problem for the X Machine. The real math involved in my thinking here involves demonstrating the existence of an ordering: there should exist a sense in which some machines are “smaller” than others, and A can solve B’s halting problem when A is “strictly larger” than B, possibly strictly larger by some necessary amount.

(Of course, this already holds if A has an nth-level Turing Oracle, B has an mth-level Turing Oracle, and n > m, but that’s trivial from the definition of an oracle. I’m thinking of something that actually concerns physically realizable machines.)

Like I said: trying to go into extensive detail via blog comment will do nothing but confusingly unpack my intuitions about the problem, increasing net confusion. The proper thing to do is formalize, and that’s going to take a bunch of time.

• This is interesting. Do you have an outline of how a person would go about determining whether a Turing machine will halt? If so, I would be interested in seeing it. Alternatively, if you have a more detailed argument as to why a person will be able to determine whether an arbitrary Turing machine will halt, even if that argument does not contain the details of how the person would proceed, I would be interested in seeing that.

Or, are you just making the argument that an intelligent person ought to be able in every case to use some combination of inductive reasoning, creativity, mathematical intuition, etc., to correctly determine whether or not a Turing machine will halt?

• My reply to you is the same as my reply to Richard Kennaway: it’s easier for me to just do my background research, double-check everything, and eventually publish the full formalism than it is to explain all the details in a blog comment.

• Fair enough. I am looking forward to seeing the formalism!

• It’s possible to compute whether each machine halts using an inductive Turing machine like this:

    initialize the output tape to all zeros, representing the assertion that no Turing machine halts
    for i = 1 to infinity:
        for j = 1 to i:
            run Turing machine j for i steps
            if it halted: set bit j of the output tape to 1

Is this what you meant? If so, I’m not sure what this has to do with observing loops.

When you say that every nonhalting Turing machine has some kind of loop, do you mean the kind of loop that many halting Turing machines also contain?

• When you say that every nonhalting Turing machine has some kind of loop, do you mean the kind of loop that many halting Turing machines also contain?

No, I mean precisely the fact that it doesn’t halt. You can think of it as an infinitely growing recursion if that helps, but what’s in question is really precisely the nonhalting behavior.

I’m going to make a Discussion post about the matter, since Luke wants me to share the whole informal shpiel on the subject.

• It is possible to make a Turing machine that returns “HALT”, “RUNS FOREVER” or “UNSURE” and is never wrong. If the universe as we know it runs on quantum mechanics, it should be possible to simulate a mathematician in a box with unlimited data storage and time, using a finite Turing machine. This would imply the existence of some problem that would leave the mathematician stuck forever. There are plenty of Turing machines where we have no idea if they halt. If you describe the quantum state of the observable universe to Planck-length resolution, what part of humans’ supposedly infinite minds is not described in these 2^2^100 bits of information?

• “This Turing machine won’t halt in 3^^^3 steps” is a falsifiable prediction. Replace 3^^^3 with whatever number is enough to guarantee whatever result you need.

Edit: But you’re right.

• “This Turing machine won’t halt in 3^^^3 steps” is a falsifiable prediction.

Really? First, 3^^^3 is an absurdly large number of steps, many more steps than you could feasibly go through before the heat death of the universe or whatever horrible fate awaits us. Second, after enough steps your confidence in the outcome of any computation can be made arbitrarily low because of possible errors, e.g. errors caused by cosmic rays.

It seemed to me that ultrafinitism wasn’t his true rejection.

Edit: But you’re right.

• Okay, non-ultrafinitist objection:

Replace 3^^^3 with whatever number is enough to guarantee whatever result you need.

But how do you know what number that is? The function “the number of steps needed to guarantee that a Turing machine won’t halt” is not a computable function of Turing machines, because if it were you could solve the halting problem. So how do you compute a function that’s, y’know, not computable?

• First of all, I shouldn’t have used the number 3^^^3. That was a stupid idea, and I’ve forgotten what the notation means. So just replace that with “generic really big number”, let’s just say a googolplex (and, yes, I’m aware that’s nowhere close to 3^^^3).

To answer your question, I doubt FAI is going to care about whether Turing machine X halts; it’s going to be looking for a proof of a certain result regarding its own algorithms. It’s possible that it only needs to prove that its algorithms will work for “generic really big number” amount of steps, and thus could use a bounded quantifier instead of a universal quantifier. Using bounded quantifiers makes your system much weaker, but also moves it closer to being decidable (I mean this in a metaphorical sense; the system still won’t be decidable). Also, regarding searching for proofs, we’ll have to set an upper bound for how long we search, as well. So we need to specify some specific number in that context, too.

Specifically which numbers to select is something I have no idea about; I’m not an AI researcher. Maybe I’ll have some kind of idea (but very probably not) when a more formal problem specification is given, but this article is still discussing theoreticals, not applications yet.

• Are you aware of the busy beaver function? Read this.

Basically, it’s impossible to write down numbers large enough for that to work.

• Interesting article! What do you mean, though? Are you saying that Knuth’s triple up-arrow is uncomputable (I don’t see why that would be the case, but I could be wrong)?

• No, Solvent is making the same point as in my answer: in order to write down large enough numbers to guarantee that Turing machines don’t halt, you need to write down a function larger than the busy beaver function, and any such function fails to be computable because you can use it to solve the halting problem.

• Basically, the busy beaver function tells us the maximum number of steps that a halting Turing machine with a given number of states and symbols can run for. If we know the busy beaver of, for example, 5 states and 5 symbols, then we can tell you if any 5-state, 5-symbol Turing machine will eventually halt.

However, you can see why it’s impossible in general to find the busy beaver function: you’d have to know which Turing machines of a given size halted, which is in general impossible.
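The decision procedure being described can be sketched in a few lines: given a true upper bound on the halting times of machines of a given size, deciding halting reduces to bounded simulation. Everything below is an illustrative sketch: the machine encoding and names are invented for this example, and the bound of 6 steps for the 2-state, 2-symbol champion is the standard published busy beaver value, assumed here rather than derived.

```python
def run(tm, bound):
    """Simulate a Turing machine given as a dict mapping (state, symbol)
    -> (symbol_to_write, head_move, next_state). Return the step count
    if it reaches the halt state 'H' within `bound` steps, else None."""
    tape, head, state = {}, 0, 'A'
    for step in range(1, bound + 1):
        write, move, state = tm[(state, tape.get(head, 0))]
        tape[head] = write
        head += move
        if state == 'H':
            return step
    return None

def halts(tm, step_bound_for_size):
    """Decide halting, assuming `step_bound_for_size` really is an upper
    bound on halting times for machines of this size -- exactly what the
    busy beaver function would supply if we could compute it."""
    return run(tm, step_bound_for_size) is not None

# The standard 2-state, 2-symbol busy beaver champion: halts in 6 steps.
bb2 = {('A', 0): (1, +1, 'B'), ('A', 1): (1, -1, 'B'),
       ('B', 0): (1, -1, 'A'), ('B', 1): (1, +1, 'H')}

# A machine that plainly never halts, decided by the same bound.
forever = {('A', 0): (0, +1, 'A'), ('A', 1): (0, +1, 'A')}
```

With the bound in hand, halts(bb2, 6) is True and halts(forever, 6) is False; the whole difficulty, as the comment says, is that no computable function supplies such bounds in general.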

• One of the participants in this dialogue seems too concerned with pinning down models uniquely and also seems too convinced he knows what model he’s in. Suppose we live in a simulation which is being run by superbeings who have access to oracles that can tell them when Turing machines are attempting to find contradictions in PA. Whenever they detect that something in the simulation is attempting to find contradictions in PA, that part of the simulation mysteriously stops working after the billionth or trillionth step or something. Then running such Turing machines can’t tell us whether we live in a universe where PA is consistent or not.

I also wish both participants in the dialogue would take ultrafinitism more seriously. It is not as wacky as it sounds, and it seems like a good idea to be conservative about such things when designing AI.

Edit: Here is an ultrafinitist fable that might be useful or at least amusing, from the link.

I have seen some ultrafinitists go so far as to challenge the existence of 2^100 as a natural number, in the sense of there being a series of ‘points’ of that length. There is the obvious ‘draw the line’ objection, asking where in 2^1, 2^2, 2^3, … , 2^100 do we stop having ‘Platonistic reality’? Here this … is totally innocent, in that it can easily be replaced by 100 items (names) separated by commas. I raised just this objection with the (extreme) ultrafinitist Yesenin-Volpin during a lecture of his.

He asked me to be more specific. I then proceeded to start with 2^1 and asked him whether this is ‘real’ or something to that effect. He virtually immediately said yes. Then I asked about 2^2, and he again said yes, but with a perceptible delay. Then 2^3, and yes, but with more delay. This continued for a couple of more times, till it was obvious how he was handling this objection. Sure, he was prepared to always answer yes, but he was going to take 2^100 times as long to answer yes to 2^100 than he would to answering 2^1. There is no way that I could get very far with this.

• I also wish both participants in the dialogue would take ultrafinitism more seriously.

For what it’s worth, I’m an ultrafinitist. Since 2005, at least as far as I’ve been able to tell.

• How long do you expect to stay an ultrafinitist?

• Until I’m destroyed, of course!

… but since Qiaochu asked that we take ultrafinitism seriously, I’ll give a serious answer: something else will probably replace ultrafinitism as my preferred (maximum a posteriori) view of math and the world within 20 years or so. That is, I expect to determine that the question of whether ultrafinitism is true is not quite the right question to be asking, and have a better question by then, with a different best guess at the answer… just because similar changes of perspective have happened to me several times already in my life.

• Is that because 2005 is as far from the present time as you dare to go?

• What precisely do the Overflowers (and the mathematician reporting that anecdote) mean by the “existence” of a number?

For instance, I see that the anecdote-reporter refers to there being a series of points of a particular length, but I assume they don’t mean that in an intuitive, literal sense: there are certainly at least 2^100 Planck lengths between me and the other end of the room.

• I am not sure. If I tabooed “exist,” then my best guess is that ultrafinitists would argue that statements involving really big numbers are not meaningful. For example, they might argue that such statements are not verifiable in the real world. (Edit: as another example, as I mentioned in another comment, ultrafinitists might argue that you cannot count to really big numbers.)

For instance, I see that the anecdote-reporter refers to there being a series of points of a particular length, but I assume they don’t mean that in an intuitive, literal sense: there are certainly at least 2^100 Planck lengths between me and the other end of the room.

Yes, but just barely: 2^100 Planck lengths is something like 2 x 10^{-5} meters, so substitute 2^1000 Planck lengths, which is substantially larger than the diameter of the universe.
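The arithmetic is quick to double-check. In this sketch the constants are the usual approximate values for the Planck length and the observable universe’s diameter, supplied here as assumptions rather than taken from the thread:

```python
# Approximate physical constants (assumed values for this check):
# Planck length ~1.616e-35 m; observable-universe diameter ~8.8e26 m.
PLANCK_LENGTH_M = 1.616e-35
UNIVERSE_DIAMETER_M = 8.8e26

# 2^100 Planck lengths: roughly 2e-5 m, i.e. about 20 microns.
span_2_100 = 2**100 * PLANCK_LENGTH_M

# 2^1000 Planck lengths: a float around 1.7e266 m, vastly exceeding
# the diameter of the observable universe.
span_2_1000 = 2**1000 * PLANCK_LENGTH_M

fits_in_a_room = span_2_100 < 1.0
dwarfs_universe = span_2_1000 > UNIVERSE_DIAMETER_M
```

So the comment’s “just barely” is right: 2^100 Planck lengths is a dust-mote scale, while 2^1000 overshoots the universe by hundreds of orders of magnitude.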

• Seems weird to think that some of the possible configurations of bits on my 1.5TB hard drive don’t exist. Which ones? I hope none of the really good collections of pr0n are logically unreachable.

If that number does exist, then what about really big busy beaver numbers, like bb(2^10^13)? They’re just a series of computations on hard drive contents. And that number is so close to infinity that we might as well just step from ultrafinitism to plain old finitism.

• While I am not an ultrafinitist, I believe the idea is meant to be this: it is not valid to talk about those numbers, because there is no meaningful thing you can do with those numbers that can affect the real world. Therefore, the ultrafinitists say that it is not really logical to treat these numbers as “existing”, as they cannot affect the real world at all, and why say that something exists if it cannot affect anything at all?

• I hope none of the really good collections of pr0n are logically unreachable.

This seems incredibly likely, doesn’t it? (As long as we are happy to bound ‘logically reachable’ to within the observable universe.)

• Seems weird to think that some of the possible configurations of bits on my 1.5TB hard drive don’t exist.

Would you like to go through all of them just to be sure? How long do you think that will take you?

what about really big busy beaver numbers, like bb(2^10^13)? They’re just a series of computations on hard drive contents.

Trying to actually compute a sufficiently large busy beaver number, you’ll run into the problem that there won’t be enough material in the observable universe to construct the corresponding Turing machines and/or that there won’t be enough usable energy to power them for the required lengths of time and/or that the heat death of the universe will occur before the required lengths of time. If there’s no physical way to go through the relevant computations, there’s no physical sense in which the relevant computations output a result.

• It may not be possible to check all of them, but it certainly is possible to check one of them… any one of them. And whichever one you choose to check, you’ll find that it exists. So if you claim that some of the possible configurations don’t exist, you’re claiming they’d have to be among the ones you don’t choose to check. But wait, this implies that your choice of which one(s) to check somehow affects which ones exist. It sure would be spooky if that somehow turns out to be the case, which I doubt.

• Exactly. And I could make my choice of which pr0n library to check—or which 1.5TB Turing machine to run—dependent on 10^13 quantum coinflips; which, while it would take a while, seems physically realizable.

• Help me out here…

One of the participants in this dialogue … seems too convinced he knows what model he’s in.

I can imagine living in a simulation… I just don’t understand yet what you mean by living in a model in the sense of logic and model theory, because a model is a static thing. I heard someone once before talk about “what are we in?”, as though the physical universe were a model, in the sense of model theory. He wasn’t able to operationalize what he meant by it, though. So, what do you mean when you say this? Are you considering the physical universe a first-order structure somehow? If so, how? And concerning its role as a model, what formal system are you considering it a model of?

• That was imprecise, but I was trying to comment on this part of the dialogue using the language that it had established:

Argh! No, damn it, I live in the set theory that really does have all the subsets, with no mysteriously missing subsets or mysterious extra numbers, or denumerable collections of all possible reals that could like totally map onto the integers if the mapping that did it hadn’t gone missing in the Australian outback -

I was also commenting on this part:

Screw set theory. I live in the physical universe where when you run a Turing machine, and keep watching forever in the physical universe, you never experience a time where that Turing machine outputs a proof of the inconsistency of Peano Arithmetic.

The point I was trying to make, and maybe I did not use sensible words to make it, is that This Guy (I don’t know what his name is—who writes a dialogue with unnamed participants, by the way?) doesn’t actually know that, for two reasons: first, Peano arithmetic might actually be inconsistent, and second, even if it were consistent, there might be some mysterious force preventing us from discovering this fact.

I just don’t understand yet what you mean by living in a model in the sense of logic and model theory, because a model is a static thing.

Models being static is a matter of interpretation. It is easy to write down a first-order theory of discrete dynamical systems (sets equipped with an endomap, interpreted as a successor map which describes the state of a dynamical system at time t + 1 given its state at time t). If time is discretized, our own universe could be such a thing, and even if it isn’t, cellular automata are such things. Are these “static” or “dynamic”?
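The “set equipped with an endomap” description is concrete enough to write down. In the toy sketch below (the three-element state set and all names are invented for illustration), the same object reads as dynamic when the endomap is iterated through time, and as static when viewed all at once as the graph of the map:

```python
# A discrete dynamical system is just a state set plus an endomap.
states = {0, 1, 2}

def successor(s):
    """The endomap: state at time t+1 as a function of state at time t."""
    return (s + 1) % 3

def trajectory(start, steps):
    """The 'dynamic' reading: iterate the endomap through time."""
    out = [start]
    for _ in range(steps):
        out.append(successor(out[-1]))
    return out

# The 'static' reading is the very same object viewed timelessly: the
# graph of the endomap, a plain set of ordered (state, next_state) pairs.
graph = {(s, successor(s)) for s in states}
```

Whether this system is “static” or “dynamic” is exactly the interpretive choice the comment describes: trajectory unfolds it in time, graph presents it as a single finished structure.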

• Argh! No, damn it, I live in the set the­ory that re­ally does have all the sub­sets, with no mys­te­ri­ously miss­ing sub­sets or mys­te­ri­ous ex­tra num­bers,

In­deed, I think it’s some­what un­clear what is meant here. The speaker at­tempts to re­late it to physics, refer­ring to the idea that we ap­pear to live in con­tin­u­ous space… but how does the speaker pro­pose to rule out in­finites­i­mals and other non­stan­dard en­tities? (The speaker only seems to in­di­cate hor­ror about dev­ils liv­ing in the cracks.) Or, for that mat­ter, countable mod­els of the re­als, as some­one already men­tioned. This isn’t di­rectly re­lated to the ques­tion of what set the­ory is true, what set the­ory we live in, etc… (Per­haps the speaker’s in­ten­tion in this line was to as­sume that we live in a Teg­mark mul­ti­verse, so that we liter­ally do live in some set the­ory?)

Instead, I think the speaker should have argued that we can refer to this state of affairs, not that it must be the true state of affairs. To give another example, I'm not at all convinced that time must correspond to the standard model of the natural numbers (I'm not even sure it doesn't loop back on itself eventually, though I agree that causal models disallow this and I find it improbable for that reason). Yet I'm (relatively) happy to say that we can at least refer to this as a possible state of affairs. (Perhaps with a qualifier: "Unless Peano arithmetic turns out to be inconsistent...")

• That was imprecise, but I was trying to comment on this part of the dialogue using the language that it had established.

Ah, I was ask­ing you be­cause I thought us­ing that lan­guage meant you’d made sense of it ;) The lan­guage of us “liv­ing in a (model of) set the­ory” is some­thing I’ve heard be­fore (not just from you and Eliezer), which made me think I was miss­ing some­thing. Us liv­ing in a dy­nam­i­cal sys­tem makes sense, and a dy­nam­i­cal sys­tem can con­tain a model of set the­ory, so at least we can “live with” mod­els of set the­ory… we in­ter­act with (parts of) mod­els of set the­ory when we play with col­lec­tions of phys­i­cal ob­jects.

Models be­ing static is a mat­ter of in­ter­pre­ta­tion.

Of course, time has been a fourth di­men­sion for ages ;) My point is that set the­ory doesn’t seem to have a rea­son­able dy­nam­i­cal in­ter­pre­ta­tion that we could live in, and I think I’ve con­cluded it’s con­fus­ing to talk like that. I can only make sense of “liv­ing with” or “be­liev­ing in” mod­els.

• Set the­ory doesn’t have a dy­nam­i­cal in­ter­pre­ta­tion be­cause it’s not causal, but finite causal sys­tems have first-or­der de­scrip­tions and in­finite causal sys­tems have sec­ond-or­der de­scrip­tions. Not ev­ery­thing log­i­cal is causal; ev­ery­thing causal is log­i­cal.

• I like your pro­posal, but why not just stan­dard fini­tism? What is your ob­jec­tion to prim­i­tive re­cur­sive ar­ith­metic?

As a side note, I per­son­ally con­sider a the­ory “safe” if it fits with my in­tu­itions and can be proved con­sis­tent us­ing prim­i­tive re­cur­sive ar­ith­metic and trans­finite in­duc­tion up to an or­di­nal I can pic­ture. So I think of Peano ar­ith­metic as safe, since I can pic­ture ep­silon-zero.

• This isn't my objection personally, but a sufficiently extreme ultrafinitist rejects the principle of induction.

• I think it’s more that an ul­tra­fini­tist claims not to know that suc­ces­sor is a to­tal func­tion—you could still in­duct for as long as suc­ces­sion lasts. Though this is me guess­ing, not some­thing I’ve read.

• Either that, or they claim that cer­tain num­bers, such as 3^^^3, can­not be reached by any num­ber of iter­a­tions of the suc­ces­sor func­tion start­ing from 0. It’s dis­prov­able, but with­out in­duc­tion or cut it’s too long a proof for any com­puter in the uni­verse to do in a finite hu­man lifes­pan, or even the earth’s lifes­pan.
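For readers unfamiliar with the notation, 3^^^3 is Knuth up-arrow notation. A small sketch of the definition (only the tiny cases are actually computable):

```python
def hyper(a, n, b):
    """Knuth up-arrow: a (n arrows) b. One arrow is exponentiation; each
    extra arrow iterates the previous operation b - 1 times."""
    if n == 1:
        return a ** b
    if b == 0:
        return 1
    return hyper(a, n - 1, hyper(a, n, b - 1))

# 3^3 = 27, and 3^^3 = 3^(3^3) = 3^27 = 7625597484987.
# 3^^^3 = 3^^(3^^3) is a power tower of 7625597484987 threes: far beyond
# any physical computer, which is exactly what makes it a test case for
# ultrafinitism.
print(hyper(3, 1, 3), hyper(3, 2, 3))
```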

• Alternatively, one could start by asking whether 2^50 is a real number or not; if yes, go up to 2^75, if no, go down to 2^25, and in at most 7 such steps find a real number that, when doubled, ceases to be a real number. There may be impractical or even noncomputable numbers, but surely doubling a real number always yields a real number.
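The proposed interrogation is just a binary search on the exponent. A sketch, where the "ultrafinitist" is a toy stand-in function of my own invention:

```python
def find_boundary(accepts, lo, hi):
    """Binary search for an exponent k such that 2^k is accepted as a
    number but 2^(k+1) is not. Assumes accepts(lo) and not accepts(hi)."""
    queries = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        queries += 1
        if accepts(mid):
            lo = mid      # 2^mid accepted: the boundary is higher
        else:
            hi = mid      # 2^mid rejected: the boundary is lower
    return lo, queries

# A toy interlocutor who accepts 2^k only for k < 60 (purely illustrative).
toy = lambda k: k < 60
k, q = find_boundary(toy, 25, 75)
print(k, q)  # k = 59: 2^59 is accepted, 2^60 is not; q = 5 queries
```

With a starting gap of 50 exponents the search needs at most ceil(log2(50)) = 6 queries, which together with the initial probe at 2^50 matches the "up to 7 steps" in the comment.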

• I think the point of the fable is that Ye­s­enin-Volpin was count­ing to each num­ber in his head be­fore declar­ing whether it was ‘real’ or not, so if you asked him whether 2^50 was ‘real’ he’d just be quiet for a re­ally re­ally long time.

• But wouldn’t that dis­prove ul­tra­fini­tism? All finite num­bers, even 3^^^3, can be counted to (in the ab­sence of any time limit, such as a lifes­pan), there’s just no hu­man who re­ally wants to.

• All finite num­bers, even 3^^^3, can be counted to

As I un­der­stand it, this is pre­cisely the kind of state­ment that ul­tra­fini­tists do not be­lieve.

• If that's true, then Yesenin-Volpin was just playing the role of a Turing machine trying to determine whether a certain program halts (the program that counts from 0 to the input). If 3^^^3 (say) doesn't exist for a finitist, then s/he really has a different concept of number than you have (for example, their numbers are not axiomatized by Peano Arithmetic). It's fun to observe that ultrafinitism is axiomatic: if it's a coherent point of view, it cannot prove that a certain number doesn't exist, only assume it. I also suspect (but finite model theory is not my field at all) that they have an 'inner' model that mimics the standard natural numbers...

• Well, that’s what the anti-ul­tra­fini­tists say. It is pre­cisely the con­tention of the ul­tra­fini­tists that you couldn’t “count to 3^^^3”, what­ever that might mean.

• Hmm.

So, it's not sufficient to define a set of steps that determine a number... it must be possible to execute them? That's a rather pragmatic approach. Albeit one you'd have to keep updating if our power to compute and comprehend lengthier series of steps grows faster than you predict.

• No, ul­tra­fini­tism is not a doc­trine about our prac­ti­cal count­ing ca­pac­i­ties. Ul­trafini­tism holds that you may not have ac­tu­ally de­noted a num­ber by ‘3^^^3’, be­cause there is no such num­ber.

• I would have done the following if I had been asked that: calculate which numbers I would have time to count up to before I was thrown out/got bored/died/earth ended/universe ran out of negentropy. I would probably have to answer "I don't know", or "I think X is a number", for some of them, but it's still an answer; and until recently people could not say whether "the smallest n>2 such that there are integers a, b, c satisfying a^n + b^n = c^n" was a number or not.

I’m not ad­vo­cat­ing any kind of fini­tism, but I agree that the po­si­tion should be taken se­ri­ously.

• There is a lot of pseudo-arguing going on here. I'm a mathematician who specializes in logic and foundations, and I think I can settle most of the controversy with the following summary:

1) Second-order logic is a perfectly acceptable way to formalize mathematics-in-general, stronger than Peano Arithmetic but somewhat weaker than ZFC (comparable in proof-theoretic strength to "Type Theory" or "Mac Lane Set Theory" or "Bounded Zermelo Set Theory", which means sufficient for almost all mainstream mathematical research, though inadequate for results which depend on Fraenkel's Replacement Axiom).

2) First-order logic has better syntactic properties than second-order logic; in particular, it satisfies Gödel's Completeness Theorem, which guarantees that the syntax is adequate for the semantics. Second-order logic has better semantic properties than first-order logic, in particular avoiding nonstandard models of the integers and of set-theoretical universes V_k for various ordinals k.

3) The critics of second-order logic say that they don't understand the concepts of "finite integer" and "all subsets", that the people using second-order logic are making meaningless noises, and that all we really can be sure about is ZFC and its consequences; and since the ZFC axioms actually allow us to prove more theorems about objects we care about than the usual formalizations of second-order logic can (such as, for example, the consistency of full Zermelo set theory), the fans of second-order logic should just quit and use ZFC instead.

4) The fans of second-order logic reply: nothing we actually prove and translate into the language of set theory is false, and our reasoning is sound by your own standards, so stop insulting us by claiming that since YOU can't decide what's true about sets, OUR notion of "property" or "subset" is vague and incoherent. We say that CH is either true or false but we don't know which, because we don't have a deductive calculus we are confident in that settles it; you say that CH is vague and meaningless; but this is a pseudo-problem until either you come up with new set-theoretical axioms which decide CH or we come up with a stronger deductive system which decides it. We can still prove "ZF proves Con(Z)"; we just think that the theorems we can derive in pure second-order logic are epistemologically better supported than theorems you need the full power of ZFC to prove (because they already follow from "logical" rather than "mathematical" axioms), and you shouldn't mind us making this distinction.

5) The best theories of physics we have need real infinities and finitely many iterations of the Powerset axiom, but second-order logic or "Type Theory" is completely adequate for them, and it is hard to conceive that questions about higher infinities that require infinitely many iterations of the Powerset operation can be relevant to physics.

6) It is also hard to conceive that any form of mathematical finitism is adequate for making actual progress in theoretical physics. Even if results that are proved using standard infinitary mathematics can be reformulated as finitary statements about integers and computations, these nominalistic reductions are so cumbersome that the self-imposed avoidance of infinitary ontology cripples physical insight.

• (Since the av­er­age LW reader may not know whether to trust that this com­menter is a math­e­mat­i­cian spe­cial­iz­ing in logic and foun­da­tions, I re­mark that the above sum­mary sounds much like the pa­pers I’ve read on sec­ond-or­der logic. Though ‘pseudo-ar­gu­ing’ is an odd way to de­scribe the an­cient ex­po­si­tional tra­di­tion of di­alogue.)

• it is hard to conceive that questions about higher infinities that require infinitely many iterations of the Powerset operation can be relevant to physics.

If sets with car­di­nal­ities equal to these higher in­fini­ties ex­ist as math­e­mat­i­cal en­tities, then some of them will be log­i­cally de­pen­dent on us, on what we do. For ex­am­ple those sets may con­tain in­tel­li­gent be­ings who are simu­lat­ing our uni­verse and tak­ing differ­ent ac­tions based on what they ob­serve. I don’t know if this is ac­tu­ally the case, or if it is, how much we ought to care, but it would be nice if an FAI could con­sider and try to an­swer ques­tions like this, and not be ar­bi­trar­ily limited in its rea­son­ing by the choice of math­e­mat­i­cal lan­guage we make when build­ing it.

• Best math­e­mat­i­cal com­ment on LessWrong in four years.

I ap­plaud you, sir or madam.

• I con­fess that I lost track of the rea­son­ing about which-or­der-logic-can-do-what-and-why some­where in the last post or so. I’m also not clear how and why it is im­por­tant in un­der­stand­ing this “Highly Ad­vanced Episte­mol­ogy 101 for Begin­ners”. I’m won­der­ing whether (or how) the abil­ity to “prop­erly talk about un­bounded finite times, global con­nect­ed­ness, par­tic­u­lar in­finite car­di­nal­ities, or true spa­tial con­ti­nu­ity” is es­sen­tial or even im­por­tant for (F)AI re­search.

• 5 Jan 2013 6:31 UTC

I con­fess that I lost track of the rea­son­ing about which-or­der-logic-can-do-what-and-why some­where in the last post or so.

Me too.

I’m also not clear how and why it is im­por­tant in un­der­stand­ing this “Highly Ad­vanced Episte­mol­ogy 101 for Begin­ners”.

It’s the buildup to the “open prob­lems in FAI”. Large parts of the in­ter­nals of an AI look like sys­tems for rea­son­ing in rigor­ous ways about math, mod­els, etc.

• It’s the buildup to the “open prob­lems in FAI”. Large parts of the in­ter­nals of an AI look like sys­tems for rea­son­ing in rigor­ous ways about math, mod­els, etc.

If that were the reasoning, it'd be nice if he came out and explained why he believes that to be the case. Because just about any A(G)I researcher would take issue with that statement...

• Maybe we need a handy sum­mary table of the which’s, what’s, and why’s...

• I as­sume we need to ask some of these ques­tions in or­der to de­cide if, or in what sense, an AGI needs sec­ond-or­der logic.

• This and pre­vi­ous ar­ti­cles in this se­ries em­pha­size at­tach­ing mean­ing to se­quences of sym­bols via dis­cus­sion and ges­tur­ing to­ward mod­els. That strat­egy doesn’t seem com­pat­i­ble with your ar­ti­cle re­gard­ing sheep and peb­bles. Isn’t there a way to con­nect se­quences of sym­bols to sheep (and food and similar real-world con­se­quences) di­rectly via a dis­ci­pline of “sym­bol­craft”?

I thought peb­ble­craft was an ex­cel­lent de­scrip­tion of how fini­tists and for­mal­ists think about con­fus­ing con­cepts like un­countabil­ity or ac­tual in­finity: Writ­ing down ”… is un­countably in­finite …” is part of our peb­ble­craft, and we have some con­fi­dence that when we get to the bot­tom line of the for­mal ar­gu­ment, we’ll be able to in­ter­pret the re­sults “in the usual way”. The fact that some­one (per­haps an alien) might watch our peb­ble­craft and con­strue us (in me­dias res) to be refer­ring to (ab­stract, math­e­mat­i­cal) things other than the (ab­stract, math­e­mat­i­cal) things that we usu­ally con­strue our­selves to be refer­ring to doesn’t stop the peb­ble­craft magic from work­ing.

• I think I agree. This gives me the feel­ing that ev­ery time the dis­cus­sion re­volves around mod­els, it is get­ting off the point. We can touch proof sys­tems. We can’t touch mod­els.

We can say that mod­els are part of our peb­ble­craft...

The ques­tion, though, is whether we should trust par­tic­u­lar parts of our peb­ble­craft. Should we pre­fer to work with first-or­der logic, or 2nd-or­der logic? Should we be­lieve in such a thing as a stan­dard model of the nat­u­ral num­bers? Should we trust proofs which make use of those con­cepts?

• Feel­ing our way into a new for­mal sys­tem is part of our (messy, in­for­mal) peb­ble­craft. Some­times peo­ple pro­pose for­mal sys­tems start­ing with their in­tended se­man­tics (roughly, model the­ory). Some­times peo­ple pro­pose for­mal sys­tems start­ing with in­tro­duc­tion and elimi­na­tion rules (roughly, proof the­ory). If the lat­ter, peo­ple some­times look for a se­man­tics to go along with the syn­tax (and vice versa, of course).

For example, lambda calculus started with rules for performing beta reduction. In talking about lambda calculus, people refer to it as "functions from functions to functions". Mathematicians knew that there was no nontrivial set S with a bijection to the set of all functions from S to S, so something else must be going on. Dana Scott invented domain theory partially to solve this problem. Domains have additional structure, such that some domains D do have bijections with the set of all structure-preserving maps from D to D. http://en.wikipedia.org/wiki/Domain_theory Similarly, modal logics started with just proof theory, and then Saul Kripke invented a semantics for them. http://en.wikipedia.org/wiki/Kripke_semantics
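Beta reduction itself is a very small rule; a minimal sketch with terms encoded as nested tuples (the encoding is mine, and variable capture is ignored for simplicity):

```python
def subst(term, name, value):
    """Replace free occurrences of `name` in `term` with `value`.
    (Assumes bound names never clash with free ones: no capture handling.)"""
    if isinstance(term, str):                 # a variable
        return value if term == name else term
    tag = term[0]
    if tag == "lam":                          # ("lam", x, body)
        _, x, body = term
        return term if x == name else ("lam", x, subst(body, name, value))
    _, f, a = term                            # ("app", f, a)
    return ("app", subst(f, name, value), subst(a, name, value))

def beta(term):
    """One beta step at the root: (\\x. body) a  ~>  body[x := a]."""
    tag, f, a = term
    assert tag == "app" and f[0] == "lam"
    _, x, body = f
    return subst(body, x, a)

# (\x. x) y  ~>  y
print(beta(("app", ("lam", "x", "x"), "y")))
```

Everything else in the theory (confluence, normal forms, the "functions from functions to functions" reading) is built on top of this one rewriting rule.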

There's always a syntactic model which you can construct mechanically from the proof theory (at least, that's my understanding of the downward Löwenheim-Skolem argument) - but Scott and Kripke were not satisfied with that syntactic model, and went looking for something else, more insightful and perspicuous. Adding a "semantic" understanding can increase our (informal) confidence in a formal system - even a formal system that we were already working with. I'm not sure why it helps, but I think it's part of our pebblecraft that it does help.

Perhaps adding a "semantic" understanding is like another bridge between informal and formal reasoning. These bridges are only partly formal—they're also partly informal, the concepts and gesturing around the equations and proofs. Having one bridge is sufficient to go onto the Island of Formality, do some stuff, and come off again, but it might be more convenient to have two, or three.

• Per­haps adding a “se­man­tic” un­der­stand­ing is like an­other bridge be­tween in­for­mal and for­mal rea­son­ing. Th­ese bridges are only partly for­mal—they’re also partly in­for­mal, the con­cepts and ges­tur­ing around the equa­tions and proofs.

Nice. This seems like a sur­pris­ingly well-mo­ti­vated way of re­duc­ing ev­ery­thing to physics: there’s just “syn­tac­tic” ma­chin­ery made out of physics, and any se­man­tics that might be at­tributed to parts of this ma­chin­ery is merely a par­tially in­for­mal de­vice (i.e. a col­lec­tion of cog­ni­tive skills) that hu­man math­e­mat­i­ci­ans might use as an aid for rea­son­ing about the ma­chin­ery. Even when the ma­chin­ery it­self might in some sense be said to be se­man­ti­cally rea­son­ing about some­thing or other, this de­scrip­tion of the ma­chin­ery can be traced back to hu­man math­e­mat­i­ci­ans who use it as a par­tially in­for­mal de­vice for un­der­stand­ing the ma­chin­ery, and so it won’t strictly speak­ing be a prop­erty of the ma­chin­ery it­self.

In other words, in this view se­man­tics is an in­for­mal art pri­mar­ily con­cerned with ad­vance­ment of hu­man un­der­stand­ing, and it’s not fun­da­men­tal to the op­er­a­tion of in­tel­li­gence in gen­eral, it’s not needed for prop­erly de­sign­ing things, re­spond­ing to ob­ser­va­tions or mak­ing de­ci­sions, any more than cu­ri­os­ity or vi­sual think­ing.

• Ok, right. I think I didn’t fully ap­pre­ci­ate your point be­fore. So the fact that a par­tic­u­lar many-sorted first-or­der logic with ex­tra ax­ioms is proof-the­o­ret­i­cally equiv­a­lent to (a given proof sys­tem for) 2nd-or­der logic should make me stop ask­ing whether we should build ma­chines with one or the other, and start ask­ing in­stead whether the many-sorted logic with ex­tra ax­ioms is any bet­ter than plain first-or­der logic (to which the an­swer ap­pears to be yes, based on our ad­mit­tedly 2nd-or­der rea­son­ing). Right?

• The prevalence of en­cod­ings means that we might not be able to “build ma­chines with one or the other”. That is, given that there are ba­sic al­ter­na­tives A and B and A can en­code B and B can en­code A, it would take a tech­nol­o­gist spe­cial­iz­ing in hair-split­ting to say whether a ma­chine that pur­port­edly is us­ing A is “re­ally” us­ing A at its low­est level or whether it is “ac­tu­ally” us­ing B and only seems to use A via an en­cod­ing.

If in the im­me­di­ate term you want to work with many-sorted first or­der logic, a rea­son­able first step would be to im­ple­ment sin­gle-sorted first or­der logic, and a pre­pro­ces­sor to trans­late the many-sorted syn­tax into the sin­gle-sorted syn­tax. Then fur­ther de­vel­op­ment, op­ti­miza­tions and so on, might com­pli­cate the story, com­pact­ing the pre­pro­ces­sor and sin­gle-sorted en­g­ine into a sin­gle tech­nol­ogy that you might rea­son­ably say im­ple­ments many-sorted logic “na­tively”.

The prevalence of en­cod­ings means that ap­par­ently differ­ent foun­da­tions don’t always make a differ­ence.
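The preprocessor described above amounts to the standard relativization of sorted quantifiers by sort predicates. A toy sketch, with an invented tuple syntax for formulas:

```python
def unsort(formula):
    """Translate many-sorted quantifiers into single-sorted ones:
      ("all", x, S, phi)  becomes  ("all", x, ("->", (S, x), phi))
      ("ex",  x, S, phi)  becomes  ("ex",  x, ("&",  (S, x), phi))
    i.e. each sort S turns into a unary predicate guarding its quantifier."""
    if not isinstance(formula, tuple):
        return formula
    tag = formula[0]
    if tag in ("all", "ex"):
        _, var, sort, body = formula
        guard = (sort, var)
        conn = "->" if tag == "all" else "&"
        return (tag, var, (conn, guard, unsort(body)))
    return tuple(unsort(part) for part in formula)

# "Every natural number n has a successor m", with sorts made explicit:
phi = ("all", "n", "Nat", ("ex", "m", "Nat", ("Succ", "n", "m")))
print(unsort(phi))
```

The output is an ordinary single-sorted formula; a single-sorted engine plus this translation step behaves like a many-sorted one, which is the "encoding" point of the comment.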

• In this case, that’s not true. The many-sorted logic, with ax­ioms and all to em­u­late 2nd-or­der logic, has differ­ent prop­er­ties than plain 1st-or­der logic (even though we may be em­u­lat­ing it in a plain 1st-or­der en­g­ine).

For ex­am­ple, in 2nd-or­der logic, we can quan­tify over any names we use. In 1st-or­der logic, this is not true: we can quan­tify over 1st-or­der en­tities, but we are un­able to quan­tify over 2nd-or­der en­tities. So, we can have named en­tities (pred­i­cates and re­la­tions) which we are un­able to quan­tify over.

One con­se­quence of this is that, in 1st-or­der logic, we can never prove some­thing non-tau­tolog­i­cal about a pred­i­cate or re­la­tion with­out first mak­ing some as­sump­tions about that spe­cific pred­i­cate or re­la­tion. State­ments which share no pred­i­cates or re­la­tions are log­i­cally in­de­pen­dent!

This limits the in­fluence of con­cepts on one an­other, to some ex­tent.

This sort of thing does not hold for 2nd-order logic, so it doesn't hold for the translation of 2nd-order logic into 1st-order logic, either. (Basically, all statements in this many-sorted logic will use the membership predicate, which causes us to lose the guarantee of logical independence for other named items.)

So we have to be care­ful: an en­cod­ing of one thing into an­other thing doesn’t give us ev­ery­thing.

• Sugges­tion: if your goal is AI, maybe aban­don logic and set the­ory in fa­vor of type the­ory. It seems bet­ter suited to com­put­ers any­way thanks to com­pu­ta­tional trini­tar­i­anism. The prob­lem of pin­ning down mod­els uniquely seems to be solved by some­thing called the uni­valence ax­iom, which I cur­rently don’t un­der­stand at all.

• The univalence axiom has nothing to do with pinning down models (except perhaps that if you are talking about models internally, it says that you're pinning them down up to isomorphism rather than up to some notion of syntactic equality). Non-standard models are essential for getting an intuition about what is and isn't provable; most of the arguments that I've seen for unexpectedly interesting terms involve picking a non-standard model, finding a point in that model with unexpected properties (e.g., something that is false in the standard model), and then constructing that point. See, for example, this post, which constructs a nonstandard point on the circle, one for which you can't exhibit a path from it to the basepoint of the circle. More generally, see The Elements of an Inductive Type for a good explanation of why non-standard models are inescapable.

• I guess that un­der­stand­ing uni­valence ax­iom would be helped by un­der­stand­ing the im­plicit equal­ity ax­ioms.

Uni­valence ax­iom states that two iso­mor­phic types are equal; this means that if type A is iso­mor­phic to type B, and type C con­tains A, C has to con­tain B (and a few similar re­quire­ments).

Re­quiring that two types are equal if they are iso­mor­phic means pro­hibit­ing any­thing that we can write to dis­t­in­guish them (i.e. not to han­dle them equiv­a­lently).

• Just so you know… I want to up­vote you for link­ing to that awe­some type the­ory ar­ti­cle (al­though Jere­myHahn posted it to this dis­cus­sion already), but I also want to vote you down for com­pu­ta­tional trini­tar­i­anism (propo­si­tions are not types! arggggg!). ;)

• I’m gen­er­ally not on board with the whole “pro­grams are proofs” dogma. When I look into the de­tails, all it says is that a suc­cess­ful type-check of a pro­gram gives a proof of the ex­is­tence of a pro­gram of that type. In ex­pres­sively pow­er­ful lan­guages, this seems very un­in­ter­est­ing; so far as I know, you can eas­ily con­struct a pro­gram of any type by just dis­card­ing the in­put and then cre­at­ing an out­put of the de­sired type. (Han­dling the empty type is a lit­tle tricky, but I won­der if in­finite loops or pro­gram failures would do.) So my un­der­stand­ing is that you only get the in­ter­est­ing type struc­ture (where it isn’t triv­ial to prove the ex­is­tence of a type) by re­strict­ing the lan­guage. (Typed lambda calcu­lus is not Tur­ing-com­plete.)

So, while I see the Curry-Howard stuff as beautiful math, I don't think it justifies the 'programs as proofs' mantra that many have taken up! I don't see the appeal.

• I think Harper (who coined the phrase "computational trinitarianism") doesn't claim that "programs are proofs", but rather that proofs are programs. That is, proofs should be written down in typed calculi and (to be intuitionistically acceptable) should have computational content. But programming languages and type systems can be used for more things than just proving theorems.

• you can eas­ily con­struct a pro­gram of any type by just dis­card­ing the in­put and then cre­at­ing an out­put of the de­sired type.

Ac­tu­ally, by dis­card­ing the in­put and then fal­ling into an in­finite loop (you can­not con­struct the out­put of the de­sired type with­out the in­put).

• I don’t know what you mean. Sup­pose you want a func­tion A → Int, where A is some type. \x:A. 6, the func­tion which takes x (of type A) and out­puts 6, seems to do fine. To put it in c-like form, int six(x) {re­turn 6}.

If pro­grams are proofs, then gen­eral pro­gram­ming lan­guages cor­re­spond to triv­ial­ism.

Am I miss­ing some­thing?

• The type con­struc­tors that you’re think­ing of are Ar­row and Int. Fo­rall is an­other type con­struc­tor, for con­struct­ing generic poly­mor­phic types. Some types such as “Fo­rall A, Fo­rall B, A → B” are un­in­hab­ited. You can­not pro­duce an out­put of type B in a generic way, even if you are given ac­cess to an el­e­ment of type A.

The type cor­re­spond­ing to a propo­si­tion like “all com­putable func­tions from the re­als to the re­als are con­tin­u­ous” looks like a func­tion type con­sum­ing some rep­re­sen­ta­tion of “a com­putable func­tion” and pro­duc­ing some rep­re­sen­ta­tion of “that func­tion is con­tin­u­ous”. To rep­re­sent that, you prob­a­bly need de­pen­dent types—this would be a type con­struc­tor that takes an el­e­ment, not a type, as a pa­ram­e­ter. Be­cause not all func­tions are con­tin­u­ous, the codomain rep­re­sent­ing “that func­tion is con­tin­u­ous” isn’t in gen­eral in­hab­ited. So build­ing an el­e­ment of that codomain is not nec­es­sar­ily triv­ial—and the pro­cess of do­ing so amounts to prov­ing the origi­nal propo­si­tion.

What I’m try­ing to say is that the types that type the­o­rists use look a lot more like propo­si­tions than the types that mun­dane pro­gram­mers use.

• You can­not pro­duce an out­put of type B in a generic way, even if you are given ac­cess to an el­e­ment of type A.

...ex­cept in Haskell, where we can make func­tions for gen­er­at­ing in­stances given a de­sired type. (Such as Magic Haskel­ler.) I still feel like it’s ac­cu­rate to say that gen­eral-pur­pose lan­guages cor­re­spond to triv­ial­ism.

Thanks, though! I didn’t think of the generic ver­sion. So, in gen­eral, in­hab­ited types all act like “true” in the iso­mor­phism. (We have A->B wher­ever B is a spe­cific in­hab­ited type.)

What I’m try­ing to say is that the types that type the­o­rists use look a lot more like propo­si­tions than the types that mun­dane pro­gram­mers use.

I’m fully aware of that, and I en­joy the cor­re­spon­dence. What both­ers me is that peo­ple seem to go from a spe­cific iso­mor­phism (which has been ex­tended greatly, granted) to a some­what ill-defined gen­eral prin­ci­ple.

My frus­tra­tion comes largely from hear­ing mis­takes. One ex­treme ex­am­ple is think­ing that (un­typed!) lambda calcu­lus is the uni­ver­sal logic (cap­tur­ing first-or­der logic and any higher-or­der logic we might in­vent; af­ter all, it’s Tur­ing com­plete). Lesser cases are usu­ally just ill-defined in­vo­ca­tions (ie, guess­ing that the iso­mor­phism might be rele­vant in cases where it’s not ob­vi­ous how it would be).

• Magic Haskel­ler and Au­gusts­son’s Djinn are provers (or to say it an­other way, com­pre­hen­si­ble as provers, or to say it an­other way, iso­mor­phic to provers). They at­tempt to prove the propo­si­tion, and if they suc­ceed they out­put the term cor­re­spond­ing (via the Curry-Howard Iso­mor­phism) to the proof.

I be­lieve they can­not out­put a term t :: a->b be­cause there is no such term, be­cause ‘any­thing im­plies any­thing else’ is false.

• The type on the right-hand side is usu­ally some­thing much more com­pli­cated and “cre­at­ing an out­put of the de­sired type” is not triv­ial.

• What is “triv­ial­ism”?

• If pro­grams are proofs, then gen­eral pro­gram­ming lan­guages cor­re­spond to triv­ial­ism.

It is not that difficult to write gen­eral-pur­pose func­tions in to­tal lan­guages like Coq. You just provide the func­tions with “fuel” to “burn” dur­ing com­pu­ta­tion. You don’t have to ex­press all fea­tures of in­ter­est in the type. But then you can­not rely on au­toma­tion.

The "programs are proofs" slogan is interesting from yet another point of view. What are the theorems for which a complex program is the simplest proof? There can be multiple such "theorems" (the "principal types" property doesn't hold in complex settings).
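The "fuel" idiom mentioned above can be sketched outside Coq too: thread an explicit step budget through the recursion, so the function is total by construction. A Python sketch (Collatz is just a convenient example of a function whose termination is an open problem):

```python
def collatz_steps(n, fuel):
    """Count Collatz steps from n down to 1, giving up when fuel runs out.
    The recursion is on `fuel`, which strictly decreases, so the function
    is total even though termination of the Collatz iteration itself is
    not known. In Coq, this decreasing argument is what satisfies the
    termination checker."""
    if n == 1:
        return 0
    if fuel == 0:
        return None  # out of fuel: "no answer", rather than nontermination
    nxt = n // 2 if n % 2 == 0 else 3 * n + 1
    more = collatz_steps(nxt, fuel - 1)
    return None if more is None else 1 + more

print(collatz_steps(6, 100))   # 6 -> 3 -> 10 -> 5 -> 16 -> 8 -> 4 -> 2 -> 1
```

The price, as the comment says, is that the interesting property ("this always terminates") is no longer expressed in the type; you only learn whether the budget sufficed.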

• Well yes, ei­ther di­rect in­put or wider con­text (i.e. en­vi­ron­ment, library). You can only con­struct val­ues for tau­tol­ogy types with ex­pres­sions not refer­ring to con­text.

• When I look into the de­tails, all it says is that a suc­cess­ful type-check of a pro­gram gives a proof of the ex­is­tence of a pro­gram of that type.

Of course this isn't quite true. Other interesting things which are said include the idea that a proof of a conjunction can be handled as a pair of proofs (one for each element of the conjunction), and a proof of a disjunction can be a proof of either side, and so on; so in a sufficiently rich type system, we can leverage the type machinery to do logic, using product types as conjunctions, sum types as disjunctions, and so on. A proposition is then modeled as a type for a proof. An implication is a function type, proven by a procedure which transforms a proof of one thing into a proof of another.

Still, I can’t swal­low “pro­grams are proofs /​ propo­si­tions are types”.
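The pair/sum/function reading in the previous paragraph can be acted out at the level of data even in an untyped language; a sketch (Python here only illustrates the shapes, it proves nothing by type-checking):

```python
# Curry-Howard, data level: a "proof" of a proposition is a value of the
# corresponding type.
#   A and B  ~  pair (proof_of_A, proof_of_B)
#   A or B   ~  tagged union ("left", proof_of_A) / ("right", proof_of_B)
#   A -> B   ~  function turning a proof of A into a proof of B

def conj(pa, pb):                 # introduction rule for "and"
    return (pa, pb)

def fst(pab):                     # elimination: from A and B, get A
    return pab[0]

def case(p, on_left, on_right):   # elimination rule for "or"
    tag, body = p
    return on_left(body) if tag == "left" else on_right(body)

# A proof of (A and B) -> (B and A): the function that swaps the pair.
def and_comm(pab):
    return (pab[1], pab[0])

# A proof of (A or B) -> (B or A): the function that swaps the tags.
def or_comm(p):
    return case(p, lambda b: ("right", b), lambda b: ("left", b))

print(and_comm(("proof of A", "proof of B")))
print(or_comm(("left", "proof of A")))
```

In a typed calculus the type checker would verify that these functions really have the commutativity types; here the correspondence is only mimed, which is arguably the commenter's complaint in miniature.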

• “ZF shouldn’t be able to prove that some set is a model of ZF, because that’s not true in all models. Many models of ZF don’t contain any individual set well-populated enough for that one set to be a model of ZF all by itself.”

Is this true be­cause of the ab­sent C, true in the sense of the (larger) model, or just false?

• ZF can’t prove that mod­els of ZF ex­ist be­cause prov­ing mod­els of ZF ex­ist is equiv­a­lent to prov­ing that ZF is con­sis­tent, and ZF can’t prove its own con­sis­tency (if it is in fact con­sis­tent) by the in­com­plete­ness the­o­rem. I don’t think ZFC can prove the con­sis­tency of ZF ei­ther, but I’m not a set the­o­rist.

• You can sub­sti­tute ZFC for ZF through­out the quoted para­graph and it will still be true. I don’t know what you want me to get from the link given; it doesn’t seem to con­tra­dict the quoted para­graph. The part where the link talks about Vk en­tailing ZFC for in­ac­cessible k is ex­actly what the quoted para­graph is say­ing.

• Zah? The linked an­swer says,

Every model of ZFC has an el­e­ment that is a model of ZFC.

It goes on to say that this need not hold when we in­ter­pret “model of ZFC” in the sense of the (outer) model.

• So af­ter read­ing that, I don’t see how it could be true even in the sense de­scribed in the ar­ti­cle with­out vi­o­lat­ing Well Foun­da­tion some­how, but what it liter­ally says at the link is that ev­ery model of ZFC has an el­e­ment which en­codes a model of ZFC, not is a model of ZFC, which I sup­pose must make a differ­ence some­how—in par­tic­u­lar it must mean that we don’t get A has an el­e­ment B has an el­e­ment C has an el­e­ment D … al­though I don’t see yet why you couldn’t con­struct that set us­ing the model’s model’s model and so on. I am con­fused about this al­though the poster of the link cer­tainly seems like a le­gi­t­i­mate au­thor­ity.

But yes, it’s pos­si­ble that the origi­nal para­graph is just false, and ev­ery model of ZFC con­tains a quoted model of ZFC. Maybe the pair-en­cod­ing of quoted mod­els en­ables there to be an in­finite de­scend­ing se­quence of sub­mod­els with­out there be­ing an in­finite de­scend­ing se­quence of ranks, the way that the even num­bers can en­code the num­bers which con­tain the even num­bers and so on in­definitely, and the rea­son why ZFC doesn’t prove ZFC has a model is that some mod­els have non­stan­dard ax­ioms which the set mod­el­ing stan­dard-ZFC doesn’t en­tail. Any­one else want to weigh in on this be­fore I edit? (PS up­vote par­ent and great-grand­par­ent.)

• I don’t see how it could be true even in the sense de­scribed in the ar­ti­cle with­out vi­o­lat­ing Well Foun­da­tion somehow

Here’s why I think you don’t get a vi­o­la­tion of the ax­iom of well-foun­da­tion from Joel’s an­swer, start­ing from way-back-when-things-made-sense. If you want to skim and in­tuit the con­text, just read the bold parts.

1) Hu­mans are born and see rocks and other ob­jects. In their minds, a lan­guage forms for talk­ing about ob­jects, ex­is­tence, and truth. When they say “rocks” in their head, sen­sory neu­rons as­so­ci­ated with the pres­ence of rocks fire. When they say “rocks ex­ist”, sen­sory neu­rons as­so­ci­ated with “true” fire.

2) Even­tu­ally the hu­mans get re­ally ex­cited and in­vent a sys­tem of rules for mak­ing cave draw­ings like “∃” and “x” and “∈” which they call ZFC, which as­serts the ex­is­tence of in­finite sets. In par­tic­u­lar, many of the hu­mans in­ter­pret the cave draw­ing “∃” to mean “there ex­ists”. That is, many of the same neu­rons fire when they read “∃” as when they say “ex­ists” to them­selves. Some of the hu­mans are care­ful not to nec­es­sar­ily be­lieve the ZFC cave draw­ing, and imag­ine a guy named ZFC who is say­ing those things… “ZFC says there ex­ists...”.

3) Some humans find ways to write a string of ZFC cave drawings which, when interpreted (when allowed to make human neurons fire) in the usual way, means to the humans that ZFC is consistent. Instead of writing out that string, I’ll just write <Con(ZFC)> in place of it.

4) Some humans apply the ZFC rules to turn the ZFC axiom-cave-drawings and the <Con(ZFC)> cave drawing into a cave drawing that looks like this:

“∃ a set X and a re­la­tion e such that <(X,e) is a model of ZFC>”

where <(X,e) is a model of ZFC> is a string of ZFC cave drawings that means to the humans that (X,e) is a model of ZFC. That is, for each axiom A of ZFC, they produce another ZFC cave drawing A’ where “∃y” is always replaced by “∃y∈X”, and “∈” is always replaced by “e”, and then derive that cave drawing from the ZFC axiom-cave-drawings and <Con(ZFC)> according to the ZFC rules.

Some cau­tious hu­mans try not to be­lieve that X re­ally ex­ists… only that ZFC and the con­sis­tency of ZFC im­ply that X ex­ists. In fact if X did ex­ist and ZFC meant what it usu­ally does, then X would be in­finite.

5) The humans derive another cave drawing from ZFC+<Con(ZFC)>:

“∃Y∈X and f∈X such that <(Y,f) is a model of ZFC>”,

6) The hu­mans de­rive yet an­other cave draw­ing,

“∃ZeY and geX such that <(Z,g) is a model of ZFC>”.

Some of the humans, like me, think for a moment that Z∈Y∈X, and that if ZFC can prove this pattern continues then ZFC will assert the existence of an infinite regress of sets violating the axiom of well-foundation… but actually, we only have “Z e Y ∈ X”… ZFC only says that Z is related to Y by the extra-artificial e-relation that ZFC said existed on X.

I think that’s why you don’t get a con­tra­dic­tion of well-foun­da­tion.
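The quantifier-relativization in step 4 can be sketched as a mechanical transformation on formulas. This is a toy illustration only; the tuple encoding of formulas and the function names are invented for the example, not any standard library:

```python
# Toy sketch of relativization: every "exists y. phi" becomes
# "exists y in X. phi", and the real membership symbol "in" is replaced
# by the model's artificial relation "e".  Formulas are nested tuples,
# e.g. ("exists", "y", ("in", "y", "z")).

def relativize(formula, domain="X", relation="e"):
    op = formula[0]
    if op == "exists":
        _, var, body = formula
        # bound the quantifier to the set X: exists var in X such that ...
        return ("exists_in", var, domain, relativize(body, domain, relation))
    if op == "in":
        _, a, b = formula
        # swap real membership for the model's internal relation e
        return (relation, a, b)
    # connectives like ("and", p, q) or ("not", p): recurse into subformulas
    return (op,) + tuple(relativize(sub, domain, relation) for sub in formula[1:])

phi = ("exists", "y", ("and", ("in", "y", "z"), ("not", ("in", "y", "y"))))
print(relativize(phi))
# ('exists_in', 'y', 'X', ('and', ('e', 'y', 'z'), ('not', ('e', 'y', 'y'))))
```

Applying this to each axiom of ZFC produces the strings called A’ above: statements about the pair (X,e) rather than about the real universe of sets.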

• Well Foundation in the V_alpha case seems quite simple: you build an externally-countable chain of subsets which simply cannot be represented as a set inside the first model of ZFC. So the external WF is not broken because the element-relation inside the models is different, and the inner WF is fine because the chain of inner models of external ZFC is not an inner set.

In the stan­dard case your even-num­bers ex­pla­na­tion nicely shows what goes on — quot­ing is in­volved.

I need to think a bit to say what would halt our attempts to build a chain of transitive countable models...

• Well, tech­ni­cally not ev­ery model of ZFC has a ZFC-mod­el­ling el­e­ment. There is a model of “ZFC+¬Con(ZFC)”, and no el­e­ment of this mon­ster can be a model of ZFC. Not even with non­stan­dard el­e­ment-re­la­tion.

• Linked im­pres­sive au­thor­ity says the model has a ZFC-model-en­cod­ing el­e­ment, plus enough non­stan­dard quoted ZFC ax­ioms in-model that in-model ZFC doesn’t think it’s a ZFC-model-en­cod­ing el­e­ment. I.e., the for­mula for “se­man­ti­cally en­tails ZFC” is false within the model, but from out­side, us­ing our own stan­dard list of ax­ioms, we think the el­e­ment is a model of ZFC.

• Ah, sorry, I didn’t no­tice that the ques­tion is about model of ZFC in­side a “uni­verse” mod­el­ling ZFC+Con^∞(ZFC)

• The linked an­swer is talk­ing about mod­els of ZFC in an ex­ter­nal sense and Eliezer is talk­ing about them in an in­ter­nal sense.

• I’ll make this more ex­plicit, and note that at pre­sent I don’t see it af­fect­ing any­one’s ar­gu­ment:

ZF can’t prove its own self-consistency and thus can’t prove it has a model. Your logician says that’s as it should be, since not every model of ZF contains another model—which seems true provided we treat each model more as an interpretation than as an object in a larger universe—and then says that a model has to be “well-populated”—which AFAICT holds for standard models, but not for all other interpretations, like those your logician used in the previous sentence.

• I’m still wor­ried about the word “model”. You talk about mod­els of sec­ond-or­der logic, but what is a model of sec­ond-or­der logic? Clas­si­cally speak­ing, it’s a set, and you do talk about ZF prov­ing the ex­is­tence of mod­els of SOL. But if we need to use set the­ory to rea­son about the se­man­tic prop­er­ties of SOL, then are we not then work­ing within a first-or­der set the­ory? And hence we’re vuln­er­a­ble to un­ex­pected “mod­els” of that set the­ory af­fect­ing the the­o­rems we prove about SOL within it.

It seems like you’re treat­ing “model” as if it were a fun­da­men­tal con­cept, when in fact the way it’s used in math­e­mat­ics is nor­mally em­bed­ded within some set the­ory. But this then means you can’t ro­bustly talk about “mod­els” all the way down: at some point your no­tion of model bot­toms out. I don’t think I have a solu­tion to this, but it feels like it’s a prob­lem worth ad­dress­ing.

• My re­sponse to this situ­a­tion is to say that proof the­ory is more fun­da­men­tal and in­ter­est­ing than model the­ory, and prag­matic ques­tions (which the di­a­log at­tempted to ask) are more im­por­tant than model-the­o­retic ques­tions. How­ever, to some ex­tent, the prob­lem is to re­duce model-the­o­retic talk to more prag­matic talk. So it isn’t sur­pris­ing to see model-the­o­retic talk in the post (al­though I did feel that the dis­cus­sion was wan­der­ing from the point when it got too much into mod­els).

• 5 Jan 2013 19:22 UTC
5 points

What I got out of this post is that sec­ond-or­der logic is a lie.

“The math­e­mat­i­ci­ans who deny sec­ond-or­der logic would see that rea­son­ing as in­vok­ing an im­plicit back­ground uni­verse of set the­ory. Every­thing you’re say­ing makes sense rel­a­tive to some par­tic­u­lar model of set the­ory, which would con­tain pos­si­ble mod­els of Peano ar­ith­metic as sets, and could look over those sets and pick out the small­est in that model. Similarly, that set the­ory could look over a pro­posed model for a many-sorted logic, and say whether there were any sub­sets within the larger uni­verse which were miss­ing from the many-sorted model. Ba­si­cally, your brain is in­sist­ing that it lives in­side some par­tic­u­lar model of set the­ory. And then, from that stand­point, you could look over some other set the­ory and see how it was miss­ing sub­sets that your the­ory had.”

Se­cond-or­der logic is how first-or­der set the­ory feels from the in­side.

And now I understand (and agree with) Johnicholas’ comment from the last post. ZFC is a hack, but it is a very good hack which fits with our intuitions. However, after reading this post, I think that to use second-order logic is to deceive yourself into thinking that ZFC is the universally, unequivocally best definition of a set. I’m not anywhere close to 100% confident (though I’m still well over 50%) that ZFC is even consistent!

Good ques­tion. One ob­vi­ous an­swer is that the AI would be able to in­duct what you would call an in­finite ax­iom schema, as a sin­gle ax­iom—a sim­ple, finite hy­poth­e­sis.

That can eas­ily be worked around. One an­swer (the one I’m in­ti­mately fa­mil­iar with, not nec­es­sar­ily the best one) is to use first-or­der met­a­logic, which proves the­o­rem schema rather than the­o­rems (note that most the­o­rem schema end up iso­mor­phic to the­o­rems, as well). This is the ap­proach Me­ta­math takes, and they have cre­ated a met­a­log­i­cally com­plete ax­iom­a­ti­za­tion of first-or­der met­a­logic with equal­ity, which ends up al­low­ing ZFC to be finitely ax­io­m­a­tized.

• ZFC is the uni­ver­sally, un­equiv­o­cally best defi­ni­tion of a set

Worse. You are being tricked into believing that ZFC is a definition of a set at all, while it is just a set of restrictions on what we would tolerate.

In some sense, if you believe that there is only one second-order model of the natural numbers, you have to decide which properties of natural numbers you can range over; as Cohen has taught us, this involves making a lot of set-theoretic decisions, with the continuum hypothesis being only one of them.

• What I gather is that the real ques­tion here isn’t so much “First or sec­ond or­der logic?”, but “Is it sen­si­ble to try to con­vey/​in­still the idea of ‘largest pow­er­set’ to/​in an AI, and if so how?” “Largest pow­er­set” seems both mean­ingful and sim­ple (for the pur­pose of ap­ply­ing Oc­cam’s ra­zor), but we don’t know how to cap­ture its mean­ing us­ing syn­tac­tic proof rules (or rather, we know that we can’t). Maybe we can cap­ture the mean­ing some other way, for ex­am­ple in the form of a “math in­tu­ition mod­ule” that can ar­rive at “prob­a­bil­is­tic” con­clu­sions about math­e­mat­i­cal struc­tures that are defined in part by “largest pow­er­set”?

• We know we can’t do that ei­ther, al­though we can do slightly bet­ter than we can with mono­tonic logic (see my re­sponse).

• I guess the re­sult you’re refer­ring to is that there is no TM whose out­put con­verges to 1 if its in­put is a true state­ment (ac­cord­ing to stan­dard se­man­tics) in sec­ond or­der PA and 0 if it’s false. But don’t we need stronger re­sults to con­clude “we can’t do that ei­ther”? For ex­am­ple, take a hu­man math­e­mat­i­cian and con­sider his or her be­liefs about “the stan­dard num­bers” as time goes to in­finity (given un­limited com­put­ing power). Can you show that those be­liefs can’t come closer to de­scribing the stan­dard num­bers than some class of other math­e­mat­i­cal struc­tures?

• Yes, you’re right. I was be­ing too se­vere. We can’t “cap­ture the mean­ing” prob­a­bil­is­ti­cally any bet­ter than we can with clas­si­cal truth val­ues if we as­sume that “cap­ture the mean­ing” is best un­packed as “com­pute the cor­rect value” or “con­verge to the cor­rect value”. (We can con­verge to the right value in more cases than we can sim­ply ar­rive at the right value, of course. Mov­ing from crisp logic to prob­a­bil­ities doesn’t ac­tu­ally play a role in this im­prove­ment, al­though it seems bet­ter for other rea­sons to define the con­ver­gence with shift­ing prob­a­bil­ities rather than sud­denly-switch­ing non­mono­tonic logic.)

The main ques­tion is: what other no­tion of “cap­ture the mean­ing” is rele­vant here? What other prop­er­ties are im­por­tant, once ac­cu­racy is ac­counted for as best is pos­si­ble? Or should we just set­tle for as much ac­cu­racy as we can get, and then for­get about the rest of what “cap­ture the mean­ing” seems to en­tail?
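One way to picture “converge to the correct value” in the exchange above: for a Σ1 question such as halting, a guesser that outputs 0 until the event is observed and 1 ever after converges to the truth, but never announces that it has converged. The “machine” below is a hypothetical stand-in (a countdown parameter, not a real simulator), invented purely for the illustration:

```python
# Sketch of limit convergence: the stage-t guess may be wrong for a while,
# but it changes its mind at most once and stabilizes on the right answer.

def guess_halts(program, steps):
    """Stage-`steps` guess about "does `program` ever halt?".
    `program` is a toy countdown: it "halts" exactly when its start value
    is reached within the step budget."""
    return 1 if program <= steps else 0

# The guesses stabilize on the correct answer as steps grows; there is
# just no stage at which the guesser can certify it is done switching.
assert [guess_halts(3, t) for t in range(6)] == [0, 0, 0, 1, 1, 1]
```

This is the “suddenly-switching” nonmonotonic version; the shifting-probabilities version replaces the one jump from 0 to 1 with a probability drifting toward the correct value.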

• Screw set the­ory. I live in the phys­i­cal uni­verse where when you run a Tur­ing ma­chine, and keep watch­ing for­ever in the phys­i­cal uni­verse, you never ex­pe­rience a time where that Tur­ing ma­chine out­puts a proof of the in­con­sis­tency of Peano Arith­metic. Fur­ther­more, I live in a uni­verse where space is ac­tu­ally com­posed of a real field and space is ac­tu­ally in­finitely di­visi­ble and con­tains all the points be­tween A and B, rather than space only con­tain­ing a de­nu­mer­able num­ber of points whose ex­is­tence can be proven from the first-or­der ax­iom­a­ti­za­tion of the real num­bers. So to talk about physics—for­get about math­e­mat­ics—I’ve got to use sec­ond-or­der logic.

Supposing that physics is computable, this isn’t necessarily true. For example, if physics is computable in type theory (or any language with non-standard models), then the non-standard-ness propagates not just to your physics, but to the individuals living in that physics who are trying to model it. Consider the non-standard model of the calculus of inductive constructions (approximately, Martin-Löf type theory) in which types are pointed sets, and functions are functions between sets which preserve the distinguished point. Then any representation of a person in space-time you can construct in this model (of the computation running physics) will have a distinguished point. Since any function to or from such a person-located-in-space-time must preserve this point, I’m pretty sure no information can propagate from the person at the non-standard time to the person at any standard time. So even if you expect to observe the Turing machine halting whenever you find yourself in a non-standard model, you should also expect not to be able to tell that it’s non-standard except when you’re actively noticing its non-standardness. Furthermore, since functions must preserve the point, you should expect (provably) that for all times T, the Turing machine should not be halted at time T (because the function mapping times T to statements must map the non-standard point of T to the non-standard (and trivial) statement).

For a more elo­quent ex­pla­na­tion of how non-stan­dard mod­els are un­avoid­able, and of­ten use­ful, see The Ele­ments of an In­duc­tive Type.

• Eliezer—why is it that you seem so adamant that our physical reality is better modeled by the standard natural numbers rather than some supernatural model? If we try to see whether a physical implementation of a Turing machine provides an inconsistency in PA, then of course it won’t, because it will run for fewer than Ackermann(20) iterations, and all number theories say that it won’t halt by then. If we instead lived in something that looked like the game of Life and we did expect it to last infinitely long, why exactly would we expect time steps to be indexed by some particular model of the natural numbers when we haven’t yet seen more than finitely many of them (and those finitely many are predicted in all models)? Claiming that we don’t expect the Turing machine to ever halt because it hasn’t yet seems a little like saying that all times I’ve seen come before 2014, and therefore I expect that all times ever will have this property.

• Sim­plic­ity is an im­por­tant con­sid­er­a­tion. There are many more ways for com­plex uni­verses to have vi­o­lated our ex­pec­ta­tions already, while still mak­ing sapi­ence pos­si­ble, than for sim­ple uni­verses to have done so. That is, the space of sim­ple uni­verses is much more con­strained by an­thropic prin­ci­ples than is the space of com­pli­cated ones; so it is sur­pris­ing that we seem to have found our­selves in a uni­verse largely de­scrib­able with very sim­ple and uniform logic and math­e­mat­ics, since there are many more ways to be com­pli­cated than to be sim­ple. And this sur­pris­ing fact may sug­gest that ex­otic log­i­cal, math­e­mat­i­cal, or meta­phys­i­cal struc­tures do not ex­ist, or are in some other re­spect not can­di­dates for our an­thropic rea­son­ing. In other words, we can draw a ten­ta­tive in­duc­tive con­clu­sion re­gard­ing whether space­time is non­stan­dard over­all by con­sid­er­ing all the ways in which our the­o­ries to date have been sur­pris­ingly suc­cess­ful and gen­eral while ap­peal­ing al­most ex­clu­sively to the sim­plest (which of­ten cor­re­sponds to ‘most stan­dard,’ with a few im­por­tant caveats) con­ceiv­able sys­tems. This won’t be con­clu­sive, but it’s at least sug­ges­tive.

• I guess here’s my point. We would not ex­pect to have seen any em­piri­cal differ­ences be­tween stan­dard and non-stan­dard mod­els of space­time yet. There would not be any ob­serv­able differ­ences un­til we reach su­per­nat­u­ral times. I sup­pose you might dis­count the non­stan­dard mod­els based on their not be­ing com­putable or some­thing, but I’m not con­vinced that this is the right thing to do. I also agree that the stan­dard model should be given more weight (due to its con­cep­tual sim­plic­ity to pin down) than any par­tic­u­lar non­stan­dard model, but I don’t see why it should be given more weight than all of them put to­gether. The even more con­cise (or at least less po­ten­tially prob­le­matic) the­ory would be to say that we are liv­ing in some model of these ax­ioms and not bother to spec­ify which.

• I also agree that the stan­dard model should be given more weight (due to its con­cep­tual sim­plic­ity to pin down) than any par­tic­u­lar non­stan­dard model, but I don’t see why it should be given more weight than all of them put to­gether.

I’m not sure that it does de­serve more weight than all of them put to­gether. That largely rests on how com­pli­cated our need­lessly com­pli­cated mod­els have to be in or­der to re­cover the em­piri­cal world. If there are a lot of rel­a­tively sim­ple ways to perfectly re­pro­duce the em­piri­cal pre­dic­tions of our stan­dard mod­els to date, then that will make a big­ger differ­ence than if there are an even greater num­ber of fan­tas­ti­cally ger­ry­man­dered, con­voluted sys­tems.

Even though there are more com­pli­cated uni­verses that perfectly gen­er­ate our ob­ser­va­tions to date than there are sim­ple uni­verses that do so, given proper sub­sets of our uni­verse, as­sum­ing a struc­turally sim­ple uni­verse tends to yield more ac­cu­rate pre­dic­tions than does as­sum­ing a struc­turally com­plex uni­verse.

Although in ab­solute terms there are so many more ways to be com­pli­cated than to be sim­ple that given al­most any data, most mod­els pre­dict­ing that data are ex­ceed­ingly com­pli­cated and ger­ry­man­dered...

… and al­though it is not clear whether sen­tience ex­ists in a larger pro­por­tion of com­pli­cated wor­lds than of sim­ple ones (or vice versa)...

… nev­er­the­less it seems clear that a larger pro­por­tion of the sim­ple than of com­plex wor­lds that pre­dict some of our ex­pe­rience pre­dict all of it. That is, there are pro­por­tion­ally many more ways to com­pli­cat­edly pre­dict most of our ob­ser­va­tions and then spec­tac­u­larly fail to pre­dict the last few, than there are ways to com­pli­cat­edly pre­dict most of our ob­ser­va­tions and then pre­dict the last few as well.

This, how­ever, is still speak­ing very gen­er­ally; the de­gree of com­plex­ity will make a huge differ­ence here, and not all non-stan­dard mod­els are rad­i­cally more com­pli­cated.

• Even though there are more com­pli­cated uni­verses that perfectly gen­er­ate our ob­ser­va­tions to date than there are sim­ple uni­verses that do so, given proper sub­sets of our uni­verse, as­sum­ing a struc­turally sim­ple uni­verse tends to yield more ac­cu­rate pre­dic­tions than does as­sum­ing a struc­turally com­plex uni­verse.

But when we are talk­ing about which model of set the­ory to use to model our ex­pe­riences, all of the non-stan­dard mod­els give ex­actly the same pre­dic­tions to date (and will con­tinue to do so un­til we reach non-stan­dard times). And not all of these mod­els are hor­ribly com­pli­cated. There’s the model of con­structable set the­ory, where you only have the sets ab­solutely guaran­teed you by the ax­ioms. This gives you a countable model of set the­ory (and hence not the stan­dard one), but it is still eas­ily de­scrib­able.

• There are many more ways for com­plex uni­verses to have vi­o­lated our ex­pec­ta­tions already, while still mak­ing sapi­ence pos­si­ble, than for sim­ple uni­verses to have done so.

How do we know they haven’t?

• Ac­tu­ally, here’s maybe a bet­ter way of say­ing what I’m try­ing to get at here: What ev­i­dence would ac­tu­ally con­vince you that we lived in a uni­verse given by a non-stan­dard model of the nat­u­ral num­bers?

Before you say “I run a computer program which returns a proof of 0=1 in PA”, think about it for a while. Ignoring the fact that in practice you would probably suspect the proof was wrong, might you instead take this as evidence that PA is actually inconsistent? After all, Gödel tells us that we can’t really be sure about that. In fact, if you did live in a nonstandard model of PA, and found a nonstandard proof of 0=1, wouldn’t it feel from the inside like you had found an inconsistency in PA rather than a nonstandard number?

• “ZF can quote for­mu­las as struc­tured, and en­code mod­els as sets …

Was there sup­posed to be a word af­ter “struc­tured” and be­fore the comma?

• “So the ques­tion you asked me last time was, ‘Why does any­one bother with first-or­der logic at all, if sec­ond-or­der logic is so much more pow­er­ful?’”

For what it’s worth, math­e­mat­i­ci­ans care about first-or­der logic be­cause it satis­fies nicer the­o­rems than sec­ond-or­der logic, in the same way that math­e­mat­i­ci­ans care about Lebesgue in­te­gra­tion be­cause it satis­fies nicer the­o­rems than Hen­stock-Kurzweil in­te­gra­tion. The state­ments that math­e­mat­i­ci­ans talk about in many con­texts are first-or­der any­way, so we might as well figure out what there is to say about first-or­der state­ments so we can ap­ply it to the rele­vant situ­a­tion (in the same way that the func­tions that math­e­mat­i­ci­ans want to in­te­grate in many con­texts are Lebesgue in­te­grable any­way, so...).

In math­e­mat­ics there is usu­ally a trade­off be­tween gen­er­al­ity and power: the more gen­eral a the­ory is, the less it can say. That’s why ab­stract alge­bra text­books gen­er­ally talk about groups in­stead of monoids even though the lat­ter are more gen­eral: there are lots of nice the­o­rems about, say, finite groups that just fail mis­er­ably for finite monoids, and so it’s much harder to say non­triv­ial things about finite monoids than it is to say non­triv­ial things about finite groups.

And first-or­der logic is a sur­pris­ingly pow­er­ful tool in pure math­e­mat­ics: in the guise of model the­ory it has lots of in­ter­est­ing ap­pli­ca­tions, e.g. Hrushovski’s proof of the Mordell con­jec­ture over func­tion fields.

• I do not think the situ­a­tion is as sim­ple as you claim it to be. Con­sider that a cat­e­gory is more gen­eral than a monoid, but there are many in­ter­est­ing the­o­rems about cat­e­gories.

As far as foundations for mathematical logic go, anyone interested in such things should be made aware of the recent invention of univalent type theory. This can be seen as a logic which is inherently higher-order, but it also has many other desirable properties. See for instance this recent blog post: http://golem.ph.utexas.edu/category/2013/01/from_set_theory_to_type_theory.html#more

That uni­va­lent type the­ory is only a few years old is a sign we are not close to fully un­der­stand­ing what foun­da­tional logic is most con­ve­nient. For ex­am­ple, one might hope for a fully di­rected ho­mo­topy type the­ory, which I don’t doubt will ap­pear a few years down the line.

• Sure. It has of course been the case that care­ful in­creases in gen­er­al­ity have also led to in­creases in power (hence the weasel word “usu­ally”). There is some­thing like a “pro­duc­tion-pos­si­bil­ities fron­tier” re­lat­ing gen­er­al­ity and power, and some con­cepts are near the fron­tier (and so one can’t gen­er­al­ize them with­out los­ing some power) while some are not.

• Actually, in NBG you have explicitness of assumptions and of first-order logic, and at the same time the axiom of induction is a single axiom.

Ac­tu­ally, if you care about car­di­nal­ity, you need a well-speci­fied set the­ory more than just ax­ioms of re­als. Se­cond-or­der the­ory has a unique model, yes, but it has the no­tion of “all” sub­sets, so it just smug­gles some set the­ory with­out spec­i­fy­ing it. As I un­der­stand, this was the mo­ti­va­tion for Henkin se­man­tics.

And if you look for a set theory (explicit or implicit) for the reals as used in physics, I am not even sure you want ZFC. For example, Solovay has shown that you can use a set theory where all sets of reals are measurable without much risk of contradiction. After all, the unlimited axiom of choice is not that natural for physical intuition.

• Then you could ar­guably deny that shout­ing about the ‘stan­dard’ num­bers wouldn’t mean anything

Too many con­tra­dic­tions, I think: it should read “deny that shout­ing about the stan­dard num­bers would mean any­thing”.

(Re­moved wrong edit here)

• “ZF can quote for­mu­las as struc­tured, and en­code mod­els as sets, and then rep­re­sent a finite ZF-for­mula which says whether or not a set of quoted for­mu­las is true about a model. ZF can then prove that no step of clas­si­cal logic goes from premises that are true in­side a set-model, to premises that are false in­side a set-model. In other words, ZF can rep­re­sent the idea ‘for­mula X is se­man­ti­cally true in model Y’ and ‘these syn­tac­tic rules never pro­duce se­man­ti­cally false state­ments from se­man­ti­cally true state­ments’.”

Wait, then why can’t ZF prove it­self con­sis­tent? If ZF be­lieves in all the ax­ioms of ZF, and it be­lieves that logic never pro­duces false state­ments from true state­ments -

“Ah, but ZF can’t prove that there exists any set which is a model of ZF, so it can’t prove the ZF-axioms are consistent.”

Note: this is actually connected to an ambiguity in “ZF believes in all the axioms of ZF”. That system has infinitely many formal axioms, and lacks any finite way to assert them all at once (e.g., to use them all in a particular proof). If it could, then it could use a principle appropriately called “reflection” to show that they must all be true inside a particular set—a natural model like the ones Eliezer talks about in the OP, though not provably as large as those—and could thus prove itself consistent.

• This dis­cus­sion seems to equate the ques­tion of what lan­guage we want an AGI to most of­ten rea­son with, and what lan­guage we think our uni­verse, or (Teg­mark?-)re­al­ity in gen­eral, is likely ‘writ­ten’ in. But there may be prac­ti­cal virtues of one lan­guage over the other that do not trans­late into dis­tinct rea­sons to be­lieve that re­al­ity is writ­ten in the more prac­ti­cal lan­guage; so the two ques­tions should be kept sep­a­rate to some ex­tent. ‘Think like re­al­ity’ is a heuris­tic for helping alle­vi­ate dis­so­nance be­tween our be­liefs, and bet­ter al­ign­ing our be­liefs with our goals; but it ad­mits of ex­cep­tions, both epistem­i­cally and in­stru­men­tally.

One sim­ple way to an­swer the AGI-spe­cific ques­tion is to have the AGI ‘think’ in whichever lan­guage seems more use­ful (keep­ing in mind that hav­ing true be­liefs is usu­ally, though not always, very use­ful!), but try as much as pos­si­ble to keep the AGI from as­sum­ing that the ax­ioms and in­fer­ence rules with which it (ini­tially?) thinks must be the same as the ones that best char­ac­ter­ize ul­ti­mate re­al­ity. In­stead, whichever lan­guage feels more ‘nat­u­ral’ to the AGI, we want it to be able to do the same sort of in­ner-di­alogue rea­son­ing that Eliezer him­self is do­ing in this se­ries of vi­gnettes, weigh­ing the em­piri­cal ev­i­dence for and against the uni­verse’s ‘ob­jec­tive sec­ond-or­dered­ness’ and reach­ing a con­clu­sion pro­por­tioned to the ev­i­dence. The AGI should, as much as is fea­si­ble and prac­ti­cal, be rea­son­ably skep­ti­cal of its own ba­sic mode of op­er­a­tion, even if we limit its ca­pac­ity to self-mod­ify (ei­ther di­rectly, or via its pri­ors) in such a way that it can’t help but think in terms of cer­tain ax­ioms the ev­i­dence re­veals to be false.

A limiting case: Supposing the AGI discovers Sylvan’s Box, we want it to update in favor of dialetheism, even if it was initially most practical to code it to think in terms of classical logic. At least, that’s the ideal; it may be that the costs of actually programming an AGI to be able to take seriously such a vast theory-space, given just how unlikely dialetheist scenarios are, outweigh the virtue of permitting the AGI to update in favor of irreconcilable contradictions. But that’s a pragmatic consideration; we should still hold in principle to such Tarski litany-instances as:

• If any propo­si­tion is both true and false, I de­sire to be­lieve that some propo­si­tion is both true and false.

• If dis­junc­tive syl­l­o­gism is not always valid, I de­sire to be­lieve that dis­junc­tive syl­l­o­gism is not always valid.

• If some propo­si­tions are nei­ther true nor false, I de­sire to be­lieve that some propo­si­tions are nei­ther true nor false.

Similarly, a strictly first-or­der lo­gi­cian should grant: ‘If the uni­verse is struc­tured sec­ond-or­deredly, I de­sire to be­lieve that the uni­verse is struc­tured sec­ond-or­deredly.’

Even the alien race who can­not think in sec­ond-or­der terms should at least grant: ‘If I can­not con­ceive of how the uni­verse is struc­tured, I de­sire to be­lieve that I can­not con­ceive of how the uni­verse is struc­tured.’

The flip side of this: If we found it more use­ful to make the AGI rea­son para­con­sis­tently (i.e., to no­tice con­tra­dic­tions in its own be­liefs and non-ex­plo­sively rea­son with them in the hopes of root­ing out what went wrong, like hu­mans do), this on its own would in no way give us rea­son to pro­gram the AGI to be­lieve that the phys­i­cal (or pla­tonic) uni­verse har­bors ob­jec­tive con­tra­dic­tions of any sort, or to be eas­ily (or even pos­si­bly) per­suaded of this propo­si­tion by fu­ture ev­i­dence.

• We should get this au­to­mat­i­cally in many cases. Clas­si­cal logic can think about di­alethe­ism and other pos­si­bil­ities by in­vent­ing nega­tive pred­i­cates for ev­ery pred­i­cate, say. The rules re­lat­ing pos­i­tive to nega­tive can then be cho­sen at con­ve­nience. (In first-or­der logic, we would need to in­vent a nega­tive pred­i­cate sep­a­rately for each pos­i­tive; in sec­ond-or­der logic, we could make a gen­eral state­ment to the effect that each pos­i­tive pred­i­cate has a nega­tive, and make state­ments about how they com­bine.)

That’s one reason we don’t need to worry too much about which logic is correct; as Johnicholas keeps pointing out, the existence of encodings means (at least in many cases) we get the same power regardless of what choice we make.
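The “negative predicates” device from the previous paragraph can be shown in miniature. This is only a sketch with an invented fact-set representation: each predicate P gets an independent negative twin, so “both P and P-negative” is a perfectly consistent classical state that represents a dialetheia, and nothing about unrelated predicates follows from it (no explosion).

```python
# Minimal encoding of dialetheism-talk inside classical logic: represent
# each predicate P by two independent predicates, P and P_neg, instead of
# using classical negation.

facts = {("P", "a"), ("P_neg", "a"), ("Q", "b")}   # P(a) is a "glut"

def holds(pred, x):
    """Classical truth of an atomic fact in our toy fact set."""
    return (pred, x) in facts

def glut(pred, x):
    """Represented-true and represented-false at once: an encoded dialetheia."""
    return holds(pred, x) and holds(pred + "_neg", x)

assert glut("P", "a")         # a dialetheia, encoded without contradiction
assert not glut("Q", "b")     # Q(b) is just plain true
assert not holds("R", "c")    # the glut at P(a) entails nothing about R(c)
```

In first-order logic each negative twin must be introduced separately; the second-order version would state once that every positive predicate has a negative twin, as the comment above notes.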

• Clas­si­cal logic can think about di­alethe­ism and other pos­si­bil­ities by in­vent­ing nega­tive pred­i­cates for ev­ery pred­i­cate, say.

In ad­di­tion to con­ven­tional nega­tion, or in lieu of it? Re­mem­ber that, on di­alethe­ism, we may not know in ad­vance which false­hoods are go­ing to end up later turn­ing out to also be true. So we’d never be able to as­sert real nega­tions (as op­posed to ‘nega­tivized’ pred­i­cates) if we want to com­pletely pro­tect our­selves from ever go­ing para­con­sis­tent/​triv­ial­ist (de­pend­ing on where our rea­son­ing sys­tem puts us when we en­counter a con­tra­dic­tion).

This issue isn’t just one of computational power; it’s one of choosing priors for our AGI’s basic interpretations of reality. I distinguished the two questions “what language we want an AGI to most often reason with” v. “what language we want the AGI to think our world is written in”; if we think that the AGI’s functionality will best be met if it has a somewhat false view of the world in some respect (which is dangerous but might have non-insane justifications), then the latter question will be distinct from “what language the world is actually written in.” But I can agree with you that it’s mostly just an engineering problem to answer the first question (i.e., we don’t need to settle every metaphysical debate before we can start coding!), without thinking this trivializes the second question.

• What I’m saying is that it doesn’t matter, and therefore doesn’t obviously make sense to ask, what language the world is written in. Even if the world actually runs on paraconsistent logic, we can still do fine with classical logic. It will alter our prior somewhat, but not a whole lot, because the encoding of one into the other isn’t so hard.

Eliezer is at­tempt­ing to make some com­par­i­son be­tween the struc­ture of the ac­tual world and the struc­ture of the logic, to de­ter­mine which logic most seems to be what the world is writ­ten in. But the ex­is­tence of en­cod­ings of one logic in an­other makes this ex­er­cise some­what un­nec­es­sary (and also means that the struc­ture of the world is only weak ev­i­dence for “what logic it was writ­ten in”).

To an­swer your ques­tion: in ad­di­tion, be­cause hav­ing strong nega­tion in ad­di­tion to weak nega­tion only in­creases the ex­pres­sive power of the sys­tem. It does not in­crease the risk you men­tion, be­cause the sys­tem is still choos­ing what to be­lieve ac­cord­ing to its learn­ing al­gorithm, and so will not choose to be­lieve strong nega­tives that cause prob­lems.

EDIT:

In response to srn347, I would like to clarify that I do not intend to belittle the importance of comparing the effectiveness of different logics. The existence of an encoding will only get you specific things (depending on the nature of the encoding), and furthermore, encodings do not always exist. So, specifically, I’m only saying that I don’t see a need for paraconsistent logic. Other possibilities need to be judged on their merits. However, looking for encodings is an important tool in that judgement.

• it doesn’t mat­ter, and there­fore doesn’t ob­vi­ously make sense to ask [...]

‘Doesn’t mat­ter’ in what sense? ‘Doesn’t make sense’ in what sense?

what lan­guage the world is writ­ten in.

Be careful not to confuse the literal question ‘Is our world actually a simulation programmed in part using second-order logic?’ with the more generic ‘Is our world structured in a second-order-logic-ish way?’. The latter, I believe, is what Eliezer was concerned with: he wasn’t assuming we were in a simulation; he was asking whether our universe’s laws or regularities are property-generalizing in a second-order style, or schematic in a first-order style, the importance being that schematic laws are more likely to vary across spacetime. Similarly, the important question of whether there are dialetheias should not be confused with the more occult question of whether our universe is a simulation written using paraconsistent logics (which, of course, need not involve any true contradictions).

Even if the world actually runs on paraconsistent logic, we can still do fine with classical logic. It will alter our prior somewhat, but not a whole lot, because the encoding of one into the other isn’t so hard.

This is con­fus­ing a num­ber of differ­ent is­sues. Para­con­sis­tent logic need not be di­alethe­ist, and not all di­alethe­ists fa­vor para­con­sis­tent logic (though, for ob­vi­ous rea­sons, nearly all of them do). You’re re­con­fus­ing the is­sues I dis­en­tan­gled in my ear­lier posts, while (if I’m not mi­s­un­der­stand­ing you) try­ing to make a scram­bled ver­sion of my ear­lier point that our choice of logic to have the AGI rea­son with is partly in­de­pen­dent of our choice of logic to have the AGI think the uni­verse struc­turally re­sem­bles.

If you think it doesn’t make sense to ask what log­ics our uni­verse struc­turally re­sem­bles, re­mem­ber that the struc­tural similar­ity be­tween ar­ith­metic op­er­a­tions and phys­i­cal pro­cesses is what makes math­e­mat­ics use­ful in the first place, and that we should not be sur­prised to see similar­i­ties be­tween the sim­plest sys­tems we can come up with, and a uni­verse with re­mark­ably sim­ple and uniform be­hav­ior, at the fun­da­men­tal level.

It does not in­crease the risk you men­tion, be­cause the sys­tem is still choos­ing what to be­lieve ac­cord­ing to its learn­ing al­gorithm, and so will not choose to be­lieve strong nega­tives that cause prob­lems.

That’s as­sum­ing it never makes a mis­take early on. If it does, and it’s not a para­con­sis­tent rea­soner, its episte­mol­ogy will ex­plode. You un­der­es­ti­mate the virtues of para­con­sis­tent log­ics for com­plex rea­son­ing sys­tems; they’re very valuable for epistemic self-de­bug­ging, as op­posed to shut­ting the whole thing down if even the small­est mis­take oc­curs. And, again, none of this re­quires the slight­est com­mit­ment to di­aletheias be­ing pos­si­ble.

• That’s as­sum­ing it never makes a mis­take early on. If it does, and it’s not a para­con­sis­tent rea­soner, its episte­mol­ogy will ex­plode.

I’m imagining a better-designed system than that. Incoming facts together with internal reasoning would constantly be invalidating various theories; the system shouldn’t have a bunch of commitments which explode on it, but rather should always be seeking out contradictions in order to invalidate theories.

You un­der­es­ti­mate the virtues of para­con­sis­tent log­ics for com­plex rea­son­ing sys­tems; [...] none of this re­quires the slight­est com­mit­ment to di­aletheias be­ing pos­si­ble.

Per­haps so. I don’t ac­tu­ally un­der­stand what it would mean to rea­son with a para­con­sis­tent logic while still be­liev­ing a clas­si­cal logic. Is some­thing like that pos­si­ble? Or (as it seems) would rea­son­ing via a para­con­sis­tent logic have to re­duce the num­ber of tau­tolo­gies I rec­og­nize? I’ve sort of ig­nored para­con­sis­tent thought merely be­cause I’m not in­ter­ested in di­alethe­ism...

You’re re­con­fus­ing the is­sues I dis­en­tan­gled in my ear­lier posts, while (if I’m not mi­s­un­der­stand­ing you) try­ing to make a scram­bled ver­sion of my ear­lier point that our choice of logic to have the AGI rea­son with is partly in­de­pen­dent of our choice of logic to have the AGI think the uni­verse struc­turally re­sem­bles.

Yes, re-read­ing your origi­nal post, I can see that I missed that point (and was par­tially re-stat­ing it). The ac­tual point I’m try­ing to make is ba­si­cally to re­spond to this part:

but try as much as pos­si­ble to keep the AGI from as­sum­ing that the ax­ioms and in­fer­ence rules with which it (ini­tially?) thinks must be the same as the ones that best char­ac­ter­ize ul­ti­mate re­al­ity. In­stead, whichever lan­guage feels more ‘nat­u­ral’ to the AGI, we want it to be able to do the same sort of in­ner-di­alogue rea­son­ing that Eliezer him­self is do­ing in this se­ries of vi­gnettes, [...]

by say­ing that we get much of that au­to­mat­i­cally. Really, what I should have said is that we can achieve this by choos­ing as ex­pres­sive a logic as pos­si­ble, to en­sure that as many other log­ics as pos­si­ble have good em­bed­dings in that logic.

This could alternatively be stated as an expression of confusion about why you would want to address this concern separately from the choice of logic, since if we attempted to implement the aforementioned inner-dialogue reasoning, we would surely have to provide a logic for the reasoning to take place in.

• I don’t ac­tu­ally un­der­stand what it would mean to rea­son with a para­con­sis­tent logic while still be­liev­ing a clas­si­cal logic. Is some­thing like that pos­si­ble?

Not only is it pos­si­ble, but prob­a­bly over 99% of peo­ple who em­ploy para­con­sis­tent log­ics be­lieve in clas­si­cal-logic meta­physics, or at least some­thing a lot closer to clas­si­cal-logic meta­physics than to di­alethe­ism. Para­con­sis­tent logic is just rea­son­ing in such a way that when you ar­rive at a con­tra­dic­tion in your be­lief sys­tem, you try to di­ag­nose and discharge (or oth­er­wise quaran­tine) the con­tra­dic­tion, rather than just con­clud­ing that any­thing fol­lows. Dialethe­ism is one way (or fam­ily of ways) to quaran­tine the con­tra­dic­tion so that it doesn’t yield ex­plo­sion, but it isn’t the stan­dard one. Para­con­sis­tent rea­son­ing is a proof method­ol­ogy, not a meta­phys­i­cal stance in its own right.
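The diagnose-and-quarantine behavior described above can be sketched in a few lines of Python. This is my own toy illustration (all names are assumed, not anyone's actual proposal): beliefs are signed atoms, and a clashing pair is set aside instead of licensing everything by explosion.

```python
# Toy paraconsistent-style belief maintenance: instead of letting p and
# not-p entail everything, diagnose the clash and quarantine that pair,
# leaving the rest of the belief base usable.

def quarantine(beliefs):
    """Split signed atoms (name, truth_value) into safe and clashing parts."""
    atoms = {a for a, _ in beliefs}
    clashing = {a for a in atoms
                if (a, True) in beliefs and (a, False) in beliefs}
    safe = {(a, v) for a, v in beliefs if a not in clashing}
    return safe, clashing

beliefs = {("p", True), ("p", False), ("q", True), ("r", False)}
safe, clashing = quarantine(beliefs)
# 'p' is diagnosed and set aside for debugging; q and r stay usable.
```

No dialetheist commitment is involved: the quarantined pair is treated as a bug to be discharged, not as a true contradiction.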

what I should have said is that we can achieve this by choos­ing as ex­pres­sive a logic as pos­si­ble, to en­sure that as many other log­ics as pos­si­ble have good em­bed­dings in that logic.

Within rea­son. We may not want the AGI to be so ex­pres­sive that it can ex­press things we think are cat­e­gor­i­cally mean­ingless and/​or use­less. And the power to ex­press some mean­ingful cir­cum­stances might come with costs that out­weigh the ex­pected util­ity. I sup­pose one way to go about this is to pick the op­ti­mal level of ex­pres­sivity given the cir­cum­stances we ex­pect the AGI to run into, but try to make the AGI able to self-mod­ify (or gen­er­ate non­clas­si­cal sub­sys­tems with which it can carry on a rea­son­able, open-minded di­alogue) to in­crease ex­pres­sivity if it runs into situ­a­tions that seem es­pe­cially anoma­lous given its con­cep­tion of what a cir­cum­stance or fact is.

The ba­sic prob­lem is: How does one as­sign a prob­a­bil­ity to there be­ing true propo­si­tions that are in­trin­si­cally in­ef­fable (for non-com­plex­ity-re­lated rea­sons)? A good start­ing place is to imag­ine a be­ing that had far less log­i­cal ex­pres­sivity than we do (e.g., some­one who could say ‘p’ /​ ‘true’ or (as a prim­i­tive con­cept) ‘un­known’ /​ ‘un­proven’ but could not say ‘not-p’ /​ ‘false’), and rea­son by anal­ogy from this base case.

if we at­tempted to im­ple­ment the afore­men­tioned in­ner-di­a­log rea­son­ing, we would surely have to provide a logic for the rea­son­ing to take place in.

Ideally, if two sub­sys­tems of an AGI are de­signed to rea­son with differ­ent log­ics, and we want the two to ar­gue over the best in­ter­pre­ta­tion of a prob­lem case, we would set­tle the dis­pute by some rule like ‘Try to prove to the satis­fac­tion of your op­po­nent that their view leads to too many cir­cum­stances we both agree are prob­lems. Avoid ques­tion-beg­ging ar­gu­ments, i.e., ones that as­sume that your logic is the right one, when that is pre­cisely what is un­der dis­pute; seek ar­gu­ments that both of you can agree are valid, or even ar­gu­ments that you think are in­valid but that you think prob­le­ma­tize your op­po­nent’s po­si­tion.’ Of course, if we need a de­ci­sion pro­ce­dure in cases where the AGI ar­rives at a stale­mate, we may need to as­sume that a cer­tain side ‘wins’ if there’s a draw.

• I think you are over­es­ti­mat­ing the difficulty of the math­e­mat­i­cal prob­lem here! To quote JoshuaFox:

(Two im­pos­si­ble things be­fore break­fast … and maybe a few more? Eliezer seems to be re­build­ing logic, set the­ory, on­tol­ogy, episte­mol­ogy, ax­iol­ogy, de­ci­sion the­ory, and more, mostly from scratch. That’s a lot of im­pos­si­bles.)

But once those prob­lems are solved, we do not need to ad­di­tion­ally solve the prob­lem you are high­light­ing, I think...

‘Try to prove to the satis­fac­tion of your op­po­nent that their view leads to too many cir­cum­stances we both agree are prob­lems. Avoid ques­tion-beg­ging ar­gu­ments, i.e., ones that as­sume that your logic is the right one, when that is pre­cisely what is un­der dis­pute; seek ar­gu­ments that both of you can agree are valid, or even ar­gu­ments that you think are in­valid but that you think prob­le­ma­tize your op­po­nent’s po­si­tion.’

When it comes down to it, wouldn’t this be just like some logic that is the com­mon sub­set of the two, or per­haps some kind of av­er­age (be­tween the prob­a­bil­ity dis­tri­bu­tions on ob­ser­va­tions in­duced by each logic)? Again, I think this will be han­dled well enough (han­dled bet­ter, to be pre­cise) by a more pow­er­ful logic which can ex­press each of the two nar­rower log­ics as a hy­poth­e­sis about the struc­ture in which the en­vi­ron­ment is best defined. Then the hon­est ar­gu­ment you de­scribe will be a re­sult of the hon­est at­tempt of the agent to es­ti­mate prob­a­bil­ities and find plans of ac­tion which yield util­ity re­gard­less of the sta­tus of the un­knowns.

• If I may interject (assuming it isn’t too early to start proposing solutions), it does turn out to be the case that computability logic is a superset of linear logic (which encodes resource-boundedness and avoids material-entailment paradoxes), intuitionistic logic (which encodes proof/justification), and classical logic (which encodes truth). To accept less is to sacrifice one or more of the above attributes in terms of expressiveness.

• Thanks; I cer­tainly didn’t in­tend to say that all log­ics are equiv­a­lent so it doesn’t mat­ter… ed­ited to clar­ify...

• True, but these kinds of tricks might make a hash of its utility function.

• Very true! But that is a prob­lem that already needs to be solved, sep­a­rately. :)

• Yes, and friendli­ness is already a hard enough prob­lem, with­out peo­ple dump­ing the hard parts of other prob­lems into it.

• I don’t think you can avoid solv­ing this one in the first place, so I wouldn’t call it dump­ing. (The prob­lem is already in­her­ent in the friendli­ness prob­lem, so we’re not adding any­thing new to the friendli­ness prob­lem by keep­ing this out of the logic prob­lem.) The fact is that goals which have to do with ex­ter­nal ob­jects need to make use of “fuzzy” de­scrip­tions of those ob­jects. There should be some rea­son­able way of for­mal­iz­ing fuzzi­ness. (Of course stan­dard fuzzy logic the­ory is woe­fully in­ad­e­quate, but it has roughly the right goal, so we can re-use the name.) Utility needs to be a func­tion of some “em­piri­cal clusters” in re­al­ity, speci­fied in­de­pen­dently from the un­der­ly­ing physics (or com­plete list of bor­der cases) of those en­tities. En­cod­ing a differ­ent logic into our logic is not much differ­ent from en­cod­ing a physics, so I think there is not a sep­a­rate prob­lem to be solved.

• Hey, that was really interesting, about whether or not “all properties” in second-order logic falls victim to the same weirdness as the idea of a natural number does in first-order logic. I never thought of that before.

Oh yeah, do you have a reference for “there’s tricks for making second-order logic encode any proposition in third-order logic and so on”? I would like to study that too, even though you’ve already given me a lot of things to read about already!

• Doesn’t answer my question.

• It does.

Sec­tion 4 in the ar­ti­cle linked in the grand­par­ent dis­cusses higher-or­der log­ics and how they can all be simu­lated by and re­duced to sec­ond-or­der logic.

• In recent years second-order logic has made something of a recovery, buoyed by George Boolos’ interpretation of second-order quantification as plural quantification over the same domain of objects as first-order quantification (Boolos 1984). Boolos furthermore points to the claimed nonfirstorderizability of sentences such as “Some critics admire only each other” and “Some of Fianchetto’s men went into the warehouse unaccompanied by anyone else”, which he argues can only be expressed by the full force of second-order quantification. However, generalized quantification and partially ordered, or branching, quantification may suffice to express a certain class of purportedly nonfirstorderizable sentences as well, without appealing to second-order quantification.

https://en.wikipedia.org/wiki/Nonfirstorderizability

• I fail to see how this is ev­i­dence of Non­sec­on­dorder­iz­abil­ity of some pos­si­ble sen­tences.

There is no known trick to en­code all sen­tences ex­press­ible in higher-or­der log­ics into first-or­der logic, but there is such a trick to en­code all sen­tences ex­press­ible in higher-or­der log­ics in sec­ond-or­der logic.

The trick in ques­tion is de­scribed in the SEP ar­ti­cle. Doesn’t that suffice as a refer­ence and start­ing point for study­ing the no­tion that sec­ond-or­der logic can en­code higher-or­der log­ics?

• When I learned about Löwenheim-Skolem, I just thought “lol first-order logic sux”. I didn’t question whether distinguishing among infinite cardinalities was a meaningful thing for humans to do. Eliezer is clearly a higher-order student of logic.

• Er, to be clear about not taking credit, this is a long-running philosophical debate in mathematics. The parts about “Well, what physics, then?” I haven’t read elsewhere, but that could just be limited reading.

• And what about Skolem’s Para­dox? Does it make you think “lol set the­ory sux”? :)

• As far as complexity-of-logic-theories-for-reason-of-believing-in-them goes, that should be proportional to the size of the minimal Turing machine that would check whether something is an axiom or not. (Of course, in the case of a finite list, approximating it by the total length of the axioms is reasonable, because a Turing machine that does “check if input is equal to the following set:” followed by the set adds only a constant size to the set; but that approximation breaks down badly for an infinite axiom schema.)
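To illustrate the point about infinite axiom schemas (my example, not the commenter's): an infinite set of axioms can have a tiny recognizer, which is why the size of the minimal checking machine, rather than the total length of the axioms, is the sensible measure.

```python
import re

# An infinite axiom schema, 'n + 0 = n' for every numeral n, recognized
# by a constant-size checker: the axiom *list* is infinite, but the
# *checker* is a few lines.

def is_axiom(s: str) -> bool:
    m = re.fullmatch(r"(\d+)\+0=(\d+)", s)
    return bool(m) and m.group(1) == m.group(2)
```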

• That might sound weird, but do we have any evidence that our time follows the standard numbers (or a continuous version of them) only? Is it even possible to get such evidence? Maybe our Turing machine (looking for contradictions in PA) stops at −2*: “we” cannot see it, as “we” are on the standard numbers only, experiencing only effects of previous standard numbers.

• Time is a prop­erty of the map, not a prop­erty of the ter­ri­tory. The ter­ri­tory prob­a­bly doesn’t have any­thing that could rea­son­ably be called time, along the lines of the time­less physics posts.

• In this approach, everything is a property of the map only.

(One thing is to say that time isn’t, from the point of view of the fundamental microscopic laws, as universal and important as might intuitively appear. Another thing is to say that it doesn’t exist in the territory. There are natural clocks, such as decaying radioactive elements, whose existence has little in common with human “maps”.)

• Barbour’s universe/multiverse is much more complex than a classical timeful universe. And it isn’t especially likely, either. Speculation =/= proof. To say the least.

• Could you ex­pand on that? I had the im­pres­sion that time­less physics was larger in RAM, but not on disk—where it counts for MML, Solomonoff In­duc­tion, etc.

• I’m afraid I don’t un­der­stand your RAM/​disk metaphor.

• I took it to mean that it re­quires you to keep track of more state at any given time, but that the whole sys­tem has less log­i­cal com­plex­ity when you write it all down. Sort of like the differ­ence be­tween es­ti­mat­ing the fu­ture state of a re­stricted N-body prob­lem with nu­mer­i­cal meth­ods vs. solv­ing it di­rectly.

• There are no “given times” in Bar­bour’s uni­verse. And who’s the “you”? An in­ter­nal or ex­ter­nal ob­server?

• “Doesn’t physics say this universe is going to run out of negentropy before you can do an infinite amount of computation?” Actually, there is a proposal that could create a computer that runs forever.

• Are these time crys­tals made up of pro­tons? If so, pro­ton de­cay might even­tu­ally cause it to dis­in­te­grate.

• 5 Jan 2013 1:18 UTC

“If there’s a col­lec­tion of third-or­der ax­ioms that char­ac­ter­izes a model, there’s a col­lec­tion of sec­ond-or­der ax­ioms that char­ac­ter­izes the same model. Once you make the jump to sec­ond-or­der logic, you’re done—so far as any­one knows (so far as I know) there’s noth­ing more pow­er­ful than sec­ond-or­der logic in terms of which mod­els it can char­ac­ter­ize.”

You can clearly state the Continuum Hypothesis in higher-order logic, while a second-order formulation seems elusive. Are you sure about this?

• 5 Jan 2013 2:22 UTC

Eliezer is cor­rect. See SEP on HOL.

• The link is good, but I guess a direct explanation of this simple thing could be useful.

It is not hard to build an explicit map between R and R² (more or less by interleaving the binary notations for the numbers).

So the claim of Con­tinuum Hy­poth­e­sis is:

For every property of real numbers P there exists a property of pairs of real numbers Q such that:

1) ∀x (P(x) → ∃! y Q(x,y))

2) ∀x (¬P(x) → ∀y¬Q(x,y))

(i.e., Q describes a mapping from the support of P to R)

3) ∀x1,x2,y: ((Q(x1,y)^Q(x2,y)) → x1=x2)

(i.e. the map is an in­jec­tion)

4) (∀y ∃x Q(x,y)) ∨ (∀x∀y (Q(x,y)-> y∈N))

(i.e., the map is either a surjection onto R or an injection into a subset of N)

Th­ese con­di­tions say that ev­ery sub­set of R is ei­ther the size of R or no big­ger than N.
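For concreteness, the interleaving map mentioned above can be sketched in Python on finite binary expansions. This is a toy: genuine reals need care about dual expansions (0.0111… = 0.1000…), which the finite version ignores.

```python
# Interleave the digits (after the binary point) of two numbers in [0, 1)
# into a single digit string, and split it back apart.

def interleave(a: str, b: str) -> str:
    n = max(len(a), len(b))
    a, b = a.ljust(n, "0"), b.ljust(n, "0")   # pad with trailing zeros
    return "".join(x + y for x, y in zip(a, b))

def deinterleave(c: str):
    return c[0::2], c[1::2]   # even positions from a, odd positions from b

code = interleave("101", "011")   # digits of the pair (0.101, 0.011)
```

Because the map is invertible, it witnesses the bijection between R² and R that the quoted construction relies on.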

• 4 Jan 2013 21:18 UTC

• Second-order logic may be able to express all expressible statements in third-order logic, fourth-order, up to any finite Nth order, but perhaps there exist ∞th-order statements that second-order logic cannot express. Admittedly, such statements could never be fully decidable and would thus be, at best, semidecidable or co-semidecidable. This may not be complete, but science works the same way.

• Nope, 2nd-or­der logic can dis­cuss Nth or­der where N is an in­finite or­di­nal, as well [for some class of de­scrib­able or­di­nals]. Maybe the ar­gu­ment could be made that 2nd-or­der logic can­not dis­cuss a uni­verse con­tain­ing all the or­di­nals, and in that case maybe we could ar­gue that set the­ory can, and is there­fore more pow­er­ful. But this is not clear.

Independently of that, we might also believe that category theory can describe things beyond set theory, because category theory regularly investigates categories which correspond to “large” sets (such as the set of all sets). There are ways to put category theory into set theory, but these translate the “large” sets into “small” sets such as Grothendieck universes, so we could still argue that the semantics of category theory is bigger.

How­ever, even if this is the case, it might be that 2nd-or­der logic can en­code cat­e­gory the­ory in the same way that it can en­code 3rd-or­der logic. We can add ax­ioms to 2nd-or­der logic which de­scribe non-stan­dard set the­o­ries con­tain­ing “big” sets, such as NF. This may al­low for an “hon­est” ac­count of cat­e­gory the­ory, in which cat­e­gories re­ally can be defined on “big” sets such as the set of all sets.

• I don’t think that in­fini­tary logic is the same as ω-or­der logic.

• 5 Jan 2013 2:13 UTC

Wouldn’t ω-or­der logic be a sub­set of in­fini­tary logic? Or do I have it back­wards?

• I don’t think they have any­thing to do with each other. In­fini­tary logic is first-or­der logic with in­finite proof lengths. Se­cond-or­der logic is finite proof lengths with quan­tifi­ca­tion over pred­i­cates. I don’t know if there’s any par­tic­u­lar known re­la­tion be­tween what these two the­o­ries can ex­press.

• Third or­der logic talks about some­thing that we don’t un­der­stand, and we don’t un­der­stand it in ex­actly the same way that the first-or­der aliens can’t un­der­stand sec­ond or­der logic.

One of the things third-order logic might be able to do is determine the number of steps a Turing machine will take before halting, in a constant-time algorithm.

• 8 Jan 2013 5:42 UTC

Actually, solving the halting problem has been proven impossible, for if third-order logic could be used to determine if/when a Turing machine halts, a Turing machine could run that on itself, but it would necessarily take more than N steps to determine that it halts (where N is the number of steps it halts in), which is a contradiction. Third-order logic may be able to do the inconceivable, but not the noncomputable.

• That’s not the standard proof, at least not the one I know: you haven’t proven that it would necessarily take more than N steps. The way it goes is that you assume there is a machine that halts iff the machine you give it runs forever given itself as input; now if you run it on itself, then if it halts, it is wrong, and if it doesn’t halt, it is also wrong, meaning our assumption was wrong.
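The diagonal argument just described can be sketched in Python. This is my illustration of the standard construction (all names are assumed): given any candidate halting decider, taken to be a total deterministic function, we build a program that does the opposite of whatever the decider predicts about it, so the decider must be wrong somewhere.

```python
# Diagonalization: from a claimed decider, build a counterexample program.

def build_counterexample(halts):
    def g():
        if halts(g):
            while True:   # predicted to halt, so run forever
                pass
        # predicted to run forever, so halt immediately
    return g

def decider_is_wrong(halts):
    g = build_counterexample(halts)
    prediction = halts(g)
    # Because halts is assumed total and deterministic, g's actual
    # behavior is the opposite of the prediction, by construction.
    actually_halts = not prediction
    return prediction != actually_halts

# Every total candidate fails on its own counterexample, e.g.:
trivially_yes = lambda machine: True
trivially_no = lambda machine: False
```

Note that g itself is never run; the contradiction is derived purely from the decider's own answer about g.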

• Turing machines can’t use third-order logic. That might be a limitation of the programmer, or it might be a limitation of deterministic machines. A Turing machine can’t ever compute anything about its own output; third-order logic might allow deterministic machines which can compute the results of their own output.

I sus­pect that there is a halt­ing prob­lem where the halt­ing sta­tus of ma­chines which only use sec­ond or­der logic can­not be com­puted in third or­der logic, just like limits can­not be com­puted in Peano ar­ith­metic.