# AlephNeil

Karma: 986
• You should be able to get it as a corollary of the lemma that given two disjoint convex subsets U and V of R^n (which are a non-zero distance apart), there exists an affine function f on R^n such that f(u) > 0 for all u in U and f(v) < 0 for all v in V.

Our two convex sets being (1) the image of the simplex under the F_i : i = 1 … n and (2) the “negative quadrant” of R^n (i.e. the set of points all of whose co-ordinates are non-positive).
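For reference, a statement of that lemma (a strict separation form of the separating hyperplane theorem; the notation below is mine):

```latex
% If U, V \subseteq \mathbb{R}^n are disjoint convex sets with
% \inf_{u \in U,\, v \in V} \lVert u - v \rVert > 0, then there exist
% a \in \mathbb{R}^n and b \in \mathbb{R} such that the affine map
% f(z) = \langle a, z \rangle + b satisfies
f(u) = \langle a, u \rangle + b > 0 \quad \forall u \in U,
\qquad
f(v) = \langle a, v \rangle + b < 0 \quad \forall v \in V.
```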

• an authoritative payoff matrix that X can’t safely calculate xerself.

Why not? Can’t the payoff matrix be “read off” from the “world program” (assuming X isn’t just ‘given’ the payoff matrix as an argument)?

1. Actually, this is an open problem so far as I know: show that if X is a Naive Decision Theory agent as above, with some analyzable inference module like a halting oracle, then there exists an agent Y written so that X cooperates against Y in a Prisoner’s Dilemma while Y defects.

Let me just spell out to myself what would have to happen in this instance. For definiteness, let’s take the payoffs in Prisoner’s Dilemma to be \$0 (CD), \$1 (DD), \$10 (CC) and \$11 (DC).

Now, if X is going to co-operate and Y is going to defect then X is going to prove “If I co-operate then I get \$0”. Therefore, in order to co-operate, X must also prove the spurious counterfactual “If I defect then I get \$x” for some negative value of x.

But suppose I tweak the definition of the NDT agent so that whenever it can prove (1) “if output = a then utility >= u” and (2) “if output != a then utility <= u”, it will immediately output a. (And if several statements of the forms (1) and (2) have been proved, then the agent searches for them in the order that they were proved.) Note that our agent will quickly prove “if output = ‘defect’ then utility >= \$1”. So if it ever managed to prove “if output = ‘co-operate’ then utility = \$0”, it would defect right away.

Since I have tweaked the definition, this doesn’t address your ‘open problem’ (which I think is a very interesting one) but it does show that if we replace the NDT agent with something only slightly less naive, then the answer is that no such Y exists.

(We could replace Prisoner’s Dilemma with an alternative game where each player has a third option called “nuclear holocaust”, such that if either player opts for nuclear holocaust then both get (say) -\$1, and ask the same question as in your note 2. Then even for the tweaked version of X it’s not clear that no such Y exists.)

ETA: I’m afraid my idea doesn’t work: The problem is that the agent will also quickly prove “if ‘co-operate’ then I receive at least \$0.” So if it can prove the spurious counterfactual “if ‘defect’ then receive −1” before proving the ‘real’ counterfactual “if ‘co-operate’ then receive 0” then it will co-operate.
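A toy simulation of the tweaked rule makes the failure mode concrete. The tuple encoding of proved statements, and every name below, are my own illustrative inventions, not part of the original agent definition:

```python
def tweaked_ndt(proved):
    """Toy model of the tweaked agent: scan statements in the order they
    were proved, and output action a as soon as, for some bound u, it holds
    both (1) "if output = a then utility >= u" and
         (2) "if output != a then utility <= u".

    `proved` is a list of tuples (kind, action, u), where kind is
    'if_eq' for statements of form (1) and 'if_neq' for form (2).
    """
    lower = {}  # action -> best proved lower bound, from 'if_eq' statements
    upper = {}  # action -> best proved upper bound on the alternatives
    for kind, action, u in proved:
        if kind == 'if_eq':
            lower[action] = max(lower.get(action, u), u)
        else:
            upper[action] = min(upper.get(action, u), u)
        # (1) and (2) both hold for any u in the overlap of the two bounds:
        if action in lower and action in upper and lower[action] >= upper[action]:
            return action
    return None

# Intended behaviour: the true bounds for 'defect' arrive first.
print(tweaked_ndt([('if_eq', 'defect', 1), ('if_neq', 'defect', 1)]))

# Failure mode from the ETA: the agent quickly proves
# "if co-operate then utility >= 0", and if the spurious
# "if defect then receive -1" arrives before the real counterfactual,
# the pair for 'co-operate' fires first and the agent co-operates.
print(tweaked_ndt([('if_eq', 'co-operate', 0), ('if_neq', 'co-operate', -1)]))
```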

We could patch this up with a rule that said “if we deduce a contradiction from the assumption ‘output = a’ then immediately output a” which, if I remember rightly, is Nesov’s idea about “playing chicken with the inconsistency”. Then on deducing the spurious counterfactual “if ‘defect’ then receive −1” the agent would immediately defect, which could only happen if the agent itself were inconsistent. So if the agent is consistent, it will never deduce this spurious counterfactual. But of course, this is getting even further away from the original “NDT”.

• [general comment on sequence, not this specific post.]

You have such a strong intuition that no configuration of classical point particles and forces can ever amount to conscious awareness, yet you don’t immediately generalize and say: ‘no universe capable of exhaustive description by mathematically precise laws can ever contain conscious awareness’. Why not? Surely whatever weird and wonderful elaboration of quantum theory you dream up, someone can ask the same old question: “why does this bit that you’ve conveniently labelled ‘consciousness’ actually have consciousness?”

So you want to identify ‘consciousness’ with something ontologically basic and unified, with well-defined properties (or else, to you, it doesn’t really exist at all). Yet these very things would convince me that you can’t possibly have found consciousness given that, in reality, it has ragged, ill-defined edges in time, space, even introspective content.

Stepping back a little, it strikes me that the whole concept of subjective experience has been carefully refined so that it can’t possibly be tracked down to anything ‘out there’ in the world. Kant and Wittgenstein (among others) saw this very clearly. There are many possible conclusions one might draw—Dennett despairs of philosophy and refuses to acknowledge ‘subjective experience’ at all—but I think people like Chalmers, Penrose and yourself are on a hopeless quest.

• The comprehension axiom schema (or any other construction that can be used by a proof checker algorithm) isn’t enough to prove all the statements people consider to be inescapable consequences of second-order logic.

Indeed, since the second-order theory of the real numbers is categorical, and since it can express the continuum hypothesis, an oracle for second-order validity would tell us either that CH or ¬CH is ‘valid’.

(“Set theory in sheep’s clothing”.)

• But the bigger problem is that we can’t say exactly what makes a “silly” counterfactual different from a “serious” one.

Would it be naive to hope for a criterion that roughly says: “A conditional P ⇒ Q is silly iff the ‘most economical’ way of proving it is to deduce it from ¬P or else from Q.” Something like: “there exists a proof of ¬P or of Q which is strictly shorter than the shortest proof of P ⇒ Q”?
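Writing |⊢S| for the length of the shortest proof of a statement S (a notation I am introducing here, not one from the original), the suggested criterion could be rendered as:

```latex
\text{Silly}(P \Rightarrow Q)
\;\iff\;
\min\bigl(\,\lvert \vdash \neg P \rvert,\ \lvert \vdash Q \rvert\,\bigr)
\;<\;
\lvert \vdash (P \Rightarrow Q) \rvert
```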

A totally different approach starts with the fact that your ‘lemma 1’ could be proved without knowing anything about A. Perhaps this could be deemed a sufficient condition for a counterfactual to be serious. But I guess it’s not a necessary condition?

• Suppose we had a model M that we thought described cannons and cannon balls. M consists of a set of mathematical assertions about cannons

In logic, the technical terms ‘theory’ and ‘model’ have rather precise meanings. If M is a collection of mathematical assertions then it’s a theory rather than a model.

formally independent of the mathematical system A in the sense that the addition of some axiom A0 implies Q, while the addition of its negation, ~A0, implies ~Q.

Here you need to specify that adding A0 or ~A0 doesn’t make the theory inconsistent, which is equivalent to just saying: “Neither Q nor ~Q can be deduced from A.”

Note: if by M you had actually meant a model, in the sense of model theory, then for every well-formed sentence s, either M satisfies s or M satisfies ~s. But then models are abstract mathematical objects (like ‘the integers’), and there’s usually no way to know which sentences a model satisfies.

• Perhaps a slightly simpler way would be to ‘run all algorithms simultaneously’ such that each one is slowed down by a constant factor. (E.g. at time t = (2x + 1) * 2^n, we do step x of algorithm n.) When algorithms terminate, we check (still within the same “process” and hence slowed down by a factor of 2^n) whether a solution to the problem has been generated. If so, we return it and halt.

ETA: Ah, but the business of ‘switching processes’ is going to need more than constant time. So I guess it’s not immediately clear that this works.
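The scheduling rule in the parenthesis can be sketched as follows. A finite list of generators stands in for an enumeration of all algorithms, and every name here is my own choice:

```python
import itertools

def schedule(t):
    """Decompose t = (2*x + 1) * 2**n: at global tick t we run step x of
    algorithm n.  Algorithm n thus gets one step every 2**(n+1) ticks,
    i.e. it is slowed down by a constant factor depending only on n."""
    n = 0
    while t % 2 == 0:
        t //= 2
        n += 1
    return n, (t - 1) // 2

def dovetail(algorithms, is_solution, max_ticks=10_000):
    """Interleave a list of algorithms (callables returning iterators);
    return (n, value) for the first yielded value passing is_solution."""
    gens = [iter(make()) for make in algorithms]
    halted = [False] * len(gens)
    for t in range(1, max_ticks + 1):
        n, _x = schedule(t)
        if n >= len(gens) or halted[n]:
            continue
        try:
            value = next(gens[n])  # one step of algorithm n
        except StopIteration:
            halted[n] = True       # this algorithm has terminated
            continue
        if value is not None and is_solution(value):
            return n, value
    return None

found = dovetail(
    [lambda: itertools.repeat(None),  # algorithm 0: runs forever, no answer
     lambda: iter(range(100))],       # algorithm 1: eventually yields 42
    is_solution=lambda v: v == 42,
)
```

As the ETA notes, a real machine also pays a cost to decode t and switch processes, which this sketch ignores.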

• I agree that definitions (and expansions of the language) can be useful or counterproductive, and hence are not immune from criticism. But still, I don’t think it makes sense to play the Bayesian game here and attach probabilities to different definitions/languages being correct. (Rather like how one can’t apply Bayesian reasoning in order to decide between ‘theory 1’ and ‘theory 2’ in my branching vs probability post.) Therefore, I don’t think it makes sense to calculate expected utilities by taking a weighted average over each of the possible stances one can take in the mind-body problem.

• I don’t understand the question, but perhaps I can clarify a little:

I’m trying to say that (e.g.) analytic functionalism and (e.g.) property dualism are not like inconsistent statements in the same language, one of which might be confirmed or refuted if only we knew a little more, but instead like different choices of language, which alter the set of propositions that might be true or false.

It might very well be that the expanded language of property dualism doesn’t “do” anything, in the sense that it doesn’t help us make decisions.

• Of course, we haven’t had any instances of jarring physical discontinuities not being accompanied by ‘functional discontinuities’ (hopefully it’s clear what I mean).

But the deeper point is that the whole presumption that we have ‘mental continuity’ (in a way that transcends functional organization) is an intuition founded on nothing.

(To be fair, even if we accept that these intuitions are indefensible, it remains to be explained where they come from. I don’t think it’s all that “bizarre”.)

• Nice sarcasm. So it must be really easy for you to answer my question then: “How would you show that my suggestions are less likely?”

Right?

• You really think there is logical certainty that uploading works in principle and your suggestions are exactly as likely as the suggestion ‘uploading doesn’t actually work’?

How would you show that my suggestions are less likely? The thing is, it’s not as though “nobody’s mind has annihilated” is data that we can work from. It’s impossible to have such data except in the first-person case, and even there it’s impossible to know that your mind didn’t annihilate last year and then recreate itself five seconds ago.

We’re predisposed to say that a jarring physical discontinuity (even if afterwards, we have an agent functionally equivalent to the original) is more likely to cause mind-annihilation than no such discontinuity, but this intuition seems to be resting on nothing whatsoever.

• The identity of an object is a choice, a way of looking at it. The “right” way of making this choice is the way that best achieves your values.

I think that’s really the central point. The metaphysical principles which either allow or deny the “intrinsic philosophical risk” mentioned in the OP are not like theorems or natural laws, which we might hope some day to corroborate or refute—they’re more like definitions that a person either adopts or does not.

I don’t see either as irrational

I have to part company here—I think it is irrational to attach ‘terminal value’ to your biological substrate (likewise paperclips), though it’s difficult to explain exactly why. Terminal values are inherently irrational, but valuing the continuance of your thought patterns is likely to be instrumentally rational for almost any set of terminal values, whereas placing extra value on your biological substrate seems like it could only make sense as a terminal value (except in a highly artificial setting, e.g. Dr Evil has vowed to do something evil unless you preserve your substrate).

Of course this raises the question of why the deferred irrationality of preserving one’s thoughts in order to do X is better than the immediate irrationality of preserving one’s substrate for its own sake. At this point I don’t have an answer.

• For any particular proposal for mind-uploading, there’s probably a significant risk that it doesn’t work, but I understand that to mean: there’s a risk that what it produces isn’t functionally equivalent to the person uploaded. Not “there’s a risk that when God/Ripley is watching everyone’s viewscreens from the control room, she sees that the uploaded person’s thoughts are on a different screen from the original.”

• If the rules of this game allow one side to introduce a “small intrinsic philosophical risk” attached to mind-uploading, even though it’s impossible in principle to detect whether someone has suffered ‘arbitrary Searlean mind-annihilation’, then surely the other side can postulate a risk of arbitrary mind-annihilation unless we upload ourselves. (Even ignoring the familiar non-Searlean mind-annihilation that awaits us in old age.)

Perhaps a newborn mind has a half-life of only three hours before spontaneously and undetectably annihilating itself.

• Thanks, this is all fascinating stuff.

One small suggestion: if you wanted to, there are ways you could eliminate the phenomenon of ‘last round defection’. One idea would be to randomly generate the number of rounds according to a geometric distribution (the discrete analogue of the exponential). This is equivalent to having, on each round, a small constant probability that this is the last round. To be honest though, the ‘last round’ phenomenon makes things more rather than less interesting.
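A minimal sketch of that equivalence, assuming a per-round stopping probability of 0.1 for illustration (all names here are mine):

```python
import random

def match_length(p_stop, rng):
    """Sample the number of rounds by flipping a 'stop' coin after each
    round.  A constant per-round stopping probability p_stop yields a
    geometric distribution (memoryless, the discrete analogue of the
    exponential), with mean 1 / p_stop."""
    rounds = 1
    while rng.random() >= p_stop:
        rounds += 1
    return rounds

rng = random.Random(0)  # fixed seed so the sketch is reproducible
samples = [match_length(0.1, rng) for _ in range(100_000)]
mean_rounds = sum(samples) / len(samples)  # close to 1 / 0.1 = 10
```

Because the distribution is memoryless, a player who has survived k rounds faces the same expected number of remaining rounds as at the start, so no round is identifiably "last".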

Other ways to spice things up would be: to cause players to make mistakes with small probability (say a 1% chance of defecting when you try to co-operate, and vice versa); or have some probability of misremembering the past.

• Conversely, when we got trolled an unspecified length of time ago, an incompetent crackpot troll who shall remain nameless kept having all his posts and comments upvoted by other trolls.

It would help if there were a restriction on how much karma one could add to or subtract from a single person in a given time, as others are suggesting.

• What interests me about the Boltzmann brain (this is a bit of a tangent) is that it sharply poses the question of where the boundary of a subjective state lies. It doesn’t seem that there’s any part X of your mental state that couldn’t be replaced by a mere “impression of X”. E.g. an impression of having been to a party yesterday rather than a memory of the party. Or an impression that one is aware of two differently-coloured patches rather than the patches themselves together with their colours. Or an impression of ‘difference’ rather than an impression of differently coloured patches.

If we imagine “you” to be a circle drawn with magic marker around a bunch of miscellaneous odds and ends (ideas, memories etc. but perhaps also bits of the ‘outside world’, like the tattoos on the guy in Memento) then there seems to be no limit to how small we can draw the circle—how much of your mental state can be regarded as ‘external’. But if only the ‘interior’ of the circle needs to be instantiated in order to have a copy of ‘you’, it seems like anything, no matter how random, can be regarded as a “Boltzmann brain”.