# Should logical probabilities be updateless too?

(This post doesn’t require much math. It’s very speculative and probably confused.)

Wei Dai came up with a problem that seems equivalent to a variant of Counterfactual Mugging with some added twists:

• the coinflip is “logical”, e.g. the parity of the millionth digit of pi;

• after you receive the offer, you will have enough resources to calculate the coinflip’s outcome yourself;

• but you need to figure out the correct decision algorithm ahead of time, when you don’t have these resources and are still uncertain about the coinflip’s outcome.

If you give 50/50 chances now to the millionth digit of pi being even or odd, you probably want to write the decision algorithm so it agrees to pay up later even when faced with a proof that the millionth digit of pi is even. But from the decision algorithm’s point of view, the situation looks more like being asked to pay up because 2+2=4. How do we resolve this tension?
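From the ahead-of-time standpoint, the case for paying is just an expected-value calculation under the 50/50 logical prior. A minimal sketch in Python, using the $10000/$100 stakes from the bet described later in the post:

```python
# Expected value of the two possible decision algorithms, evaluated
# *before* the millionth digit of pi is known, under a 50/50 prior.
# Stakes: win $10000 if the digit is odd, lose $100 if it is even.

p_odd = 0.5
win, loss = 10000, 100

ev_pay = p_odd * win - (1 - p_odd) * loss  # algorithm that pays up
ev_refuse = 0.0                            # algorithm that never pays

print(ev_pay)     # 4950.0
print(ev_refuse)  # 0.0
```

So a 50/50 prior favors committing to pay, even though, once the digit is actually computed, one of the two branches will look exactly like paying up because 2+2=4.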

One of the main selling points of TDT-style decision theories is eliminating the need for precommitment. You’re supposed to always do what you would have precommitted to doing, even if it doesn’t seem like a very good idea after you’ve done your Bayesian updates. UDT solves Counterfactual Mugging and similar problems by being “updateless”, so you keep caring about possible worlds in accordance with their a priori probabilities regardless of which world you end up in.

If we take the above problem at face value, it seems to tell us that UDT should treat logical uncertainty updatelessly too, and keep caring about logically impossible worlds in accordance with their a priori logical probabilities. It seems to hint that UDT should be coded from the start with a “logical prior” over mathematical statements, which encodes the creator’s arbitrary “logical degrees of caring”, just like its regular prior encodes the creator’s arbitrary degrees of caring over physics. Then the AI must keep following that prior forever after. But that’s a very tall order. Should you really keep caring about logically impossible worlds where 2+2=5, and accept bargains that help copies of you in such worlds, even after you calculate that 2+2=4?

That conclusion is pretty startling, but consider what happens if you reject it:

1. Precommitment can be modeled as a decision problem where an AI is asked to write a successor AI.

2. Imagine the AI is asked to write a program P that will be faced with Counterfactual Mugging with a logical coin. The AI doesn’t have enough resources to calculate the coin’s outcome, but P will have as much computing power as needed. The resulting utility goes to the AI.

3. Writing P is equivalent to supplying one bit: should P pay up if asked?

4. Supplying that bit is equivalent to accepting or refusing the bet “win $10000 if the millionth digit of pi is odd, lose $100 if it’s even”.

So if your AI treats logical uncertainty similarly enough to probabilities that it can make bets on digits of pi, reflective consistency seems to force it to have an unchanging “logical prior”, and keep paying up in Counterfactual Mugging even when the logical coinflip looks as obvious to the AI as 2+2=4. Is there any way to escape this conclusion? (Nesov has an idea, but I can’t parse it yet.) And what could a formalization of “logical priors” possibly look like?

• In principle, I see no reason to treat logical probability differently from other probability—if this can be done consistently.

Say there were some empirical fact about the world, concealed inside a safe. And say you can crack the safe’s combination as soon as you can settle a certain logical fact. Then “ignorance about the contents of the safe”, a very standard type of ignorance, feels exactly like “ignorance about the logical fact” in this case.

I think you can generally transform one type of uncertainty into another in a way that leaves the intuitions virtually identical.

• This is not really what the problem discussed in this post is about. Given a setting where there are many possible worlds for all kinds of alternative observations, we have three basic kinds of uncertainty: logical uncertainty, uncertainty about the joint state of all possible worlds (“state uncertainty”), and uncertainty about location within the collection of these possible worlds (indexical uncertainty). If there are enough possible worlds in our setting, then most observations of the kind “Is this box empty?” cash out as indexical uncertainty: in some possible worlds it’s empty, and in others it’s not, so the only question is about which worlds it’s empty in, a question of finding the locations within the overall collection that fit the query.

Of these, logical uncertainty is closer to state uncertainty than to indexical uncertainty: if you figure out some abstract fact, that may also tell you what all possible (non-broken) calculators will say, but some of the boxes will be full, and some will be empty. Of course, there is no clear dividing line; it’s the structure of the collection of your possible worlds and the prior over it that tells you which observations are more like calculators (related to abstract facts), and which are more like boxes (unrelated to abstract facts, mostly only telling you which possible worlds you observe).

(UDT’s secret weapon is that it reduces all observations to indexical uncertainty: it completely ignores their epistemic significance (interpretation as abstract facts), and instead relies on its own “protected” inference capacity to resolve decision problems that are set up across its collection of possible worlds in arbitrarily bizarre fashion. But when it starts relying on observations, it has to be more clever than that.)

Now, you are talking about how logical uncertainty is similar to state uncertainty, which I mostly agree with, while the problem under discussion is that logical uncertainty seems to be unlike indexical uncertainty, in particular for the purposes of applying UDT-like reasoning.

• I was under the impression that there were clear examples where logical uncertainty was different from regular uncertainty. I can’t think of any, though, so perhaps I’m misremembering. I would be very interested in the solution to this.

• Logical uncertainty is still a more subtle beastie methinks—but for the examples given here, I think it should be treated like normal uncertainty.

• I think CM with a logical coin is not well-defined. Say Omega determines whether or not the millionth digit of pi is even. If it’s even, you verify this and then Omega asks you to pay $1000; if it’s odd, Omega gives you $1000000 iff you would have paid Omega had the millionth digit of pi been even. But the counterfactual “would you have paid Omega had the millionth digit of pi been even and you verified this” is undefined if the digit is in fact odd, since you would have realized that it is odd during verification. If you don’t actually verify it, then the problem is well-defined, because Omega can just lie to you. I guess you could ask the counterfactual “what if your digit verification procedure malfunctioned and said the digit was even”, but now we’re getting into doubting your own mental faculties.

• Perhaps I am missing the obvious, but why is this a hard problem? So our protagonist AI has some algorithm to determine if the millionth digit of pi is odd; he cannot run it yet, but he has it. Let’s call that function f(), which returns 1 if the digit is odd, or 0 if it is even. He also has some other function like:

```perl
sub pay_or_no {
    if (f()) {
        pay(1000);
    }
}
```

In this fashion, Omega can verify the algorithm that returns the millionth digit of pi, independently verify the algorithm that pays based on that return, and our protagonist gets his money.
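A runnable sketch of the same scheme, translated from the Perl-ish pseudocode into Python (the names `f`, `pay_or_no`, and `pay` follow the comment above; the coin computation is stubbed out with a hardcoded value, since actually computing the millionth digit is beside the point):

```python
def f():
    """Stub for the algorithm the AI owns but cannot yet afford to run:
    returns 1 if the millionth digit of pi is odd, 0 if it is even.
    Hardcoded here purely for illustration."""
    return 1

payments = []

def pay(amount):
    payments.append(amount)

def pay_or_no():
    """The committed policy: pay iff the logical coin comes up odd.
    Omega can verify this policy by reading its source, without either
    party having computed the digit in advance."""
    if f():
        pay(1000)

pay_or_no()
print(payments)  # [1000] with this stub's coin value
```

The point of the scheme is that `pay_or_no` is a pure function of the not-yet-computed coin, so its behavior in either outcome is fixed and inspectable before anyone runs `f`.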

• !!!!

This seems to be the correct answer to jacobt’s question. The key is looking at the length of proofs. The general rule should go like this: when you’re trying to decide which of two impossible counterfactuals “a()=x implies b()=y” and “a()=x implies b()=z” is more true even though “a()=x” is false, go with the one that has the shorter proof. We already use that rule when implementing agents that compute counterfactuals about their own actions. Now we can just implement Omega using the same rule. If the millionth digit of pi is in fact odd, but the statement “millionth digit of pi is even ⇒ agent pays up” has a much shorter proof than “millionth digit of pi is even ⇒ agent doesn’t pay up”, Omega should think that the agent would pay up.

The idea seems so obvious in retrospect, I don’t understand how I missed it. Thanks!
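A toy illustration of the shorter-proof rule, with proof search modeled as breadth-first search over a made-up set of one-step inferences (the statements and inference edges below are invented purely for illustration; nothing here is a real proof system):

```python
from collections import deque

def shortest_proof_len(rules, start, goal):
    """Length of the shortest derivation of `goal` from `start`, where
    rules[s] lists statements derivable from s in one step.
    Returns None if no derivation exists."""
    seen, frontier = {start}, deque([(start, 0)])
    while frontier:
        stmt, steps = frontier.popleft()
        if stmt == goal:
            return steps
        for nxt in rules.get(stmt, []):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, steps + 1))
    return None

# Invented one-step inferences from the (false) premise "digit is even".
rules = {
    "digit is even": ["agent's check says even", "lemma 1"],
    "agent's check says even": ["agent pays up"],
    "lemma 1": ["lemma 2"],
    "lemma 2": ["lemma 3"],
    "lemma 3": ["agent doesn't pay up"],
}

pays = shortest_proof_len(rules, "digit is even", "agent pays up")            # 2
refuses = shortest_proof_len(rules, "digit is even", "agent doesn't pay up")  # 4
print(pays < refuses)  # True: Omega models the agent as paying up
```

Both counterfactual conclusions are derivable from the impossible premise; the rule just says Omega goes with the one whose derivation is shorter.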

• If the millionth digit of pi is in fact odd, but the statement “millionth digit of pi is even ⇒ agent pays up” has a much shorter proof than “millionth digit of pi is even ⇒ agent doesn’t pay up”, Omega should think that the agent would pay up.

This seems equivalent to:

has a much shorter proof than “millionth digit of pi is odd”

But does that make sense? What if it were possible to have really short proofs of whether the n-th digit of pi is even or odd, and it’s impossible for the agent to arrange to have a shorter proof of “millionth digit of pi is even ⇒ agent pays up”? Why should the agent be penalized for that?

• Maybe the whole point of a logical coinflip is that it’s harder to prove than simple statements about the agent. If the coinflip were simple compared to the agent, like “1!=1”, then a CDT agent would not have precommitted to cooperate, because the agent would have figured out in advance that 1=1. So it’s not clear that a UDT agent should cooperate either.

• I agree, this seems like a reasonable way of defining dependencies between constant symbols. In the case of logical uncertainty, I think you’d want to look into how relative lengths of proofs depend on adding more theorems as axioms (so that they don’t cost any proof length to use). This way, different agents, or an agent in different situations, would have different ideas about which dependencies are natural.

This goes all the way back to trying to define dependencies by analogy with AIXI/K-complexity; I think we were talking about this on the list in spring 2011.

• Good point, thanks. You’re right that even-world looks just as impossible from odd-world’s POV as odd-world looks from even-world, so Omega also needs to compute impossible counterfactuals when deciding whether to give you the million. The challenge of solving the problem now looks very similar to the challenge of formulating the problem in the first place :-)

• I pointed out the same issue before, but it doesn’t seem to affect my bargaining problem.

• Why not? It seems to me that to determine that the staple maximizer’s offer is fair, you need to look at the staple maximizer’s assessment of you in the impossible world where it gets control. That’s very similar to looking at Omega’s assessment of you in the impossible world where it’s deciding whether to give you the million. Or maybe I’m wrong, all this recursion is making me confused...

• What I meant is, in my version of the problem, you don’t have to solve the problem (say what Omega does exactly) in order to formulate the problem, since “the staple maximizer’s assessment of you in the impossible world where it gets control” is part of the solution, not part of the problem specification.

• It seems to hint that UDT should be coded from the start with a “logical prior” over mathematical statements, which encodes the creator’s arbitrary “logical degrees of caring”, just like its regular prior encodes the creator’s arbitrary degrees of caring over physics.

This seems like a type error. Statements are references to collections of structures; they are not themselves the kind of stuff an agent should care about, they are elements of the agent’s map, not territory. Caring about physics generalizes to caring about abstract structures, not to caring about statements that describe abstract structures. So there might be some notion of a prior over abstract structures, possibly expressed in terms of logical statements about them, but it wouldn’t be a prior over the statements.

• “Normal” priors are about comparative value of worlds, with observations only resolving indexical uncertainty about your location among these worlds. In UDT, there is typically an assumption that an agent has excessive computational resources, and so the only purpose of observations is in resolving this indexical uncertainty. A UDT agent is working with a fixed collection of possible worlds, and it doesn’t learn anything about these worlds from observation. It devises a general strategy that is evaluated by looking at how it fares at all locations that use it, across the fixed collection of possible worlds.

In contrast, logical uncertainty is not about location within the collection of possible worlds; it’s about the state of those worlds, or even about the presence of specific worlds in the collection. The value of any given strategy that responds to observations would then depend on the state of logical uncertainty, and so evaluating a strategy is not as simple as taking the current epistemic state’s point of view.

A new possibility opens: some observations can communicate not just indexical information, but also logical information (alternatively, information about the state of the collection of possible worlds, not just location in the worlds of the collection). This possibility calls for something analogous to anthropic reasoning: the fact that an agent observes something tells it something about the big world, not just about which small world it’s located in. Another analogy is value uncertainty: resolving logical uncertainty essentially resolves uncertainty about the agent’s utility definition (and this is another way of generating thought experiments about this issue).

So when an agent is on a branch of a strategy that indicates something new about the collection of possible worlds, the agent would evaluate the whole strategy differently from when it started out. But when it started out, it could also predict how the expected value of the strategy would look given that hypothetical observation, and also given the alternative hypothetical observations. How does it balance these possible points of view? I don’t know, but this is a new problem that breaks UDT’s assumptions, and at least for this puzzle the answer seems to be “don’t pay up”.

• Our set of possible worlds comes from somewhere, some sort of criteria. Whatever generates that list passes it to our choice algorithm, which begins branching. Let’s say we receive an observation that contains both logical and indexical updates: could we not just take our current set of possible worlds, with our current set of data on them, update the list against our logical update, and pass that list on to a new copy of the function? The collection remains fixed as far as each copy of the function is concerned, but retains the ability to update on new information. When finished, the path returned will be the most likely given all new observations.

• I don’t think there’s any particular reason that being updateless should be useful in general—it’s just that “find the winning strategy, and then do that” is equivalent to being updateless, and works really well on most problems.

In fact, I’d expect a computation-aware decision theory to not say “find the winning strategy, and then do that,” because it couldn’t always expect to do that except in the limit of infinite computing time (even if the problem is guaranteed solvable)—and that limit would also eliminate logical uncertainty.

• (Sorry—I’ve just stumbled on this thread and I’m not sure whether I should post this comment here or on Wei Dai’s original post.)

It seems to me that the right thing for clippy to do is choose 10^20 staples instead of 10^10 paperclips, but not because of anything to do with logical uncertainty.

There are presumably a vast number of parallel (and/or non-logically-counterfactual) universes where games of this nature are being played out. In almost exactly half of them, the roles of clippy and staply will be reversed. A UDT clippy will happily cooperate in this case, knowing that the staplies in parallel universes will do the same and generate lots of paperclips in the process.

This would still be the case if there’s no pi calculation involved, and Omega just flies to a planet, finds the first two utility maximizers with orthogonal utility functions and offers one of them a choice of 10^10 units of utility (confusion note: according to what scale???) or 10^20 units of the other agent’s utility.

1. A paperclip maximizer seems to have an inherent advantage against a paperclip minimizer (i.e. eventually Omega is unable to destroy any more paperclips). But what if you were a (paperclip minus staple) maximizer? Any win in one universe is going to be exactly canceled out by an analogous win for a (staple minus paperclip) maximizer in another universe, if we assume that those will be spawned with equal probability.

2. My argument only works because the concept of paperclips seems non-entangled with the concept of the millionth digit of pi. What if you replace “paperclip” with “piece of paper saying the millionth digit of pi is odd”?

• What do you think of this thread, in particular the part quoted in the last comment?

• The quote was:

I think it’s right to cooperate in this thought experiment only to the extent that we accept the impossibility of isolating this thought experiment from its other possible instances

I agree with one direction of implication—if we accept the impossibility of isolating this thought experiment from its other possible instances (e.g. cases with the exact same wording but with paperclips and staples swapped), then it’s right to cooperate.

If we don’t accept that, then I will admit to being confused and have nothing meaningful to say either way. I accept the “least convenient world” principle for when someone suggests a really bizarre thought experiment, but I’m having trouble with the concept of “least convenient set of possible counterfactual worlds”. Is this concept worth exploring in its own right?

• I get the impression that your models of decision theory implicitly assume that agents have logical omniscience. Logical uncertainty contradicts that, so any theory with both will end up confused and shuffling round an inconsistency.

I think to solve this you’re going to have to explicitly model an agent with bounded computing power.

• This problem is even trickier: you have to explicitly model an agent with unlimited computing power that momentarily adopts the preferences of a bounded agent.

• Question: is there a proof that strategies that follow “always do what you would have precommitted to doing” always dominate strategies that do not, in some sense? Or perhaps that “adding a precommitment stage always improves utility”?

• No—if the universe punishes you for behaving some way (e.g. adding a precommitment stage), then doing that is dominated. There are no dominant strategies against all possible states of the universe.

• Fair enough, but perhaps there are proofs that it dominates unless you are punished specifically for doing it?

How sure are people that “do what you would have precommitted to doing” is a good strategy? Wanting to build it into the decision theory seems to suggest very high certainty.

• Well, it seems obvious that it’s true—but tricky to formalise. Subtle problems like agent-simulates-predictor (when you know more than Omega) and maybe some diagonal agents (who apply diagonal reasoning to you) seem to be relatively believable situations. It’s a bit like Gödel’s theorem—initially, the only examples were weird and specifically constructed, but then people found more natural examples.

But “do what you would have precommitted to doing” seems to be much better than other strategies, even if it’s not provably ideal.

• I posted this article to the decision theory group a moment ago. It seems highly relevant to thinking concretely about logical uncertainty in the context of decision theory, and provides what looks to be a reasonable metric for evaluating the value of computationally useful information.

ETA: plus there is an interesting tie-in to cognitive heuristics/biases.

• Fixed

• I wonder if the downvote I see here would have been an upvote if you used ‘request’ instead of ‘nitpick’ (not that I mind either way).

which encodes the creator’s arbitrary “logical degrees of caring”, just like its regular prior encodes the creator’s arbitrary degrees of caring over physics

(Has there been any work on moving from an arbitrary point of timestamping to something that obeys something like dynamic consistency requirements? One could think up decision problems where two UDTs would have coordinated except their moment of caring-encoding was arbitrarily single-pointed in spacetime, then try to use such examples to motivate generalized principles or notions of consistency. That’s a different way of advancing UDT that seems somewhat orthogonal to the focus on self-reference and logical uncertainty. (The intuition being, of course, that arbitrariness is inelegant, and if you see it that’s a sign you need to go meta.) Maybe Nesov’s desire to focus more on processes and pieces rather than agents will naturally tend in this direction.)

• Um, aren’t timestamped preferences already dynamically consistent?

• But from the decision algorithm’s point of view, the situation looks more like being asked to pay up because 2+2=4. How do we resolve this tension?

If probability is all in the mind, how is this different? What is the difference between eventually calculating an unknown digit and simply waiting for the world to determine the outcome of a coin toss? I don’t see any difference at all.

• Logical uncertainty is weird because it doesn’t exactly obey the rules of probability. You can’t have a consistent probability assignment that says axioms are 100% true but the millionth digit of pi has a 50% chance of being odd. So I won’t be very surprised if the correct way to treat logical uncertainty turns out to be not completely Bayesian.

• You can’t have a consistent probability assignment that says axioms are 100% true but the millionth digit of pi has a 50% chance of being odd.

Why not? If you haven’t actually worked out the millionth digit of pi, then this probability assignment is consistent given your current state of knowledge. It’s inconsistent given logical omniscience, but then if you were logically omniscient you wouldn’t assign a 50% chance in the first place. The act of observing a logical fact doesn’t seem different from the act of making any other observation to me.

If you knew that the universe doesn’t obey Newtonian mechanics precisely, it would be inconsistent to assign them a high probability, but that doesn’t mean that an early physicist who doesn’t have that knowledge is violating the rules of probability by thinking that the universe does follow Newtonian mechanics. It’s only after you make that observation that such an assignment becomes inconsistent.
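The “observing a logical fact” point can be made concrete. Before running the computation below, you might assign 50% to either parity of some far-out digit of pi; after running it, that uncertainty is simply gone. A sketch using Machin’s formula with scaled integer arithmetic (the choice of the 100th digit is arbitrary, just small enough to be fast; the millionth digit works the same way, only slower):

```python
def pi_digits(n):
    """First n decimal digits of pi after the point, as a string, via
    Machin's formula pi = 16*atan(1/5) - 4*atan(1/239), computed with
    scaled integer arithmetic."""
    scale = 10 ** (n + 10)  # 10 guard digits

    def atan_inv(x):
        # atan(1/x) = 1/x - 1/(3x^3) + 1/(5x^5) - ..., scaled by `scale`
        power, total, d, sign = scale // x, 0, 1, 1
        while power:
            total += sign * (power // d)
            power //= x * x
            d += 2
            sign = -sign
        return total

    pi_scaled = 16 * atan_inv(5) - 4 * atan_inv(239)
    return str(pi_scaled // 10 ** 10)[1:]  # drop the leading "3"

digits = pi_digits(100)
print(digits[:10])           # 1415926535
print(int(digits[99]) % 2)   # parity of the 100th digit
```

Before the last line runs, 50% for “odd” is a perfectly serviceable credence; afterwards it would be inconsistent, exactly like the Newtonian-mechanics case after the decisive observation.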

• Actually this strikes me as a special case of dealing with the fact that your own decision process is imperfect.

• I think that in the Counterfactual Mugging with a logical coin, unlike CM with a quantum coin, it is incorrect to abstract away from Omega’s internal algorithm. In the CM with a quantum coin, the coin toss screens off Omega’s causal influence. With the logical coin it is not so.

• Good point, but what if you care about only a single world program where Omega is hardcoded to use the millionth digit of pi as the coinflip, and you have logical uncertainty about that digit?

• Then I’ll have to know (or have a probability distribution over) where the world program comes from. If it was created by Omega-2, then I’m back at square 1. If the world program is laws of physics, then I suppose logical uncertainty is equivalent to logical probability, like in a regular (non-quantum) coin toss. But then, the CM problem is a very poor model of typical laws of physics. And, with laws of physics, you’ll never need to accept bargains conditioned on 2+2=5...

• But you might need to accept bargains conditioned on more complicated logical facts, and the bargains may involve future versions of you who will find these logical facts as trivial as 2+2=4.

If we want decision theory to answer the question “what kind of AI should I write?”, then using “logical priors” is very likely to be the right answer. But Wei has a different way of looking at the problem which seems to make it much harder: he asks “what decision theory should I be following?”

• I think I need a different problem as an intuition pump for this. I can’t reformulate the CM problem satisfactorily. It all comes back to Omega’s original motivation. Either it was “fair”, and so equivalent to a quantum coin toss at some point in the causal chain, or not. If it was fair, then it’s equivalent to regular CM, so I should accept the bargain and not “update” later, even for 2+2=5. If not, then it all depends on what it is...

• Quantum juju has nothing to do with decision theory. I guess I should have included it in the post about common mistakes. What would you say about this problem if you lived in a deterministic universe and had never heard about quantum physics? You know that boundedly rational agents in such universes can observe things that look awfully similar to random noise, right?

• Well, tossing a quantum coin is a simple way to provably sever causal links. In a deterministic universe with boundedly rational agents, I suppose, there could be cryptographic schemes producing equivalent guarantees.

• What if I reformulate as follows: Omega says that it tossed a coin and so chose to check either the oddness or the evenness of the millionth digit of pi. The coin indicated “oddness”, so bla bla bla.

The properties of the problem appear the same as the logical-coin CM, except now the possible causal influence from Omega is severed.

• If I’m Omega, and I decide to check whether the 10^10th digit of pi is 0, 2 or 5, and reward you if it is... how would you feel about that? I chose those numbers because we have ten fingers, and I chose reward because “e” is the 5th letter in the alphabet (I went through the letters of “reward” and “punish” until I found one that was the 10th (J), 5th (E) or 2nd (B) letter).

Or a second variant: I implement the logical-coin CM that can be described in python in the most compact way.

• If it’s true that you chose the numbers because we have ten fingers (and because of nothing else), and I can verify that, then I feel I should behave as if the event is random with probability 0.3, even if it was the 10th digit of pi, not the 10^10th.

• Yep—welcome to logical uncertainty!

• I never had anything against logical uncertainty :)

The point, though, is that this setup—where I can verify Omega’s honest attempt at randomness—does not produce the paradoxes. In particular, it does not allow someone to pump money out of me. And so it seems to me that I can and should “keep paying up in Counterfactual Mugging even when the logical coinflip looks as obvious as 2+2=4.”