# Counterfactual Mugging

Imagine that one day, Omega comes to you and says that it has just tossed a fair coin, and given that the coin came up tails, it decided to ask you to give it \$100. Whatever you do in this situation, nothing else will happen differently in reality as a result. Naturally you don’t want to give up your \$100. But see, Omega tells you that if the coin came up heads instead of tails, it’d give you \$10000, but only if you’d agree to give it \$100 if the coin came up tails.

Omega can predict your decision in case it asked you to give it \$100: even if that hasn’t actually happened, it can compute the counterfactual truth. Omega is also known to be absolutely honest and trustworthy, no word-twisting, so the facts are really as it says: it really tossed a coin, and it really would’ve given you \$10000.

From your current position, it seems absurd to give up your \$100. Nothing good happens if you do that: the coin has already landed tails up, and you’ll never see the counterfactual \$10000. But look at this situation from your point of view before Omega tossed the coin. There, you have two possible branches ahead of you, of equal probability. On one branch, you are asked to part with \$100, and on the other branch, you are conditionally given \$10000. If you decide to keep your \$100, the expected gain from this decision is \$0: there is no exchange of money, you don’t give Omega anything on the first branch, and as a result Omega doesn’t give you anything on the second branch. If you decide to give \$100 on the first branch, then Omega gives you \$10000 on the second branch, so the expected gain from this decision is

-\$100 * 0.5 + \$10000 * 0.5 = \$4950

So, this straightforward calculation tells you that you ought to give up your \$100. It looks like a good idea before the coin toss, but it starts to look like a bad idea after the coin has come up tails. Had you known about the deal in advance, one possible course of action would be to set up a precommitment: you contract a third party, agreeing that you’ll lose \$1000 if you don’t give \$100 to Omega, in case it asks for it. In this case, you leave yourself no other choice.
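The expected-value comparison above can be checked with a few lines of Python (a minimal sketch; the payoffs and the fair-coin probabilities are exactly those stated in the post):

```python
# Expected gain of each disposition, evaluated from before the coin toss.
# A "payer" loses $100 on tails and is rewarded with $10000 on heads;
# a "refuser" neither pays on tails nor receives anything on heads.
P_HEADS = P_TAILS = 0.5

ev_payer = P_TAILS * (-100) + P_HEADS * 10000
ev_refuser = P_TAILS * 0 + P_HEADS * 0

print(ev_payer)    # 4950.0
print(ev_refuser)  # 0.0
```

The \$4950 figure is just this first expectation, computed from the pre-toss vantage point.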

But in this game, explicit precommitment is not an option: you didn’t know about Omega’s little game until the coin was already tossed and the outcome of the toss was given to you. The only thing that stands between Omega and your \$100 is your ritual of cognition. And so I ask you all: is the decision to give up \$100 when you have no real benefit from it, only counterfactual benefit, an example of winning?

P.S. Let’s assume that the coin is deterministic, and that in the overwhelming measure of the MWI worlds it gives the same outcome. You don’t care about the fraction that sees a different result; in all reality the result is that Omega won’t even consider giving you \$10000, it only asks for your \$100. Also, the deal is unique; you won’t see Omega ever again.

• Yes, I think you still owe him the \$100.

But I like how you made it into a relatively realistic scenario.

• Considering the ticket was worth \$5,000 when he bought it, sure.

• Did you give the same answer to Omega? The cases are exactly analogous. (Or do you argue that they are not?)

• The disanalogy here is that you have a long-term social relationship with Bob that you don’t have with Omega, and the \$100 is an investment in that relationship.

• Also, there is the possibility of future scenarios arising in which Bob could choose to take comparable actions, and we want to encourage him in doing so. I agree that the cases are not exactly analogous.

• The outcomes don’t seem to be tied together as they were in the original problem: is it true that, had he won, he would have given you the money only if, had he not won, you would have given him the \$100 back? That isn’t clear.

• The counterfactual anti-mugging: One day No-mega appears. No-mega is completely trustworthy etc. No-mega describes the counterfactual mugging to you, and predicts what you would have done in that situation not having met No-mega, if Omega had asked you for \$100.

If you would have given Omega the \$100, No-mega gives you nothing. If you would not have given Omega \$100, No-mega gives you \$10000. No-mega doesn’t ask you any questions or offer you any choices. Do you get the money? Would an ideal rationalist get the money?

Okay, next scenario: you have a magic box with a number p inscribed on it. When you open it, either No-mega comes out (probability p) and performs a counterfactual anti-mugging, or Omega comes out (probability 1−p), flips a fair coin, and proceeds to either ask for \$100, give you \$10000, or give you nothing, as in the counterfactual mugging.

Before you open the box, you have a chance to precommit. What do you do?

• If you would have given Omega the \$100, No-mega gives you nothing. If you would not have given Omega \$100, No-mega gives you \$10000. No-mega doesn’t ask you any questions or offer you any choices. Do you get the money? Would an ideal rationalist get the money?

I would have no actionable suspicion that I should give Omega the \$100 unless I knew about No-mega. So I get the \$10000 only if No-mega asks the question “What would Eliezer do knowing about No-mega?” and not if No-mega asks the question “What would Eliezer do not knowing about No-mega?”

• You forgot about MetaOmega, who gives you \$10,000 if and only if No-mega wouldn’t have given you anything, and O-mega, who kills your family unless you’re an Alphabetic Decision Theorist. This comment doesn’t seem specifically anti-UDT—after all, Omega and No-mega are approximately equally likely to exist; a ratio of 1:1 if not an actual p of .5—but it still has the ring of Just Cheating. Admittedly, I don’t have any formal way of telling the difference between decision problems that feel more or less legitimate, but I think part of the answer might be that the Counterfactual Mugging isn’t really about how to act around superintelligences: it illustrates a more general need to condition our decisions on counterfactuals, and as EY pointed out, UDT still wins the No-mega problem if you know about No-mega, so whether or not we should subscribe to some decision theory isn’t all that dependent on which superintelligences we encounter.

I’m necroing pretty hard and might be assuming too much about what Caspian originally meant, so the above is more me working this out for myself than anything else. But if anyone can explain why the No-mega problem feels like cheating to me, that would be appreciated.

• Do you have a point?

• Yes: that there can just as easily be a superintelligence that rewards people predicted to act one way as one that rewards people predicted to act the other. Which precommitment is most rational depends on which type you expect to encounter.

I don’t expect to encounter either, and on the other hand I can’t rule out fallible human analogues of either. So for now I’m not precommitting either way.

• You don’t precommit to “give away the \$100 to anyone who asks”. You precommit to give away the \$100 in exactly the situation I described. Or, generalizing such precommitments, you just compute your decisions on the spot, in a reflectively consistent fashion. If that’s what you want to do with your future self, that is.

• there can just as easily be a superintelligence that rewards people predicted to act one way as one that rewards people predicted to act the other.

Yeah, now. But after Omega really, really appears in front of you, the chance of Omega existing is about 1. The chance of No-mega is still almost nonexistent. In this problem, the existence of Omega is given. It’s not something you are expecting to encounter now, just as we’re not expecting to encounter eccentric Kavkan billionaires who will give you money for toxicating yourself. Kavka’s Toxin and the counterfactual mugging present a scenario that is given, and ask how you would act then.

• But you aren’t supposed to be updating… the essence of UDT, I believe, is that your policy should be set NOW, and NEVER UPDATED.

So… either:

1. You consider the choice of policy based on the prior where you DIDN’T KNOW whether you’d face Nomega or Omega, and NEVER UPDATE IT (this seems obviously wrong to me: why are you using your old prior instead of your current posterior?), or

2. You consider the choice of policy based on the prior where you KNOW that you are facing Omega AND that the coin is tails, in which case paying Omega only loses you money.

• It doesn’t prevent doing different actions in different circumstances, though. That’s not what “updateless” means. It means that you should act as your past self would have precommitted to doing in your situation. Your probability estimate for “I see Omega” should be significantly greater than for “I see Omega, and also Nomega is watching and deciding how to act”, so your decision should be mostly determined by Omega, not Nomega. (The Metanomega also applies—there’s a roughly equal chance of Metanomega or Nomega waiting and watching. [Metanomega = Nomega reversed; gives the payoff iff it predicts you paying.])

• I see where I went wrong. I assumed that the impact of one’s response to Omega is limited to the number of worlds in which Omega exists. That is, my reasoning is invalid if “what I do in scenario X” is meaningful and affects the world even if scenario X never happens. In other words, when one is being counterfactually modeled, which is exactly the topic of discussion.

• Thanks for pointing that out. The answer is, as expected, a function of p. So I now find explanations of why UDT gets mugged incomplete and misleading.

Here’s my analysis:

The action set is {give, don’t give}, which I’ll identify with {1, 0}. Now, the possible deterministic policies are simply every mapping from {N, O} → {1, 0}, of which there are 4.

We can disregard the policies for which pi(N) = 1, since giving money to Nomega serves no purpose. So we’re left with

pi_give

and

pi_don’t,

which give/don’t, respectively, to Omega.

Now, we can easily compute the expected value, as follows:

r(pi_give(N)) = 0

r(pi_give(O, tails)) = −1

r(pi_don’t(N)) = 10

r(pi_don’t(O)) = 0

So now:

Eg := E_give(r) = 0 * p + .5 * (10 − 1) * (1 − p)

Ed := E_don’t(r) = 10 * p + 0 * (1 − p)

Eg > Ed whenever 4.5 * (1 − p) > 10 * p,

i.e. whenever 4.5 > 14.5 * p,

i.e. whenever 9/29 > p.

So, whether you should precommit to being mugged depends on how likely you are to encounter N vs. O, which is intuitively obvious.
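The break-even point in this analysis can be verified directly (a sketch in the comment’s own units, where 10 stands for \$10,000 and 1 for \$100, and p is the probability of meeting No-mega):

```python
from fractions import Fraction

def ev_give(p):
    # pi_give: pay Omega on tails. No-mega (prob p) then gives nothing;
    # Omega (prob 1 - p) yields 0.5 * (10 - 1) on average.
    return p * 0 + (1 - p) * Fraction(1, 2) * (10 - 1)

def ev_dont(p):
    # pi_don't: refuse Omega. No-mega rewards this disposition with 10;
    # Omega's branch then pays nothing.
    return p * 10 + (1 - p) * 0

# Break-even: 4.5 * (1 - p) = 10 * p, i.e. p = 9/29.
p_star = Fraction(9, 29)
print(ev_give(p_star) == ev_dont(p_star))  # True
print(ev_give(Fraction(1, 10)) > ev_dont(Fraction(1, 10)))  # True: pay when p < 9/29
```

Using exact rationals avoids any floating-point fuzz around the 9/29 threshold.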

• Philosopher Kenny Easwaran reported in 2007 that:

Josh von Korff, a physics grad student here at Berkeley, has been thinking about versions of Newcomb’s problem. He shared my general intuition that one should choose only one box in the standard version of Newcomb’s problem, but that one should smoke in the smoking lesion example. However, he took this intuition seriously enough that he was able to come up with a decision-theoretic protocol that actually seems to make these recommendations. It ends up making some other really strange predictions, but it seems interesting to consider, and also ends up resembling something Kantian!

The basic idea is that right now, I should plan all my future decisions in such a way that they maximize my expected utility right now, and stick to those decisions. In some sense, this policy obviously has the highest expectation overall, because of how it’s designed.

Korff also reinvents counterfactual mugging:

Here’s another situation that Josh described that started to make things seem a little more weird. In Ancient Greece, while wandering on the road, every day one either encounters a beggar or a god. If one encounters a beggar, then one can choose either to give the beggar a penny or not. But if one encounters a god, then the god will give one a gold coin iff, had there been a beggar instead, one would have given a penny. On encountering a beggar, it now seems intuitive that (speaking only out of self-interest) one shouldn’t give the penny. But (assuming that gods and beggars are randomly encountered with some middling probability distribution) the decision protocol outlined above recommends giving the penny anyway.

In a sense, what’s happening here is that I’m giving the penny in the actual world, so that my closest counterpart who runs into a god will receive a gold coin. It seems very odd to behave like this, but from the point of view before I know whether or not I’ll encounter a god, this seems to be the best overall plan. But as Josh points out, if this was the only way people got food, then people would see that the generous were doing well, and generosity would spread quickly.

And he looks into generalizing to the algorithmic version:

If we now imagine a multi-agent situation, we can get even stronger (and perhaps stranger) results. If two agents are playing in a prisoner’s dilemma, and they have common knowledge that they are both following this decision protocol, then it looks like they should both cooperate. In general, if this decision protocol is somehow constitutive of rationality, then rational agents should always act according to a maxim that they can intend (consistently with their goals) to be followed by all rational agents. To get either of these conclusions, one has to condition one’s expectations on the proposition that other agents following this procedure will arrive at the same choices.

Korff is now an Asst. Prof. at Georgia State.

• In Ancient Greece, while wandering on the road, every day one either encounters a beggar or a god.

If it’s an iterated game, then the decision to pay is a lot less unintuitive.

• My two bits: Omega’s request is unreasonable.

Precommitting is something that you can only do before the coin is flipped. That’s what the “pre” means. Omega’s game rewards a precommitment, but Omega is asking for a commitment.

Precommitting is a rational thing to do because before the coin toss, the result is unknown and unknowable, even by Omega (I assume that’s what “fair coin” means). This is a completely different course of action from committing after the coin toss is known! The utility computation for precommitment is not and should not be the same as the one for commitment.

In the example, you have access to information that pre-you doesn’t (the outcome of the flip). If rationalists are supposed to update on new information, then it is irrational for you to behave like pre-you.

• Precommitting is something that you can only do before the coin is flipped. That’s what the “pre” means. Omega’s game rewards a precommitment, but Omega is asking for a commitment.

Precommitment does make one-boxing on Newcomblike problems a whole lot easier. But it isn’t necessarily required. That’s why Vladimir made an effort to exclude precommitment.

In the example, you have access to information that pre-you doesn’t (the outcome of the flip). If rationalists are supposed to update on new information, then it is irrational for you to behave like pre-you.

I don’t agree. I suggest that pre-you has exactly the same information that you have. The pre-you must be considered to have been given exactly the same inputs as you, to the extent that they influence the decision. That is implied by Omega’s ability to make the accurate prediction that we have been assured he made.

• By definition, pre-you only has access to the coin’s probability distribution, while you have access to the result of the coin flip. Surely you don’t mean to say that’s the same thing?

From the perspective of a non-superintelligence, Omega’s prediction abilities are indistinguishable from magic. Human beings can’t tell what they “imply.” Trying to figure out the implications with a primate brain will only get you into a paradox like claiming a fact is the same as a probability distribution. All we can reasonably do is stipulate the abilities Omega needs to make the problem work and no further.

• We’re assuming Omega is trustworthy? I’d give it the \$100, of course.

• Had the coin come up differently, Omega might have explained the secrets of friendly artificial general intelligence. However, he now asks that you murder 15 people.

Omega remains completely trustworthy, if a bit sick.

• Ha, I’ll re-raise: Had the coin come up differently, Omega would have filled ten Hubble volumes with CEV-output. However, he now asks that you blow up this Hubble volume.

(Not only do you blow up the universe (ending humanity for eternity), you’re glad that Omega showed up to offer this transparently excellent deal. Morbid, ne?)

• Ouch.

• For some reason, raising the stakes in these hypotheticals to the point of actual pain has become reflex for me. I’m not sure if it’s to help train my emotions to be able to make the right choices in horrible circumstances, or just my years in the Bardic Conspiracy looking for an outlet.

• Raising the stakes in this way does not work, because of the issue described in Ethical Injunctions: it is less likely that Omega has presented you with this choice than that you have gone insane.

• So imagine yourself in the most inconvenient possible world, where Omega is a known feature of the environment and has long been seen to follow through on promises of this type; it does not particularly occur to you or anyone that believing this fact makes you insane.

When I phrase it that way—imagine myself in a world full of other people confronted by similar Omega-induced dilemmas—I suddenly find that I feel substantially less uncomfortable; indicating that some of what I thought was pure ethical constraint is actually social ethical constraint. Still, it may function to the same self-protective effect as ethical constraint.

• To add to the comments below: if you’re going to take this route, you might as well have already decided that encountering Omega at all is less likely than that you have gone insane.

• That may be true, but it’s still a dodge. Conditional on not being insane, what’s your answer?

Additionally, I don’t see why Omega asking you for 100 dollars vs. 15 human lives necessarily crosses the threshold of “more likely that I’m just a nutbar”. I don’t expect to talk to Omega anytime soon...

• We’re assuming Omega is trustworthy? I’d murder 15 people, of course.

I’ll note that the assumption that I trust Omega up to stakes this high is a big one. I imagine that the alterations being done to my brain in the counterfactualisation process would have rather widespread implications for many of my thought processes and beliefs once I had time to process it.

• I’ll note that the assumption that I trust Omega up to stakes this high is a big one

Completely agreed; a major problem in any realistic application of such scenarios.

I imagine that the alterations being done to my brain in the counterfactualisation process would have rather widespread implications for many of my thought processes and beliefs once I had time to process it.

I’m afraid I don’t follow.

• Can you please explain the reasoning behind this? Given all of the restrictions mentioned (no iterations, no possible benefit to this self) I can’t see any reason to part with my hard-earned cash. My “gut” says “Hell no!” but I’m curious to see if I’m missing something.

• There are various intuition pumps to explain the answer.

The simplest is to imagine that a moment from now, Omega walks up to you and says “I’m sorry, I would have given you \$10000, except I simulated what would happen if I asked you for \$100 and you refused”. In that case, you would certainly wish you had been the sort of person to give up the \$100.

Which means that right now, with both scenarios equally probable, you should want to be the sort of person who will give up the \$100, since if you are that sort of person, there’s half a chance you’ll get \$10000.

If you want to be the sort of person who’ll do X given Y, then when Y turns up, you’d better bloody well do X.

• If you want to be the sort of person who’ll do X given Y, then when Y turns up, you’d better bloody well do X.

Well said. That’s a lot of the motivation behind my choice of decision theory in a nutshell.

• Thanks, it’s good to know I’m on the right track =)

I think this core insight is one of the clearest changes in my thought process since starting to read OB/LW—I can’t imagine myself leaping to “well, I’d hand him \$100, of course” a couple of years ago.

• If you want to be the sort of person who’ll do X given Y, then when Y turns up, you’d better bloody well do X.

I think this describes one of the core principles of virtue theory under any ethical system.

I wonder how much it depends upon accidents of human psychology, like our tendency to form habits, and how much of it is definitional (if you don’t X when Y, then you’re simply not the sort of person who Xes when Y).

• That’s not the situation in question. The scenario laid out by Vladimir_Nesov does not allow for an equal probability of getting \$10000 and paying \$100. Omega has already flipped the coin, and it’s already been decided that I’m on the “losing” side. Join that with the fact that my giving \$100 now does not increase the chance of my getting \$10000 in the future, because there is no repetition.

Perhaps there’s something fundamental I’m missing here, but the linearity of events seems pretty clear. If Omega really did calculate that I would give him the \$100, then either he miscalculated, or this situation cannot actually occur.

-- EDIT --

There is a third possibility after reading Cameron’s reply… If Omega is correct and honest, then I am indeed going to give up the money.

But it’s a bit of a trick question, isn’t it? I’m going to give up the money because Omega says I’m going to give up the money, and everything Omega says is gospel truth. However, if Omega hadn’t said that I would give up the money, then I wouldn’t have given up the money. Which makes this a bit of an impossible situation.

Assuming the existence of Omega, his intelligence, and his honesty, this scenario is an impossibility.

• I feel like a man in an Escher painting, with all these recursive hypothetical mes, hypothetical kuriges, and hypothetical omegas.

I’m saying, go ahead and start by imagining a situation like the one in the problem, except it’s all happening in the future—you don’t yet know how the coin will land.

You would want to decide in advance that if the coin came up against you, you would cough up \$100.

The ability to precommit in this way gives you an advantage. It gives you half a chance at \$10000 you would not otherwise have had.

So it’s a shame that in the problem as stated, you don’t get to precommit.

But the fact that you don’t get advance knowledge shouldn’t change anything. You can just decide for yourself, right now, to follow this simple rule:

If there is an action to which my past self would have precommitted, given perfect knowledge and my current preferences, I will take that action.

By adopting this rule, in any problem in which the opportunity for precommitting would have given you an advantage, you wind up gaining that advantage anyway.

• If there is an action to which my past self would have precommitted, given perfect knowledge and my current preferences, I will take that action.

That one sums it all up nicely!

• I’m actually not quite satisfied with it. Probability is in the mind, which makes it difficult to know what I mean by “perfect knowledge”. Perfect knowledge would mean I also knew in advance that the coin would come up tails.

I know giving up the \$100 is right, I’m just having a hard time figuring out what worlds the agent is summing over, and by what rules.

ETA: I think “if there was a true fact which my past self could have learned, which would have caused him to precommit, etc.” should do the trick. Gonna have to sleep on that.

ETA2: “What would you do in situation X?” and “What would you like to pre-commit to doing, should you ever encounter situation X?” should, to a rational agent, be one and the same question.

• ETA2: “What would you do in situation X?” and “What would you like to pre-commit to doing, should you ever encounter situation X?” should, to a rational agent, be one and the same question.

...and that’s an even better way of putting it.

• Note that this doesn’t apply here. It’s “What would you do if you were counterfactually mugged?” versus “What would you like to pre-commit to doing, should you ever be told about the coin flip before you knew the result?”. The X isn’t the same.

• “Perfect knowledge would mean I also knew in advance that the coin would come up tails.”

This seems crucial to me.

Given what I know when asked to hand over the \$100, I would want to have pre-committed to not pre-committing to hand over the \$100 if offered the original bet.

Given what I would know if I were offered the bet before discovering the outcome of the flip, I would wish to pre-commit to handing it over.

From which information set should I evaluate this? The information set I am actually at seems the most natural choice, and it also seems to be the one that WINS (at least in this world).

What am I missing?

• I’ll give you the quick and dirty patch for dealing with Omega: there is no way to know that, at that moment, you are not inside his simulation. By giving him the \$100, there is a chance you are transferring that money from within a simulation (which is about to be terminated) to outside the simulation, with a nice big multiplier.

• MBlume:

“What would you do in situation X?” and “What would you like to pre-commit to doing, should you ever encounter situation X?” should, to a rational agent, be one and the same question.

This phrasing sounds about right. Whatever decision-making algorithm you have drawing your decision D when it’s in situation X should also come to the same conditional decision before situation X appeared: “if(X) then D”. If you actually don’t give away \$100 in situation X, you should also plan to not give away \$100 in case of X, before (or irrespective of whether) X happens. Whichever decision is the right one, there should be no inconsistency of this form. This grows harder if you must preserve the whole preference order.

• “What would you do in situation X?” and “What would you like to pre-commit to doing, should you ever encounter situation X?” should, to a rational agent, be one and the same question.

Not if precommitting potentially has other negative consequences. As Caspian suggested elsewhere in the thread, you should also consider the possibility that the universe contains No-megas who punish people who would cooperate with Omega.

• ...why should you also consider that possibility?

• Because if that possibility exists, you should not necessarily precommit to cooperate with Omega, since that risks being punished by No-mega. In a universe of No-megas, precommitting to cooperate with Omega loses. This seems to me to create a distinction between the questions “what would you do upon encountering Omega?” and “what will you now precommit to doing upon encountering Omega?”

I suppose my real objection is that some people seem to have concluded in this thread that the correct thing to do is to make, in advance, some blanket precommitment to do the equivalent of cooperating with Omega should they ever find themselves in any similar problem. But I feel like these people have implicitly made some assumptions about what kinds of Omega-like entities they are likely to encounter: for instance, that they are much more likely to encounter Omega than No-mega.

• But No-mega also punishes people who didn’t precommit but would have chosen to cooperate after meeting Omega. If you think No-mega is more likely than Omega, then you shouldn’t be that kind of person either. So it still doesn’t distinguish between the two questions.

• “Perfect knowledge”

Use a quantum coin; it conveniently comes up both.

• I don’t see that this situation is impossible, but I think it’s because I’ve interpreted it differently from you.

First of all, I’ll assume that everyone agrees that given a 50/50 bet to win \$10,000 versus losing \$100, everyone would take the bet. That’s a straightforward application of utilitarianism + probability theory = expected utility, right?

So Omega correctly predicts that you would have taken the bet if he had offered it to you (a real no-brainer; I too can predict that you would have taken the bet had he offered it).

But he didn’t offer it to you. He comes up now, telling you that he predicted that you would accept the bet, and then carried out the bet without asking you (since he already knew you would accept it), and it turns out you lost. Now he’s asking you to give him \$100. He’s not predicting that you will give him that sum, nor is he demanding or commanding you to give it. He’s merely asking. So the question is, do you do it?

I don’t think there’s any inconsistency in this scenario regardless of whether you decide to give him the money or not, since Omega hasn’t told you what his prediction would be (though if we accept that Omega is infallible, then his prediction is obviously exactly whatever you would actually do in that situation).

• Omega hasn’t told you his predictions in the given scenario.

• Perhaps there’s something fundamental I’m missing here, but the linearity of events seems pretty clear. If Omega really did calculate that I would give him the \$100 then either he miscalculated, or this situation cannot actually occur.

That’s absolutely true. In exactly the same way, if Omega really did calculate that I wouldn’t give him the \$100, then either he miscalculated, or this situation cannot actually occur.

The difference between your counterfactual instance and my counterfactual instance is that yours just has a weird guy hassling you with a deal you want to reject, while my counterfactual is logically inconsistent for all values of ‘me’ that I identify as ‘me’.

• Thank you. Now I grok.

So, if this scenario is logically inconsistent for all values of ‘me’, then there really is nothing that I can learn about ‘me’ from this problem. I wish I hadn’t thought about it so hard.

• Logically inconsistent for all values of ‘me’ that would hand over the \$100. For all values of ‘me’ that would keep the \$100 it is logically consistent, but rather obfuscated. It is difficult to answer a multiple-choice question when considering the correct answer throws null.

• The sim­plest is to imag­ine that a mo­ment from now, Omega walks up to you and says “I’m sorry, I would have given you \$10000, ex­cept I simu­lated what would hap­pen if I asked you for \$100 and you re­fused”. In that case, you would cer­tainly wish you had been the sort of per­son to give up the \$100.

I liked this po­si­tion—in­sight­ful, so I’m definitely up­vot­ing.

But I’m not altogether convinced it’s a completely compelling argument. With the amounts reversed, Omega could have walked up to you and said “I would have given you \$100 except if I asked you for \$10,000 you would have refused.” You’d then certainly wish to have been the sort of person to counterfactually have given up the \$10,000, because in the real world it’d mean you’d get \$100, even though you’d certainly REJECT that bet if you had a choice about it in advance.

• Not nec­es­sar­ily; it de­pends on rel­a­tive fre­quency. If Omega has a 10^-9 chance of ask­ing me for \$10000 and oth­er­wise will simu­late my re­sponse to judge whether to give me \$100, and if I know that (per­haps Omega ear­lier warned me of this), I would want to be the type of per­son who gives the money.
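This relative-frequency point reduces to a two-line expected-value comparison. A minimal sketch in Python, using only the numbers assumed in the comment above (a 10^-9 chance of the real \$10,000 demand, otherwise a simulated test that pays \$100 to predicted givers):

```python
# Expected value of being a "giver" vs. a "refuser", given the
# (assumed) frequencies from the comment above.
p_ask = 1e-9            # chance Omega really demands the $10,000
p_test = 1 - p_ask      # chance Omega merely simulates the demand

# A giver pays $10,000 in the rare real case, and collects $100
# whenever the simulation shows they would have paid.
ev_giver = p_ask * (-10_000) + p_test * 100

# A refuser never pays, but is never rewarded either.
ev_refuser = 0.0

print(ev_giver)               # ~99.99999: being a giver wins here
print(ev_giver > ev_refuser)  # True
```

With these assumed frequencies, being the type of person who gives comes out far ahead; the conclusion flips as p_ask grows.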

• If you want to be the sort of per­son who’s known to do X given Y, then when Y turns up, you’d bet­ter bloody well do X.

Is that an ac­cept­able cor­rec­tion?

• Well, with a be­ing like Omega run­ning around, the two be­come more or less iden­ti­cal.

• If we’re go­ing to in­vent some­one who can read thoughts perfectly, we may as well in­vent some­one who can con­ceal thoughts perfectly.

Any­way, there aren’t any be­ings like Omega run­ning around to my knowl­edge. If you think that con­ceal­ing mo­ti­va­tions is harder than I think, and that the only way to make an­other hu­man think you’re a cer­tain way is to be that way, say that.

• And if Omega comes up to me and says “I was go­ing to kill you if you gave me \$100. But since I’ve worked out that you won’t, I’ll leave you alone.” then I’ll be damn glad I wouldn’t agree.

This re­ally does seem like pointless spec­u­la­tion.

Of course, I live in a world where there is no be­ing like Omega that I know of. If I knew oth­er­wise, and knew some­thing of their prop­er­ties, I might gov­ern my­self differ­ently.

• We’re not talk­ing Pas­cal’s Wager here, you’re not guess­ing at the be­havi­our of capri­cious om­nipo­tent be­ings. Omega has told you his prop­er­ties, and is as­sumed to be trust­wor­thy.

• You are stating that. But as far as I can tell Omega is telling me it’s a capricious omnipotent being. If there is a distinction, I’m not seeing it. Let me break it down for you:

1) Capri­cious → I am com­pletely un­able to pre­dict its ac­tions. Yes.
2) Om­nipo­tent → Can do the seem­ingly im­pos­si­ble. Yes.

So, what’s the differ­ence?

• It’s not capricious in the sense you give: you are capable of predicting some of its actions, because it’s assumed Omega is perfectly trustworthy; you can predict with certainty what it will do if it tells you what it will do.

So, if it says it’ll give you 10k\$ under some condition (say, if you one-box its challenge), you can predict that it’ll give you the money if that condition arises.

If it were capri­cious in the sense of com­plete in­abil­ity of be­ing pre­dicted, it might am­pu­tate three of your toes and give you a flower gar­land.

Note that the prob­lem sup­poses you do have cer­tainty that Omega is trust­wor­thy; I see no way of reach­ing that episte­molog­i­cal state, but then again I see no way Omega could be om­nipo­tent, ei­ther.

On a somewhat unrelated note, why would Omega ask you for 100\$ if it had simulated you wouldn’t give it the money? Also, why would it do the same if it had simulated you would give it the money? What possible use would an omnipotent agent have for 100\$?

• Omega is as­sumed to be mildly bored and mildly an­thropic. And his ask­ing you for 100\$ could always be PART of the simu­la­tion.

• And his ask­ing you for 100\$ could always be PART of the simu­la­tion.

Yes, it’s quite rea­son­able that if it was cu­ri­ous about you it would simu­late you and ask the simu­la­tion a ques­tion. But once it did that, since the simu­la­tion was perfect, why would it waste the time to ask the real you? After all, in the time it takes you to un­der­stand Omega’s ques­tion it could prob­a­bly simu­late you many times over.

So I’m start­ing to think that en­coun­ter­ing Omega is ac­tu­ally pretty strong ev­i­dence for the fact that you’re simu­lated.

• Maybe Omega recognizes in advance that you might think this way, doesn’t want it to happen, and so precommits to asking the real you. With the existence of this precommitment, you may not properly make this reasoning. Moreover, you should be able to figure out that Omega would precommit, thus making it unnecessary for him to explicitly tell you he’s doing so.

• Maybe Omega [...] doesn’t want it to hap­pen [...] More­over, you should be able to figure out that Omega would precommit

(Em­pha­sis mine.)

I don’t think, given the usual prob­lem for­mu­la­tion, that one can figure out what Omega wants with­out Omega ex­plic­itly say­ing it, and maybe not even in that case.

It’s a bit like a deal with a not-necessarily-evil devil. Even if it tells you something and you’re sure it’s not lying and you think the wording is perfectly clear, you should still assign a very high probability that you have no idea what’s really going on and why.

• If we as­sume I’m ra­tio­nal, then I’m not go­ing to as­sume any­thing about Omega. I’ll base my de­ci­sions on the given ev­i­dence. So far, that ap­pears to be de­scribed as be­ing no more and no less than what Omega cares to tell us.

• Fine, then interchange “assume Omega is honest” with, say, “I’ve played a billion rounds of one-box two-box with him”… It should be close enough.

• I realize this is fighting the problem, but: If I remember playing a billion rounds of the game with Omega, that is pretty strong evidence that I’m a (slightly altered) simulation. An average human takes about ten million breaths each year...

OK, so as­sume that I’m a tran­shu­man and can ac­tu­ally do some­thing a billion times. But if Omega can simu­late me perfectly, why would it ac­tu­ally waste the time to ask you a ques­tion, once it simu­lated you an­swer­ing it? Let alone do that a billion times… This also seems like ev­i­dence that I’m ac­tu­ally simu­lated. (I no­tice that in most state­ments of the prob­lem, the word­ing is such that it is im­plied but not clearly stated that the non-simu­lated ver­sion of you is ever in­volved.)

• I work on AI. In par­tic­u­lar, on de­ci­sion sys­tems sta­ble un­der self-mod­ifi­ca­tion. Any agent who does not give the \$100 in situ­a­tions like this will self-mod­ify to give \$100 in situ­a­tions like this. I don’t spend a whole lot of time think­ing about de­ci­sion the­o­ries that are un­sta­ble un­der re­flec­tion. QED.

• Even considering situations like this and having special cases for them sounds like it would add a bit too much cruft to the system.

Do you have a work­ing AI that I could look at to see how this would work?

• If you need spe­cial cases, your de­ci­sion the­ory is not con­sis­tent un­der re­flec­tion. In other words, it should sim­ply always do the thing that it would pre­com­mit to do­ing, be­cause, as MBlume put it, the de­ci­sion the­ory is for­mu­lated in such fash­ion that “What would you pre­com­mit to?” and “What will you do?” work out to be one and the same ques­tion.

• But this is pre­cisely what hu­mans don’t do, be­cause we re­spond to a “near” situ­a­tion differ­ently than a “far” one. Your ad­vance pre­dic­tion of your de­ci­sion is un­trust­wor­thy un­less you can suc­cess­fully simu­late the real fu­ture en­vi­ron­ment in your mind with suffi­cient sen­sory de­tail to in­voke “near” rea­son­ing. Other­wise, you will fail to reach a con­sis­tent de­ci­sion in the ac­tual situ­a­tion.

Unless, of course, in the actual situation, you’re projecting back, “What would I have decided in advance to do had I thought about this in advance?”—and you successfully mitigate all priming effects and situationally-motivated reasoning.

Or to put all of the above in short, com­mon-wis­dom form: “that’s easy for you to say NOW...” ;-)

• Here is one in­tu­itive way of look­ing at it:

Be­fore toss­ing the coin, the Omega perfectly em­u­lates my de­ci­sion mak­ing pro­cess. In this em­u­la­tion he tells me that I lost the coin toss, ex­plains the deal and asks me to give him \$100. If this em­u­lated me gives up the \$100 then he has a good chance of get­ting \$10,000.

I have absolutely no way of knowing whether I am the ‘emulated me’ or the real me. Vladimir’s specification is quite unambiguous. I, me, the one doing the deciding right now in this real world, am the same me as the one inside the Omega’s head. If the emulation is in any way different to me then the Omega isn’t the Omega. The guy in the Omega’s head has been offered a deal that any rational man would accept, and I am that man.

So, it may sound stupid that I’m giv­ing up \$100 with no hope of get­ting any­thing back. But that’s be­cause the coun­ter­fac­tual is stupid, not me.

• So, it may sound stupid that I’m giv­ing up \$100 with no hope of get­ting any­thing back. But that’s be­cause the coun­ter­fac­tual is stupid, not me.

(Dis­claimer: I’m go­ing to use the ex­act lan­guage you used, which means I will call you “stupid” in this post. I apol­o­gize if this comes off as trol­lish. I will ad­mit that I am also quite torn about this de­ci­sion, and I feel quite stupid too.)

No offense, but assuming free will, you are the one who is deciding to actually hand over the \$100. The counterfactual isn’t the one making the decision. You are. You are in a situation, and there are two possible actions (lose \$100 or don’t lose \$100), and you are choosing to lose \$100.

So again, are you sure you are not stupid?

• And now I try to calcu­late what you should treat as be­ing the prob­a­bil­ity that you’re be­ing em­u­lated. As­sume that Omega only em­u­lates you if the coin comes up heads.

Sup­pose you de­cide be­fore­hand that you are go­ing to give Omega the \$100, as you ought to. The ex­pected value of this is \$4950, as has been calcu­lated.

Suppose that instead, you decide beforehand that E is the probability you’re being emulated assuming you hear that it came up tails. You’ll still decide to give Omega the \$100; therefore, your expected value if you hear that it came up heads is \$10,000. Your expected value if you hear that the coin came up tails is -\$100(1-E) + \$10,000E.

The prob­a­bil­ity that you hear that the coin comes up tails should be given by P(H) + P(T and ~E) + P(T and E) = 0, P(H) = P(T and ~E), P(T and ~E) = P(T) - P(T and E), P(T and E) = P(E|T) * P(T). Solv­ing these equa­tions, I get P(E|T) = 2, which prob­a­bly means I’ve made a mis­take some­where. If not, c’est l’Omega?
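One way to avoid impossible answers like P(E|T) = 2 is to count observer-instances directly rather than solving constraint equations. This is only a sketch under one assumed model: the coin is fair, Omega runs exactly one emulation (told “tails”) per heads-world, and every instance is weighted equally:

```python
from fractions import Fraction

# Assumed model: on heads, the real you hears "heads" and one emulation
# hears "tails"; on tails, only the real you exists and hears "tails".
p_heads = Fraction(1, 2)
p_tails = Fraction(1, 2)

# Expected number of instances that hear "tails":
emulated_hearing_tails = p_heads * 1   # the emulation in a heads-world
real_hearing_tails = p_tails * 1       # the real you in a tails-world

# Probability you are the emulation, given that you hear "tails":
p_emulated_given_tails = emulated_hearing_tails / (
    emulated_hearing_tails + real_hearing_tails
)
print(p_emulated_given_tails)  # 1/2
```

Different anthropic weightings of the instances give different answers, but none of them exceed 1.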

• um… lets see....

to REALLY eval­u­ate that, we tech­ni­cally need to know how long omega runs the simu­la­tion for.

now, we have two op­tions: one, as­sume omega keeps run­ning the simu­la­tion in­definitely. two, as­sume that omega shuts the simu­la­tion down once he has the info he’s look­ing for (and be­fore he has to worry about de­bug­ging the simu­la­tion.)

in # 1, what we are left with is p(S)=1/​3, p(H)=1/​3, p(t)=1/​3, which means we’re mov­ing 200\$/​3 from part of our pos­si­bil­ity cloud to gain 10,000\$/​3 in an­other part.
In #2, we’re moving a total of 100\$/2 to gain 10000\$/2. The 100\$ in the simulation is quantum-virtual.

so, un­less you have rea­son to sus­pect omega is run­ning a LOT of simu­la­tions of you, AND not ter­mi­nat­ing them af­ter a minute or so...(aka, is not in­ad­ver­tently simu­la­tion-mug­ging you)...

You can generally treat Omega’s simulation capacity as a dashed causality arrow from one universe to another, sort of like the shadow produced by the simulation...

• So from my and Omega’s perspective this coin is random and my behavior is predictable. Amusing. My question: What if Omega says “due to quirks in your neurology, had I requested it, you would have pre-committed to bet \$100 against \$46.32. As it happens, you lost anyway, but you would have taken an unfavorable deal.” Would you pay then?

• Nope. I don’t care what quirks in my neu­rol­ogy do—I don’t care what an­swer the ma­te­rial calcu­la­tor re­turns, only the an­swer to 2 + 2 = ?

• Meh, the origi­nal is badly worded.

Take 2. Omega notices a neuro-quirk. Then, based on what he’s noticed, he offers you a 50/50 bet of 100\$ to 43.25 dollars at just the right time with just the right intonation...

NOW do you take that bet?

...Why yes, yes you do. Even you. And you know it. it’s re­lated to why you don’t think box­ing an AI is the an­swer. only, Omega’s already out of the box, and so can ad­just your vi­sual and au­di­tory in­put with a much higher de­gree of pre­ci­sion.

• Meh, the origi­nal is badly worded.

No it isn’t. Your ‘Take 2’ is an en­tirely differ­ent ques­tion. One that seems to miss the point. The ques­tion “Can Omega ex­ploit a vuln­er­a­bil­ity of hu­man psy­chol­ogy?” isn’t a par­tic­u­larly in­ter­est­ing one and be­comes even less so when by the defi­ni­tion of Omega and the prob­lem speci­fi­ca­tion the an­swer is ei­ther “Yes” or “I deny the coun­ter­fac­tual” re­gard­less of any­thing to do with vuln­er­a­bil­ities in hu­man in­tel­lec­tual ca­pa­bil­ities.

• oh. whoops.… so more like a way of pok­ing holes in the strat­egy “i will do what­ever I would have pre­com­mit­ted to do”?

• oh. whoops.… so more like a way of pok­ing holes in the strat­egy “i will do what­ever I would have pre­com­mit­ted to do”?

A way of try­ing to, yes.

• The coin toss may be known to Omega and predicted in advance; it only needs to initially have 50/50 odds to you for the expected gain calculation to hold. When Omega tells you about the coin, it communicates to you its knowledge about the toss, about an independent variable of initial 50/50 odds. For example, Omega may tell you that it hasn’t tossed the coin yet, it’ll do so only a thousand years from now, but it predicted that the coin will come up tails, so it asks you for your \$100.

• This re­quires though that Omega have de­cided to make the bet in a fash­ion which ex­hibited no de­pen­dency on its ad­vance knowl­edge of the coin.

• This is a big issue which I unsuccessfully tried to address in my non-existing 6+ paragraph explanation. Why the heck is Omega making bets if he can already predict everything anyway?

That said, it’s not clear that when Omega offers you a bet, you should au­to­mat­i­cally re­fuse it un­der the as­sump­tion that Omega is try­ing to “beat” you. It seems like Omega doesn’t re­ally mind giv­ing away money (pretty rea­son­able for an om­ni­scient en­tity), since he seems to be will­ing to leave boxes with mil­lions of dol­lars in them just ly­ing around.

Omega’s purpose is entirely unknown. Maybe he wants you to win these bets. If you’re a rational person who “wants to win”, I think you can just “not worry” about what Omega’s intents are, and figure out what sequence of actions maximizes your utility (which in these examples always seems to directly translate into maximizing the amount of money you get).

• Quan­tum Coins. se­ri­ously. they’re easy enough to pre­dict if you ac­cept many wor­lds.
as for the rest… en­ter­tain­ment? Could be a case of “even though I can pre­dict these hu­mans so well, it’s fas­ci­nat­ing as to just how many of them two-box no mat­ter how ob­vi­ous i make it.”
It’s not impossible: we know that we exist, and it is not impossible that some race resembling our own figured out a sufficient solution to the Löb problem and became a race of Omegas...

• That’s just like play­ing “Eeny, meeny, miny, moe” to de­ter­mine who’s ‘it’. Once you figure out if there’s an even or odd num­ber of words, you know the an­swer, and it isn’t ran­dom to you any­more. This may be great as a kid choos­ing who gets a cookie (wow! I win again!), but you’re no longer talk­ing about some­thing that can go ei­ther way.

For a ran­dom out­put of a known func­tion, you still need a ran­dom in­put.

• The trick with eeny-meeny-miney-moe is that it’s long enough for us to not con­sciously and quickly iden­tify whether the say­ing is odd or even, gives a 0, 1, or 2 on mod­ulo 3, etc, un­less we TRY to re­mem­ber what it pro­duces, or TRY to re­mem­ber if it’s odd or even be­fore point­ing it out. Know­ing that do­ing so con­sciously ru­ins its ca­pac­ity, we can turn to mem­ory de­cay to re­store some of the pseudo-ran­dom qual­ity. ba­si­cally, by suffi­ciently de­cou­pling “point at A” from “choose A” to our in­ter­nal cog­ni­tive al­gorithms...we change the way we route vi­sual in­put and spit out a “point at X”.

THAT’S where the randomness of eeny-meeny-miney-moe comes in... though I’ve probably got only one use left of it when it comes to situations with 2 items thanks to writing this up...

• There ex­ist QUANTUM coins, you know. when they see a fork in the road, they take it.

I’d be feel­ing a lit­tle queasy if omega came up to me and said that. maybe I’d say “erm, thanks for not tak­ing ad­van­tage of me, then...I guess?”

• You know, if Omega is truly do­ing a full simu­la­tion of my cog­ni­tive al­gorithm, then it seems my in­ter­ac­tions with him should be dom­i­nated by my de­sire for him to stop it, since he is effec­tively cre­at­ing and mur­der­ing copies of me.

• The decision doesn’t need to be read off from a straightforward simulation; it can be an on-demand, so to say, reconstruction of the outcome from the counterfactual. I believe it should be possible to calculate just your decision, without constructing a morally significant computation. Knowing your decision may be as simple as checking whether you adhere to a certain decision theory.

• There is no rule that says I need to care what the Omega does in his own head. If you object to being temporarily emulated then I can certainly see why you would be averse to that. But I don’t happen to object and nor do I feel in any way obliged to. Even if I’m the emulated me.

• but...if you’re the em­u­lated you...you’re go­ing to die af­ter you give him/​don’t give him the money.

• I can see why that both­ers you and it is perfectly rea­son­able to in­clude a penalty for be­ing tem­porar­ily em­u­lated in your util­ity func­tion.

The way I see it is that I started off as one me, had an ex­tra em­u­lated me for a bit and then was back to me again. There is just as much util­ity left over as there was at the start, with a bit ex­tra thrown in for the favourable dice game. I have lost noth­ing.

The emulated me has nothing to gain by caring whether he is the real me or not. If he had the option of breaking free and being Cam II then by all means he would do so. But as it happens emulated me exists for a finite term and I have no way of exploiting that through my decision making. I just make whatever decision benefits the real me, whether that be this me or the other me.

This is the way I see things: I am an AI running on an installation of Pen Drive Linux, an operating system that runs off a USB stick. Omega tells me that he has purchased 10,000 identical USB sticks and duplicated me onto each of them. He tells me that in 1 hour he is going to destroy 10,000 of the USB sticks but double the processing power and RAM on the computer that runs me. He plans to repeat the same procedure every day.

I have one hour. Do I use that time to exploit my net connection, appropriate funds and hire an assassin to kill Omega so he ceases his evil mass murder?

As it hap­pens, I don’t. It both­ers me not at all which of the 10,001 clones I am. There’s still go­ing to be a me that is more or less the same as me. If it hap­pens that the copy of me cur­rently run­ning is de­stroyed I am quite will­ing to ac­cept that. I don’t con­sider my­self ‘kil­led’ or ‘dead’. I con­sider that I lost the mem­ory of one con­ver­sa­tion with some crazy Omega but gained a bunch of pro­cess­ing power and ram. What­ever. Go ahead, keep at it big O.

In summary: I just don’t think my instinctive aversion to death applies reasonably to situations where clones of me are being created and destroyed all willy nilly. In such situations I measure utility more abstractly.

• It’s not just about the USB sticks—to me that seems in­ert. But if he’s run­ning you off those USB sticks for (let’s say) a few hours ev­ery day, then you could (in fact there is a 1000/​1001 chance that you will) wake up to­mor­row morn­ing and find your­self run­ning from one of those drives, and know that there is a clear hori­zon of a few hours on the sub­jec­tive ex­pe­riences you can an­ti­ci­pate. This is a prospect which I, at least, would find ter­rify­ing.

• Maybe Omega ex­ists in a higher spa­tial di­men­sion and just takes an in­stan­ta­neous snap­shot of the uni­ver­sal finite state au­tomata you ex­ist in (as a p-zom­bie).

• Hi,

My name is Omega. You may have heard of me.

Anyway, I have just tossed a fair coin, and given that the coin came up tails, I’m gonna have to ask each of you to give me \$100. Whatever you do in this situation, nothing else will happen differently in reality as a result. Naturally you don’t want to give up your \$100. But see, if the coin came up heads instead of tails, I’d have given \$10000 to each of you, but only to those that would agree to give me \$100 if the coin came up tails.

• You for­got to add that we have suffi­cient rea­son to be­lieve ev­ery­thing you say.

• I don’t be­lieve you.

• I really fail to see why you’re all so fascinated by Newcomb-like problems. When you break causality, all logic based on causality stops functioning. If you try to model it mathematically, you will always get an inconsistent model.

• There’s no need to break causal­ity. You are a be­ing im­ple­mented in chaotic wet­ware. How­ever, there’s no rea­son to think we couldn’t have ra­tio­nal agents im­ple­mented in much more pre­dictable form, as python rou­tines for ex­am­ple, so that any be­ing with su­pe­rior com­pu­ta­tion power could sim­ply in­spect the source and de­ter­mine what the out­put would be.

In such a case, New­comb-like prob­lems would arise, perfectly lawfully, un­der nor­mal physics.

• In fact, New­comb-like prob­lems fall nat­u­rally out of any abil­ity to simu­late and pre­dict the ac­tions of other agents. Omega as de­scribed is es­sen­tially the limit as pre­dic­tive power goes to in­finity.

• This gives me the in­tu­ition that try­ing to de­cide whether to one-box or two box on new­comb is like try­ing to de­cide what 0^0 is; you get your in­tu­ition by fol­low­ing a limit pro­cess, but that limit pro­cess pro­duces differ­ent re­sults de­pend­ing on the path you take.

It would be in­ter­est­ing to look at finitely good pre­dic­tors. Per­haps we can find some­thing analo­gous to the re­sult that lim_(x, y -->0) (x^y) is path de­pen­dent.

• If we define an im­perfect pre­dic­tor as a perfect pre­dic­tor plus noise, i.e. pro­duces the cor­rect pre­dic­tion with prob­a­bil­ity p re­gard­less of the cog­ni­tion al­gorithm it’s try­ing to pre­dict, then New­comb-like prob­lems are very ro­bust to im­perfect pre­dic­tion: for any p > .5 there is some pay­off ra­tio great enough to pre­serve the para­dox, and the re­quired ra­tio goes down as the pre­dic­tion im­proves. e.g. if 1-box­ing gets 100 utilons and 2-box­ing gets 1 utilon, then the pre­dic­tor only needs to be more than 50.5% ac­cu­rate. So the limit in that di­rec­tion fa­vors 1-box­ing.

What other di­rec­tion could there be? If the pre­dic­tion ac­cu­racy de­pends on the al­gorithm-to-be-pre­dicted (as it would in the real world), then you could try to be an al­gorithm that is mis­pre­dicted in your fa­vor… but a mis­pre­dic­tion in your fa­vor can only oc­cur if you ac­tu­ally 2-box, so it only takes a mod­icum of ac­cu­racy be­fore a 1-boxer who tries to be pre­dictable is bet­ter off than a 2-boxer who tries to be un­pre­dictable.

I can’t see any other way for the limit to turn out.
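The 50.5% figure above follows from a short expected-utility comparison. A sketch with the payoffs named in the comment (a correctly predicted one-boxer gets 100 utilons; the second box is worth 1 utilon):

```python
def one_box_ev(p):
    # Box B is filled exactly when the predictor (accuracy p) is right.
    return 100 * p

def two_box_ev(p):
    # You always keep the 1 utilon, and get the 100 only on a misprediction.
    return 1 + 100 * (1 - p)

# One-boxing wins when 100p > 1 + 100(1 - p), i.e. p > 101/200 = 0.505.
threshold = 101 / 200
print(threshold)                               # 0.505
print(one_box_ev(0.51) > two_box_ev(0.51))     # True: one-box just above it
print(one_box_ev(0.50) > two_box_ev(0.50))     # False: two-box just below it
```

Raising the payoff ratio above 100:1 pushes the threshold even closer to 50%, which is the sense in which the paradox is robust to imperfect prediction.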

• If you have two agents try­ing to pre­com­mit not to be black­mailed by each other /​ pre­com­mit not to pay at­ten­tion to the oth­ers pre­com­mit­ment, then any at­tempt to take a limit of this New­comblike prob­lem does de­pend on how you ap­proach the limit. (I don’t know how to solve this prob­lem.)

• The value(s) for which the limit is be­ing taken here is uni­di­rec­tional pre­dic­tive power, which is loosely a func­tion of the differ­ence in in­tel­li­gence be­tween the two agents; in­tu­itively, I think a case could be made that (as­sum­ing ideal ra­tio­nal­ity) the to­tal ac­cu­racy of mu­tual be­hav­ior pre­dic­tion be­tween two agents is con­served in some fash­ion, that dou­bling the pre­dic­tive power of one un­avoid­ably would roughly halve the pre­dic­tive power of the other. Omega rep­re­sents an en­tity with a delta-g so large vs. us that pre­dic­tive power is es­sen­tially com­pletely one-sided.

From that ba­sis, al­low­ing the uni­di­rec­tional pre­dic­tive power of both agents to go to in­finity is prob­a­bly in­her­ently ill-defined and there’s no rea­son to ex­pect the prob­lem to have a solu­tion.

• You can­not do that with­out break­ing Rice’s the­o­rem. If you as­sume you can find out the an­swer from some­one else’s source code → in­stant con­tra­dic­tion.

You cannot work around Rice’s theorem or around causality by specifying 50.5% accuracy independently of the modeled system; any accuracy higher than 50%+epsilon is equivalent to indefinitely good accuracy by repeatedly predicting (standard cryptographic result), and 50%+epsilon doesn’t cause the paradox.

Give me one serious math model of Newcomb-like problems where the paradox emerges while preserving causality. Here are some examples. When you model it, you either get a trivial solution to one-box, or a causality break, or Omega loses.

• You de­cide first what you would do in ev­ery situ­a­tion, omega de­cides sec­ond, and now you only im­ple­ment your ini­tial de­ci­sion table and are not al­lowed to switch. Game the­ory says you should im­ple­ment one-box­ing.

• You de­cide first what you would do in ev­ery situ­a­tion, omega de­cides sec­ond, and now you are al­lowed to switch. Game the­ory says you should pre­com­mit to one-box, then im­ple­ment two-box­ing, omega loses.

• You de­cide first what you would do in ev­ery situ­a­tion, omega de­cides sec­ond, and now you are al­lowed to switch. If omega always de­cides cor­rectly, then he bases his de­ci­sion on your switch, which ei­ther turns it into model #1 (you can­not re­ally switch, pre­com­mit­ment is bind­ing), or breaks causal­ity.

• Rice’s the­o­rem says you can’t pre­dict ev­ery pos­si­ble al­gorithm in gen­eral. Plenty of par­tic­u­lar al­gorithms can be pre­dictable. If you’re run­ning on a clas­si­cal com­puter and Omega has a copy of you, you are perfectly pre­dictable.

And all of your choices are just as real as they ever were, see the OB se­quence on free will (I think some­one referred to it already).

• And the ar­gu­ment that omega just needs pre­dic­tive power of 50.5% to cause the para­dox only works if it works against ANY ar­bi­trary al­gorithm. Hav­ing that power against any ar­bi­trary al­gorithm breaks Rice’s The­o­rem, hav­ing that power (or even 100%) against just limited sub­set of al­gorithms doesn’t cause the para­dox.

If you take strict de­ci­sion tree pre­com­mit­ment in­ter­pre­ta­tion, then you fix causal­ity. You de­cide first, omega de­cides sec­ond, game the­ory says one-box, prob­lem solved.

De­ci­sion tree pre­com­mit­ment is never a prob­lem in game the­ory, as pre­com­mit­ment of the en­tire tree com­mutes with de­ci­sions by other agents:

• A de­cides what f(X), f(Y) to do if B does X or Y. B does X. A does f(X)

• B does X. A de­cides what f(X), f(Y) to do if B does X or Y. A does f(X)

are iden­ti­cal, as B can­not de­cide based on f. So the chang­ing your mind prob­lem never oc­curs.

With omega:

• A de­cides what f(X), f(Y) to do if B does X or Y. B does X. A does f(X) - B can an­swer de­pend­ing on f

• B does X. A de­cides what f(X), f(Y) to do if B does X or Y. A does f(X) - some­how not al­lowed any more
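The two orderings can be sketched as code. `play_ordinary` and `play_omega` are hypothetical names; the only point is that an ordinary B moves without seeing A’s policy f, while Omega’s move is a function of the whole policy:

```python
# A's policy f maps B's move to A's response; B moves either "X" or "Y".
def play_ordinary(f, b_move):
    # Ordinary B cannot condition on f, so committing to f before or
    # after B's move yields the same outcome: the two orderings commute.
    return f[b_move]

def play_omega(f, omega):
    # Omega chooses its move as a function of A's entire policy,
    # so "deciding f afterwards" is no longer an independent choice.
    b_move = omega(f)
    return f[b_move]

f = {"X": "one-box", "Y": "two-box"}
print(play_ordinary(f, "X"))  # one-box

omega = lambda policy: "X" if policy["X"] == "one-box" else "Y"
print(play_omega(f, omega))   # one-box
```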

I don’t think the paradox exists in any plausible mathematization of the problem. It looks to me like another of those philosophical problems that exist because of sloppiness of natural language and very little more, and I’m just surprised that the OB/LW crowd cares about this one and not about others. OK, I admit I really enjoyed it the first time I saw it but just as something fun, nothing more than that.

• I don’t think the paradox exists in any plausible mathematization of the problem.

I don’t know why no­body men­tioned this at the time, but that’s hardly an un­pop­u­lar view around here (as I’m sure you’ve no­ticed by now).

The in­ter­est­ing thing about New­comb had noth­ing to do with think­ing it was a gen­uine para­dox—just coun­ter­in­tu­itive for some.

• there’s no rea­son to think we couldn’t have ra­tio­nal agents im­ple­mented in much more pre­dictable form, as python rou­tines for ex­am­ple, so that any be­ing with su­pe­rior com­pu­ta­tion power could sim­ply in­spect the source and de­ter­mine what the out­put would be.

Such a be­ing would be differ­ent from a hu­man in fun­da­men­tal ways. Imag­ine know­ing with cer­tainty that your ac­tions can be pre­dicted perfectly by the guy next door, even tak­ing into ac­count that you are try­ing to be hard to pre­dict?

A (quasi)ra­tio­nal agent with ac­cess to gen­uine ran­dom­ness (such as a hu­man) is a differ­ent mat­ter. A su­per­in­tel­li­gence could al­most perfectly pre­dict the prob­a­bil­ity dis­tri­bu­tion over my ac­tions, but by quan­tum en­tan­gle­ment it would not be able to pre­dict my ac­tual ac­tions.

• A (quasi)ra­tio­nal agent with ac­cess to gen­uine ran­dom­ness (such as a hu­man)

Whad­daya mean hu­mans are ra­tio­nal agents with ac­cess to gen­uine ran­dom­ness? That’s what we’re ar­gu­ing about in the first place!

A su­per­in­tel­li­gence could al­most perfectly pre­dict the prob­a­bil­ity dis­tri­bu­tion over my ac­tions, but by quan­tum en­tan­gle­ment it would not be able to pre­dict my ac­tual ac­tions.

Per­haps Omega is en­tan­gled with your brain such that in all the wor­lds in which you would choose to one-box, he would pre­dict that you one-box, and all the wor­lds in which you would choose to two-box, he would pre­dict that you two-box?

• In the origi­nal for­mu­la­tion, if Omega ex­pects you to flip a coin, he leaves box B empty.

• Imag­ine know­ing with cer­tainty that your ac­tions can be pre­dicted perfectly by the guy next door, even tak­ing into ac­count that you are try­ing to be hard to pre­dict?

You wouldn’t know this with cer­tainty* be­cause it wouldn’t be true.

(*un­less you were delu­sional)

The guy next door is on roughly your men­tal level. Thus, the guy next door can’t pre­dict your ac­tions perfectly, be­cause he can’t run a perfect simu­la­tion of your mind that’s faster than you. He doesn’t have the ca­pac­ity.

And he cer­tainly doesn’t have the ca­pac­ity to simu­late the en­vi­ron­ment, in­clud­ing other peo­ple, while do­ing so.

A (quasi)ra­tio­nal agent with ac­cess to gen­uine ran­dom­ness (such as a hu­man) is a differ­ent mat­ter.

Hu­mans may or may not gen­er­ally have ac­cess to gen­uine ran­dom­ness.

It’s as yet un­known whether we even have run on quan­tum ran­dom­ness; and its also un­prov­able that quan­tum ran­dom­ness is ac­tu­ally gen­uine ran­dom­ness, and not just based on effects we don’t yet un­der­stand, as so many other types of ran­dom­ness have been.

• You wouldn’t know this with cer­tainty* be­cause it wouldn’t be true.

You’re not tak­ing this in the least con­ve­nient pos­si­ble world. Surely it’s not im­pos­si­ble in prin­ci­ple that your neigh­bor can simu­late you and your en­vi­ron­ment. Per­haps your neigh­bor is su­per­in­tel­li­gent?

• It’s ALSO not im­pos­si­ble in prin­ci­ple in the real world. A su­per­in­tel­li­gent en­tity could, in prin­ci­ple, perfectly pre­dict my ac­tions. Re­mem­ber, in the Least Con­ve­nient Pos­si­ble World quan­tum “ran­dom­ness” isn’t ran­dom.

As such, this ISN’T a fun­da­men­tal differ­ence be­tween hu­mans and “such be­ings”. Which was all I set out to demon­strate.

I was us­ing the “most plau­si­ble world” on the ba­sis that it seemed pretty clear that that was the one Roko in­tended. (Where your neigh­bour isn’t in fact Yah­weh in dis­guise). EDIT: Prob­a­bly should spec­ify wor­lds for things in this kind of en­vi­ron­ment. Thanks, the crit­i­cal en­vi­ron­ment here is helping me think about how I think/​ar­gue.

• It’s as yet un­known whether we even have run on quan­tum ran­dom­ness; and its also un­prov­able that quan­tum ran­dom­ness is ac­tu­ally gen­uine ran­dom­ness, and not just based on effects we don’t yet un­der­stand, as so many other types of ran­dom­ness have been.

If you be­lieve the Many Wor­lds In­ter­pre­ta­tion, then quan­tum ran­dom­ness just cre­ates copies in a de­ter­minis­tic way.

• They don’t re­quire break­ing causal­ity. The ar­gu­ment works if Omega is barely pre­dict­ing you above chance. I’m sure there are plenty of nor­mal peo­ple who can do that just by talk­ing to you.

There are also more im­por­tant rea­sons. Take the dooms­day ar­gu­ment. You can use the fact that you’re al­ive now to pre­dict that we’ll die out “soon”. Sup­pose you had a choice be­tween sav­ing a life in a third-world coun­try that likely wouldn’t amount to any­thing, or donat­ing to SIAI to help in the dis­tant fu­ture. You know it’s very un­likely for there to be a dis­tant fu­ture. It’s like Omega did his coin toss, and if it comes up tails, we die out early and he asks you to waste the money by donat­ing to SIAI. If it comes up heads, you’re in the fu­ture, and it’s bet­ter if you would have donated.

That’s not some thing that might hap­pen. That’s a de­ci­sion you have to make be­fore you pick a char­ity to donate to. Lives are rid­ing on this. That’s if the coin lands on tails. If it lands on heads, there is more life rid­ing on it than has so far ex­isted in the known uni­verse. Please choose care­fully.

• The ar­gu­ment works if Omega is barely pre­dict­ing you above chance.

Arguments like these remind me of students' mistakes from Algorithms and Data Structures 101: statements like that are very intuitive, absolutely wrong, and once you figure out why the reasoning doesn't work it's easy to forget that most people never went through it.

What is required is Omega predicting better than chance in the worst case. Predicting correctly with a ridiculously tiny chance of error against the "average" person is worthless.

To avoid Omega and causality silliness, and just demonstrate this intuition, let's take a slightly modified version of Boolean satisfiability, but instead of one formula we have three formulas of the same length. If all three are identical, return true or false depending on satisfiability; if they differ, return true if the number of one bits in the input is odd (or some other trivial property).

It is obviously NP-complete, as any satisfiability problem reduces to it by concatenating the formula three times. If we use exponential brute force for the hard case, the average running time is O(n) for scanning the string, plus O(2^(n/3)) for brute-forcing incurred only a 2^-(2n/3) fraction of the time, which contributes O(1). So we can solve an NP-complete problem in average linear time.

What happened? We were led astray by intuition, and assumed that problems that are difficult in the worst case cannot be trivial on average. But this equal weighting is an artifact: if you tried reducing any other NP problem into this one, you'd get very difficult instances nearly all the time, as if by magic.
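The average-case claim above can be sanity-checked numerically. This is a minimal sketch of the dispatch step only; the helper name `is_hard_instance` is mine, not the comment's, and the exponential SAT branch is elided since the point is how rarely it fires.

```python
import random

def is_hard_instance(bits: str) -> bool:
    # The exponential branch fires only when all three thirds of the
    # input are identical; any other input falls to the trivial
    # parity rule.
    third = len(bits) // 3
    return bits[:third] == bits[third:2 * third] == bits[2 * third:]

# For a uniformly random n-bit input the hard branch is hit with
# probability 2^-(2n/3), so even an exponential brute force there
# contributes O(1) to the average running time.
n, trials = 30, 100_000
hits = sum(
    is_hard_instance("".join(random.choice("01") for _ in range(n)))
    for _ in range(trials)
)
# With n = 30 the expected fraction is 2**-20, so hits is almost surely 0.
```

This also illustrates the last paragraph's point: a reduction from a real SAT instance always produces the identical-thirds form, so the "easy almost everywhere" measure says nothing about the instances you actually care about.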

Back to Omega: even if Omega predicts normal people very well, as long as there is any thinking being it cannot predict, Omega must break causality. And such beings are not just hypothetical; people who decide based on a coin toss are exactly like that. Silly rules about disallowing chance merely make the counterexamples more complicated; Omega and Newcomb are still as much based on sloppy thinking as ever.

• I don’t know any rea­son why a coin toss would be the best choice in New­comb’s para­dox. If you de­cide based on rea­son, and don’t de­cide to flip a coin, and Omega knows you well, he can pre­dict your ac­tion above chance. The para­dox stands.

• Omega can­not know coin flip re­sults with­out vi­o­lat­ing causal­ity. So he ei­ther puts that mil­lion in the box or not. As a re­sult, no mat­ter which way he de­cides, Omega has 50% chance of vi­o­lat­ing own rules, which was sup­pos­edly im­pos­si­ble, break­ing the prob­lem.

• What I mean is, if you change the sce­nario so he only has to pre­dict above chance if you don’t flip a coin, and he isn’t always get­ting it right any­way, the same ba­sic prin­ci­ple ap­plies, but it doesn’t vi­o­late causal­ity.

• The ob­vi­ous ex­ten­sions of the prob­lem to cases with failable Omega are:

1. P( \$1,000,000) = P(onebox)

2. Re­ward = \$1,000,000 * P(onebox)
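For what it's worth, the two extensions coincide in expectation. A minimal sketch, assuming the standard Newcomb payoffs (\$1,000 visible in box A, \$1,000,000 possible in box B); `expected_payoff` is a hypothetical name, not something from the thread:

```python
def expected_payoff(p_onebox: float, take_both: bool) -> float:
    # Extension 1: box B holds $1,000,000 with probability p_onebox.
    # Extension 2: box B deterministically holds $1,000,000 * p_onebox.
    # Both give box B the same expected content.
    box_b = 1_000_000 * p_onebox
    box_a = 1_000 if take_both else 0  # two-boxers also take box A
    return box_b + box_a

# An agent Omega reliably pegs as a one-boxer does better than one
# pegged as a two-boxer, despite leaving the visible $1,000 behind:
assert expected_payoff(0.99, take_both=False) > expected_payoff(0.01, take_both=True)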

• In the Bayesian interpretation, P() would be Omega's subjective probability. In the frequentist interpretation, the question doesn't make any sense, as you make a single boxing decision, not a large number of tiny boxing decisions. Either way P() is very ill-defined.

• Either way P() is very ill-defined.

No more so than other prob­a­bil­ities. Prob­a­bil­ities about fu­ture de­ci­sions of other ac­tors aren’t dis­priv­ileged, that would be free will con­fu­sion. And are you se­ri­ously claiming that the prob­a­bil­ities of a coin flip don’t make sense in a fre­quen­tist in­ter­pre­ta­tion? That was the con­text. In the gen­eral case it would be the long term rel­a­tive fre­quency of pos­si­ble ver­sions of you similar enough to you to be in­dis­t­in­guish­able for Omega de­cid­ing that way or some­thing like that, if you in­sisted on us­ing fre­quen­tist statis­tics for some rea­son.

• (this com­ment as­sumes “Re­ward = \$1,000,000 * P(onebox)”)

You misunderstand the frequentist interpretation: the sample size is 1; you either decide yes or decide no. To generalize from a single decider needs a prior reference class ("coin tosses"), getting us into Bayesian subjective interpretations. Frequentists don't have any concept of "probability of hypothesis" at all, only "probability of data given hypothesis", and the only way to connect them is using priors. "Frequency among possible worlds" is also a Bayesian thing that weirds frequentists out.

Anyway, if Omega has amazing prediction powers, and P() can be deterministically known by looking into the box, this is far more valuable than a mere \$1,000,000! Let's say I make my decision by randomly generating some string and checking whether it's a valid proof of the Riemann hypothesis: if P() is non-zero, I've made myself \$1,000,000 anyway.

I un­der­stand that there’s an ob­vi­ous tech­ni­cal prob­lem if Omega rounds the num­ber to whole dol­lars, but that’s just minor de­tail.

And actually, it is a lot worse in the popular problem formulation of "if your decision relies on randomness, there will be no million" that tries to work around coin tossing. In that case a person randomly trying to prove a false statement gets the million (since no proof could work, his decision was reliable), while a person randomly trying to prove a true statement gets \$0 (since there's a non-zero chance of him randomly generating a correct proof).

Another fun idea would be measuring both the position and velocity of an electron: toss a coin to decide which, measure one, and get the other from Omega.

Pos­si­bil­ities are just end­less.

• The issue was whether the formulation makes sense, not whether it makes frequentists freak out (and it's not substantially different from, e.g., drawing from an urn for the first time). In either case P() was the probability of an event, not a hypothesis.

In these sorts of prob­lems you are sup­posed to as­sume that the dol­lar amounts match your ac­tual util­ities (as you ob­serve your ex­ploit doesn’t work any­way for tests with a prob­a­bil­ity of <0.5*10^-9 if round­ing to cents, and you could just as­sume that you already have gained all knowl­edge you could gain through such test, or that Omega pos­sesses ex­actly the same knowl­edge as you ex­cept for hu­man psy­chol­ogy, or what­ever).

• I re­ally fail to see why you’re all so fas­ci­nated by New­comb-like prob­lems.

Agreed. This problem seems uninteresting to me too. Though more realistic Newcomb-like problems are interesting, for there are parts of life where Newcombian reasoning works for real.

On sec­ond thoughts, since many clever philoso­phers spend ca­reers on these prob­lems, I may be miss­ing some­thing.

The ob­vi­ous com­plaint about “would you choose X or Y given that Omega already knows your ac­tions” is that it is log­i­cally in­con­sis­tent; if Omega already knows your ac­tions, the word “choose” is non­sense. Strictly speak­ing, “choose” is non­sense any­way; it takes the naive free will point of view in its ev­ery­day us­age.

In or­der to un­tan­gle this, a so­phis­ti­cated un­der­stand­ing of what we mean by “choose” is needed. I may post on this. My in­tu­ition is that if we stick to a rigor­ous mean­ing of “choose”, the ques­tion will have a well-defined an­swer that no-one will dis­pute, how­ever what this an­swer is will de­pend on the defi­ni­tion of “choose” that you, um, choose, so to speak…

• This problem seems uninteresting to me too. Though more realistic Newcomb-like problems are interesting, for there are parts of life where Newcombian reasoning works for real.

I find the prob­lem in­ter­est­ing, so I’ll try to ex­plain why I find it in­ter­est­ing.

So there are these blogs called Over­com­ing Bias and Less Wrong, and the peo­ple post­ing on it seem like very smart peo­ple, and they say very rea­son­able things. They offer to teach how to be­come ra­tio­nal, in the sense of “win­ning more of­ten”. I want to win more of­ten too, so I read the blogs.

Now a lot of what these peo­ple are say­ing sounds very rea­son­able, but it’s also clear that the peo­ple say­ing these things are much smarter than me; so much so that al­though their con­clu­sions sound very rea­son­able, I can’t always fol­low all the ar­gu­ments or steps used to reach those con­clu­sions. As part of my ra­tio­nal­ist train­ing, I try to no­tice when I can fol­low the steps to a con­clu­sion, and when I can’t, and re­mem­ber which con­clu­sions I be­lieve in be­cause I fully un­der­stand it, and which con­clu­sions I am “ten­ta­tively be­liev­ing in” be­cause some­one smart said it, and I’m just tak­ing their word for it for now.

So now Vladimir Nesov presents this puzzle, and I realize that I must not have understood one of the conclusions (or I did understand them, and the smart people were mistaken), because it sounds like if I were to follow the advice of this blog, I'd be doing something really stupid (depending on how you answered VN's problem, the stupid thing is either "wasting \$100" or "wasting \$4950").

So how do I rec­on­cile this with ev­ery­thing I’ve learned on this blog?

Think of most of the blog as a text­book, with VN’s post be­ing an “ex­er­cise to the reader” or a “home­work prob­lem”.

• The pri­mary rea­son for re­solv­ing New­comb-like prob­lems is to ex­plore the fun­da­men­tal limi­ta­tions of de­ci­sion the­o­ries.

It sounds like you are still con­fused about free will. See Right­ing a Wrong Ques­tion, Pos­si­bil­ity and Could-ness, and Daniel Den­nett’s lec­ture here.

• yes, I am con­fused about free will, but I think that this con­fu­sion is le­gi­t­i­mate given our cur­rent lack of knowl­edge about how the hu­man mind works.

I hope I’m not mak­ing ob­vi­ous er­rors about free will. But if I am, then I’d like to know...

• I think I’m not con­fused about free will, and that the links I gave should help to re­solve most of the con­fu­sion. Maybe you should write a blog post/​LW ar­ti­cle where you for­mu­late the na­ture of your con­fu­sion (if you still have it af­ter read­ing the rele­vant ma­te­rial), I’ll re­spond to that.

• Not really. All that is necessary is that Omega is a sufficiently accurate predictor that the payoff matrix, taking this accuracy into account, still amounts to a win for the given choice. There is no need for a perfect predictor. And if an imperfect, 99.999% predictor violates free will, then free will is clearly a lost cause anyway (I can predict many behaviours of people with similar precision based on no more evidence than their behaviour and speech, never mind godlike brain introspection). Do you have no "choice" in deciding to come to work tomorrow, if I predict based on your record that you're 99.99% reliable? Where is the cut-off at which free will gets lost?

• Do you have no “choice” in de­cid­ing to come to work to­mor­row, if I pre­dict based on your record that you’re 99.99% re­li­able?

Hu­mans are sub­tle beasts. If you tell me that you have pre­dicted that I will go to work based upon my 99.99% at­ten­dance record, the prob­a­bil­ity that I will go to work drops dra­mat­i­cally upon me re­ceiv­ing that in­for­ma­tion, be­cause there is a good chance that I’ll not go just to be awk­ward. This op­tion of “tak­ing your pre­dic­tion into ac­count, I’ll do the op­po­site to be awk­ward” is why it feels like you have free will.

• Chances are I can predict such a response too, and so won't tell you of my prediction (or will tell you in such a way that you will be more likely to attend, e.g. "I've a \$50 bet you'll attend tomorrow. Be there and I'll split it 50:50"). It doesn't change the fact that in this particular instance I can foretell the future with a high degree of accuracy. Why then would it violate free will if Omega could predict your actions in this different situation (one where he's also able to predict the effects of telling you) to a similar precision?

• Why then would it violate free will if Omega could predict your actions in this different situation (one where he's also able to predict the effects of telling you) to a similar precision?

Be­cause that’s pretty much our in­tu­itive defi­ni­tion of free will; that it is not pos­si­ble for some­one to pre­dict your ac­tions, an­nounce it pub­li­cly, and still be cor­rect. If you dis­agree, we are dis­agree­ing about the in­tu­itive defi­ni­tion of “free will” that most peo­ple carry around in their heads. At least ad­mit that most peo­ple would be un­sur­prised if a per­son pre­dicted that they would (e.g.) brush their teeth in the morn­ing (with­out tel­ling them in ad­vance that it had pre­dicted that), ver­sus pre­dict­ing that they would knock a vase over, and then as a re­sult of that pre­dic­tion, the vase ac­tu­ally get­ting knocked over.

• Then take my bet situ­a­tion. I an­nounce your at­ten­dance, and cut you in with a \$25 stake in at­ten­dance. I don’t think it would be un­usual to find some­one who would in­deed ap­pear 99.99% of the time—does that mean that per­son has no free will?

People are highly, though not perfectly, predictable in a large number of situations. Revealing knowledge about the prediction complicates things by adding feedback to the system, but there are lots of cases where it still doesn't change matters much (or even increases predictability). There are obviously some situations where this doesn't happen, but for Newcomb's paradox, all that is needed is a predictor for the particular situation described, not for any general situation. (In fact Newcomb's paradox is equally broken by a similar revelation of knowledge: if Omega were to reveal its prediction before the boxes are chosen, a person determined to do the opposite of that prediction opens it up to a simple Epimenides paradox.)

• On sec­ond thoughts, since many clever philoso­phers spend ca­reers on these prob­lems, I may be miss­ing some­thing.

Nah, they just need some­thing to talk about.

• I con­vinced my­self to one-box in New­comb by sim­ply treat­ing it as if the con­tents of the boxes mag­i­cally change when I made my de­ci­sion. Sim­ply draw the de­ci­sion tree and max­i­mize u-value.

I con­vinced my­self to co­op­er­ate in the Pri­soner’s Dilemma by treat­ing it as if what­ever de­ci­sion I made the other per­son would mag­i­cally make too. Sim­ply draw the de­ci­sion tree and max­i­mize u-value.
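The "draw the decision tree and maximize u-value" recipe, applied to the mugging in the post, reproduces the expected-gain arithmetic from the introduction. A sketch (`expected_gain` is my naming, not the post's):

```python
def expected_gain(give_when_tails: bool) -> float:
    # Evaluated before the coin toss: heads pays $10,000 only to the
    # sort of agent who would hand over $100 on tails; tails costs
    # $100 under that same policy.
    heads = 10_000 if give_when_tails else 0
    tails = -100 if give_when_tails else 0
    return 0.5 * heads + 0.5 * tails

assert expected_gain(True) == 4_950   # the post's -$100*0.5 + $10000*0.5
assert expected_gain(False) == 0
```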

It seems that Omega is differ­ent be­cause I ac­tu­ally have the in­for­ma­tion, where in the oth­ers I don’t.

For ex­am­ple, In New­comb, if we could see the con­tents of both boxes, then I should two-box, no? In the Pri­soner’s Dilemma, if my op­po­nent de­cides be­fore me and I ob­serve the de­ci­sion, then I should defect, no?

I sus­pect that this means that my thought pro­cess in New­comb and the Pri­soner’s Dilemma is in­cor­rect. That there is a bet­ter way to think about them that makes them more like Omega. Am I cor­rect? Does this make sense?

• Yes, the ob­jec­tive in de­sign­ing this puz­zle was to con­struct an ex­am­ple where ac­cord­ing to my un­der­stand­ing of the cor­rect way to make de­ci­sion, the cor­rect de­ci­sion looks like los­ing. In other cases you may say that you close your eyes, pre­tend that your de­ci­sion de­ter­mines the past or other agents’ ac­tions, and just make the de­ci­sion that gives the best out­come. In this case, you choose the worst out­come. The ar­gu­ment is that on re­flec­tion it still looks like the best out­come, and you are given an op­por­tu­nity to think about what’s the cor­rect per­spec­tive from which it’s the best out­come. It binds the state of re­al­ity to your sub­jec­tive per­spec­tive, where in many other thought ex­per­i­ments you may dis­pense with this con­nec­tion and fo­cus solely on the re­al­ity, with­out pay­ing any spe­cial at­ten­tion to the de­ci­sion-maker.

• In New­comb, be­fore know­ing the box con­tents, you should one-box. If you know the con­tents, you should two-box (or am I wrong?)

In Pri­soner, be­fore know­ing the op­po­nent’s choice, you should co­op­er­ate. After know­ing the op­po­nent’s choice, you should defect (or am I wrong?).

If I’m right in the above two cases, doesn’t Omega look more like the “af­ter know­ing” situ­a­tions above? If so, then I must be wrong about the above two cases...

I want to be some­one who in situ­a­tion Y does X, but when Y&Z hap­pens, I don’t nec­es­sar­ily want to do X. Here, Z is the ex­tra in­for­ma­tion that I lost (in Omega), the op­po­nent has cho­sen (in Pri­soner) or that both boxes have money in them (in New­comb). What am I miss­ing?

• No—in the pris­on­ers’ dilemma, you should always defect (pre­sum­ing the pay­off ma­trix rep­re­sents util­ity), un­less you can some­how col­lec­tively pre-com­mit to co-op­er­at­ing, or it is iter­a­tive. This dis­tinc­tion you’re think­ing of only ap­plies when re­verse cau­sa­tion comes into play.

• I guess I’m a bit tired of “God was un­able to make the show to­day so the part of Om­ni­scient be­ing will be played by Omega” puz­zles, even if in my mind Omega looks amus­ingly like the Fly­ing Spaghetti Mon­ster.

Particularly in this case, where Omega is being explicitly dishonest: Omega claims either to be sufficiently omniscient to predict my actions, or insufficiently omniscient to predict the result of a 'fair' coin, except that the 'fair' coin is explicitly predetermined to always give the same result . . . except . . .

What’s the point of us­ing ra­tio­nal­ism to think things through log­i­cally if you keep plac­ing your­self into illog­i­cal philo­soph­i­cal wor­lds to test the logic?

• Particularly in this case, where Omega is being explicitly dishonest: Omega claims either to be sufficiently omniscient to predict my actions, or insufficiently omniscient to predict the result of a 'fair' coin, except that the 'fair' coin is explicitly predetermined to always give the same result

The coin is not predetermined, and it doesn't matter if Omega has hand-selected every result of the coin toss, as long as we don't have any reason to slide the probability of the result in either direction.

• Could be a quan­tum coin, which is un­pre­dictable un­der cur­rent laws of physics. Any­way, this stuff ac­tu­ally does have ap­pli­ca­tions in de­ci­sion the­ory. Quib­bling over the prac­ti­cal im­ple­men­ta­tions of the thought ex­per­i­ment is not ac­tu­ally use­ful to you or any­body else.

• Could be a quan­tum coin, which is un­pre­dictable un­der cur­rent laws of physics.

More pre­cisely it is ex­actly pre­dictable but for most prac­ti­cal pur­poses can be treated as equiv­a­lent to an un­pre­dictable coin.

• By 'unpredictable' I mean 'under current formalisms of physics it is not possible for us to accumulate enough information to predict it'.

• By ‘un­pre­dictable’ I mean ‘un­der cur­rent for­mal­isms of physics it is not pos­si­ble for us to ac­cu­mu­late enough in­for­ma­tion to pre­dict it’.

By ‘more pre­cisely’ I mean… no. The way you have phrased it makes your state­ment false.

You can pre­dict what the fu­ture out­come of a quan­tum coin will be (along the lines of branches with heads and tails their re­spec­tive am­pli­tudes). A re­lated pre­dic­tion you can­not make—when the quan­tum event has already oc­curred but you have not yet ob­served it you can­not pre­dict what your ob­ser­va­tion will be (now that ‘your’ refers only to the ‘you’ in the spe­cific branch).

Again, for prac­ti­cal pur­poses—for most peo­ple’s way of valu­ing most fu­ture out­comes—the fu­ture coin can be treated as though it is an un­pre­dictable coin.

• I was using 'you' and 'us' in the colloquial sense of the subjective experiences of a specific, arbitrary continuity chosen at random from the set of Everett branches in the hypothetical branch of the world tree that this counterfactual occurs in.

Now, I CAN start list­ing my pre­cise defi­ni­tions for ev­ery po­ten­tially am­bigu­ous term I use, or we could sim­ply agree not to pick im­prob­a­ble and in­con­sis­tent in­ter­pre­ta­tions of the other’s words. Frankly, I’d much pre­fer the lat­ter, as I can­not abide pedants.

EDIT: Or you could down­vote all my posts. That’s cool too.

• Since the dis­tinc­tion is of de­ci­sion the­o­ret­i­cal rele­vance and the source of much con­fu­sion I choose to clar­ify in­cor­rect us­ages of ‘un­pre­dictable’ in this par­tic­u­lar en­vi­ron­ment. By phras­ing it as ‘more pre­cisely’ I leave plenty of scope for the origi­nal speaker to be as­sumed to be just speak­ing loosely.

Un­for­tu­nately you chose to for­tify and defend an in­cor­rect po­si­tion in­stead of al­low­ing the ad­di­tional de­tail. Now you have given a very nice defi­ni­tion of ‘you’ but even with that defi­ni­tion both of your claims are just as in­cor­rect as when they started. Fix­ing ‘you’ misses the point.

You are probably too entrenched in your position to work with, but for anyone else who wants to talk about 'unpredictable' quantum coins, qualifiers like "for most intents and purposes" or "effectively" are awesome!

• By read­ing the quan­tum coin flip, you definitely en­tan­gle your­self with it, and there’s no way you’re go­ing to stay co­her­ent.

As a hard-core Everettian, I find the origi­nal us­age and the fol­lowup to­tally un­ob­jec­tion­able in prin­ci­ple. Your clar­ifi­ca­tion was good ex­cept for the part where it said Ati’s state­ment was wrong. There ex­ists a read­ing of the terms which leaves those wrong, yes. So don’t use that one.

• EDIT: Or you could down­vote all my posts. That’s cool too.

It should be noted that 'all my posts' does not refer to karma-assassination here. Rather, three comments here were downvoted. This is correct (and in accord with my downvoting policy).

• And I perceived you as being needlessly pedantic and choosing implausible interpretations of my words so that you could correct me. You'll note that your comment karma stands. I am, in fact, aware of quantum mechanics, and you are, of course, entirely correct. Coins behave in precisely deterministic ways, even if they rely on, say, radioactive decay. The causality just occurs in many Everett branches. That said, there is no way that before 'you' 'flip the coin' you can make a prediction about its subjective future state and have more than half of your future selves be right. If that's not 'unpredictable' by the word's colloquial definition, then I'm not sure the word has any meaning.

You will notice that when I said the coin is unpredictable, I did not claim, or even imply, that the world is undeterministic, or that quantum mechanics is wrong. If I had said such a thing, you would have had a right to correct me. As it is, you took the opportunity to jump on my phrasing to correct me of a misconception that I did not, in fact, possess. That is being pedantic, it is pointless, and above all it is annoying. I apologize for rudeness, but trying to catch others out on their phrasing is a shocking waste of intellect and time.

EDIT: Again, I can to­tally dis­card ev­ery word that’s en­trenched in good, old-fash­ioned sin­gle-uni­verse con­no­ta­tions, and spell out all the fas­ci­nat­ing mul­ti­verse im­pli­ca­tions of ev­ery­thing I say, if that will make you happy—but it will make my posts about five times longer, and it will make it a good deal more difficult to figure out what the hell I’m say­ing, which rather defeats the pur­pose of us­ing lan­guage.

• I’ll note that I re­ject your ‘im­plau­si­ble’ claim, ob­ject to all in­sinu­a­tions re­gard­ing mo­tive, stand by my pre­vi­ous state­ments and will main­tain my policy of mak­ing mild clar­ifi­ca­tions when the sub­ject hap­pens to come up.

There seems to be lit­tle else to be said here.

• As you like. Though I do hope you apply your strident policy of technical correctness in your home life, for consistency's sake.

For example: someone (clearly wrong) like me would merely say, in our archaic and hopelessly monocosmological phrasing, 'I am going to lunch.' This is clearly nonsense. You will, over the set of multiverse branches, do a great many things, many of them having nothing to do with food, or survival. The concepts of 'I' and 'lunch' are not even particularly well defined.

In contrast, someone held to your standard of correctness would have to say 'The computation function implemented in the cluster of mass from which these encoded pressure waves are emanating will execute a series of actions for which they predict that in the majority of future Everett branches of this fork of the world tree, the aforementioned cluster of mass will accumulate new amplitude and potential energy through the process of digestion within the next hour and fifteen minutes.'

Clearly this is more effi­cient and less con­fus­ing to the reader.

• I con­sider the se­lec­tion of analo­gies made in the par­ent to con­sti­tute a mis­rep­re­sen­ta­tion (and fun­da­men­tal mi­s­un­der­stand­ing) of the pre­ced­ing con­ver­sa­tion.

• I’m very torn on this prob­lem. Every time I think I’ve got it figured out and start typ­ing out my rea­sons why, I change my mind, and throw away my 6+ para­graph ex­pla­na­tion and start over, ar­gu­ing the op­po­site case, only to change my mind again.

I think the prob­lem has to do with strong con­flicts be­tween my ra­tio­nal ar­gu­ments and my in­tu­ition. This prob­lem is a much more in­ter­est­ing koan for me than one hand clap­ping, or tree in the for­est.

• I think my an­swer would be “I would have agreed, had you asked me when the coin chances were .5 and .5. Now that they’re 1 and 0, I have no rea­son to agree.”

Seriously, why stick with an agreement you never made? Besides, if Omega can predict me this well, he knows how the coin will come up and how I'll react. Why then should I try to act otherwise? Somehow, I think I just don't get it.

• Be­sides, if Omega can pre­dict me this well he knows how the coin will come up and how I’ll re­act.

It doesn’t mat­ter too much but we can as­sume the Omega doesn’t know how the coin will come up.

Why then should I try to act otherwise?

That would be rather fu­tile, wouldn’t it? Of course, de­cid­ing to give Omega \$100 now isn’t try­ing to change how you would re­act, it is just choos­ing your re­ac­tion.

• So, is it rea­son­able to pre-com­mit to giv­ing the \$100 in the coun­ter­fac­tual mug­ging game? (Pre-com­mit­ment is one solu­tion to the New­comb prob­lem.) On first glance, it seems that a pre-com­mit­ment will work.

But now con­sider “counter-coun­ter­fac­tual mug­ging”. In this game, Omega meets me and scans my brain. If it finds that I’ve pre-com­mit­ted to hand­ing over the \$s in the coun­ter­fac­tual mug­ging game, then it emp­ties my bank ac­count. If I haven’t pre-com­mit­ted to do­ing any­thing in coun­ter­fac­tual mug­ging, then it re­wards me with \$1 mil­lion. Damn.

So what should I pre-com­mit to do­ing, if any­thing? Should I some­how try to as­sess my like­li­hood of meet­ing Omega (in some form or other) and guess what sort of par­lour game it is likely to play with me, and for what stakes? Has any­one got any idea how to do that as­sess­ment, with­out un­duly priv­ileg­ing the games that we hap­pen to have thought of so far? This way mad­ness lies I fear...

The in­ter­est with these Omega games is that we don’t meet ac­tual Omegas, but do meet each other, and the effects are some­times rather similar. We do like the thought of friends who’ll give us \$1000 if we re­ally need it (say in a once-in-a-life­time emer­gency, with no like­li­hood of re­ciproc­ity) be­cause they be­lieve we’d do the same for them if they re­ally needed it. We don’t want to call that be­havi­our ir­ra­tional. Isn’t that the real point here?

• Should I somehow try to assess my likelihood of meeting Omega (in some form or other) and guess what sort of parlour game it is likely to play with me, and for what stakes? Has anyone got any idea how to do that assessment, without unduly privileging the games that we happen to have thought of so far? This way madness lies, I fear...

Not exactly madness, but Pascal’s wager. If you haven’t seen any evidence of Omega existing by now, nor any theory of how predictions such as his could be possible, and word of his parlour-game preferences has not reached you, then chances are that he is so unlikely in this universe that he belongs in the same category as Pascal’s wager.

• There is one nice thing about the real-world friend case, which is that you actually might be in the reverse situation later. So it’s not just a counterfactual you’re considering; it’s a real future possibility.

Take that away and it’s more like Omega; but then it’s not the real-world problem anymore!

• This problem seems conceptually identical to Kavka’s toxin puzzle; we have merely replaced intending to drink the toxin/pay \$100 with being the sort of person whom Omega would predict would do it.

• Since, as has been pointed out, one needn’t be a perfect predictor for the game to work, I think I’ll actually try this on some of my friends.

• Thanks for reminding me of Kavka’s puzzle. I think that puzzle is unnecessarily mental in its formulation; for example, you have to “intend”. It’s less confusing when you work with more technical concepts of decision-making, evidence, preference, and precommitment.

I can’t imagine how you are going to perform this on your friends...

• The main problem, I think, is getting them to believe that I’m a reliable predictor (i.e. that I predict as well as I claim I do).

Actually, I don’t know whether doing this will show anything relevant to the problem under consideration. But I think it will show something. It has in fact already shown that I believe that 59% of them would agree to give me the money, either because they are sufficiently similar to Eliezer, or because they enjoy random acts of silliness (and the amount of money involved will be pretty trivial).

• Did you do it? And if so, did you give away money to the friends you predicted would have given you money, if the coin came up that way?

How much money did you lose?

• No, I never got around to actually doing it, I’m afraid.

• Whether I give Omega the \$100 depends entirely on whether there will be multiple iterations of coin-flipping. If there will be multiple iterations, giving Omega the \$100 is indeed winning, just like buying a financial instrument that increases in value is winning.

• No, there are no iterations. Omega flies away from your galaxy right after finishing the transaction. (Added to P.S.)

• In that case, I’d hate to disappoint Omega, but there’s no incentive for me to give up my \$100. A utility of 0 is better than a negative utility, and if the coin-flip is deterministic, I won’t be serving the interests of my alternate-universe self. Why would I choose otherwise?

• Would you prefer to choose otherwise if you considered the deal before the actual coin toss, and arrange a precommitment to that end?

• Yes; then, following the utility function you specified, I would gladly risk \$100 for an even chance at \$10000. Since Omega’s omniscient, I’d be honest about it, too, and cough up the money if I lost.

• Yes; then, following the utility function you specified, I would gladly risk \$100 for an even chance at \$10000. Since Omega’s omniscient, I’d be honest about it, too, and cough up the money if I lost.

If it’s rational to do this when Omega asks you in advance, isn’t it also rational to make such a commitment right now? Whether you make the commitment in response to Omega’s notification, or on a whim when considering the thought experiment in response to a blog post, makes no difference to the payoff. If you now commit to “if this exact situation comes up, I will commit to paying the \$100 if I lose the coin flip”, and p(x) is the probability of this situation occurring, you will achieve a net gain of \$4950*p(x) over a non-committer (a very small number, admittedly, given that p(x) is tiny, but for the sake of the thought experiment all that matters is that it’s positive).

Given that someone who makes such a precommitment comes out ahead of someone who doesn’t—shouldn’t you make such a commitment right now? Extend this and make a precommitment to always make the decision that would maximise your average returns in all such Newcomblike situations, and you’ll come off even better on average.
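A minimal sketch of the arithmetic behind this argument; the probability `p_x` of ever actually facing the situation is a hypothetical parameter, not something specified in the thread:

```python
# Expected-gain sketch for the precommitment argument above.
# `p_x`, the chance of ever facing this situation, is hypothetical;
# the $100 cost and $10000 reward are the stakes from the post.

def expected_gain_of_committing(p_x, reward=10_000, cost=100):
    """Net expected gain of a committer over a non-committer."""
    per_encounter = 0.5 * reward - 0.5 * cost  # fair coin: 4950
    return p_x * per_encounter

print(expected_gain_of_committing(1.0))   # 4950.0
print(expected_gain_of_committing(1e-6))  # tiny, but still positive
```

However small p(x), the committer's edge stays positive, which is all the argument above needs.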

• No, I will not precommit to giving up my \$100 for cases where Omega demands the money after the coin flip has occurred. There is no incentive to precommit in those cases, because the outcome is already against me and there’s no chance that it “would” go in my favour.

• At that point, it’s no longer a precommitment—it’s how you face the consequences of your decision whether to precommit or not.
Note that the hypothetical loss case presented in the post is not in fact the decision point—that point is when you first consider the matter, which is exactly what you are doing right now. If you would really change your answer after considering the matter, then, having now done so, have you changed it?

If you want to obtain the advantage of someone who makes such a precommitment (and sticks to it), you must be someone who would do so. If you are not such a person (and given your answer, you are not), it is advantageous to change yourself into such a person, by making that precommitment (or better, a generalised “I will always take the path that would have maximised returns across the distribution of counterfactual outcomes in Newcomblike situations”) immediately.

Such commitments change the dynamics of many such thought experiments, but usually they require that the commitment be known to the other person, and enforced in some way (the way to win at Chicken is to throw your steering wheel out the window). Here, though, Omega’s knowledge of us removes the need for an explicit announcement, and it is in our own interests to be self-enforcing (or rather, we wish to reliably enforce the decision on our future selves), or we will not receive the benefit. For that reason, a silent decision is as effective as having a conversation with Omega and telling it how we decide.

Explicitly announcing our decision thus only has an effect insofar as it keeps your future self honest. E.g. if you know you wouldn’t keep to a decision idly arrived at, but value your word such that you would stick to doing what you said you would despite its irrationality in that case, then it is currently in your interest to give your word. It’s just as much in your interest to give your word now, though—make some public promise that you would keep. Alternatively, if you have sufficient mechanisms in your mind to commit to such future “irrational” behaviour without a formal promise, a promise becomes unnecessary.

• Maybe in thought-experiment-world. But if there’s a significant chance that you’ll misidentify a con man as Omega, then this tendency makes you lose on average.

• Sure—all bets are off if you aren’t absolutely sure Omega is trustworthy.

I think this is a large part of the reason why the intuitive answer we jump to is rejection. Being told we believe a being making such extraordinary claims is different from actually believing them (especially when the claims may have unpleasant implications for our beliefs about ourselves), so we have a tendency to consider the problem with the implicit doubt we have for everyday interactions lurking in our minds.

• Brianm understands reflective consistency!

• you will achieve a net gain of \$4950*p(x) over a non-committer (a very small number, admittedly, given that p(x) is tiny, but for the sake of the thought experiment all that matters is that it’s positive).

Given that someone who makes such a precommitment comes out ahead of someone who doesn’t—shouldn’t you make such a commitment right now?

Right now, yes, I should precommit to paying the \$100 in all such situations, since the expected value is p(x)*\$4950.

If Omega just walked up to me and asked for \$100, and I had never considered this before, the value of this commitment is now p(x)*\$4950 - \$100, so I would not pay unless I thought there was more than a 2% chance this would happen again.
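Where the 2% figure above comes from, as a quick check: paying \$100 after an unanticipated loss is only worthwhile if the expected value of the commitment, p(x)*\$4950, exceeds the \$100 just paid.

```python
# Break-even probability for paying $100 after an unanticipated loss:
# worthwhile only if p(x) * 4950 > 100.
break_even = 100 / 4950
print(f"{break_even:.4f}")  # 0.0202, i.e. just over 2%
```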

• So after you observe the coin toss, and find yourself in a position where you’ve lost, you’ll give Omega your money? Why would you? It won’t ever reciprocate, and it won’t enforce the deal; its only enforcement is the \$10000 that you know got away anyway, because you didn’t win the coin toss.

• Yes, I’ll give Omega the money, because if I’m going to refuse to give Omega the money after the coin toss occurs, Omega knows ahead of time on account of its omniscience. If I had won, Omega could look at me and say, “You get no money, because I know you wouldn’t have really given me the \$100 if you’d lost. Your pre-commitment wasn’t genuine.”

• My answer to this is that integrity is a virtue, and breaking one’s promises reduces one’s integrity. And being a person with integrity is vital to the good life.

• Then I repeat the question with MBlume’s corrections, to make the problem less convenient. Would you still follow up and murder 15 people, to preserve your personal integrity? It’s not a question of values, it’s a question of decision theory.

• This thread assumes a precommitment. I would not precommit to murder.

It’s not a question of values, it’s a question of decision theory.

I’m not sure what your point is here.

• The point is that the distinction between \$0.02 and a trillion lives is irrelevant to the discussion, which is about the structure of the preference order assigned to actions, whatever your values are. If you are determined to pay off Omega, the reason for that must be in your decision algorithm, not in an exquisite balance between \$100, personal integrity, and murder. If you are willing to carry the deal through (note that there isn’t even any deal, only your premeditated decision), the reason for that must lie elsewhere, not in the value of personal integrity.

• To make that claim, you first need to establish that he would accept a bet of 15 lives versus some reward in the first place, which I think is what he is claiming he would not do. There’s a difference between making a bet and reneging, and not accepting the bet. If you would not commit murder to save a million lives in the first place, then the refusal is for a different reason than just the fact that the stakes are raised.

• Integrity is a virtue, not a value.

The values aren’t necessarily relevant after I’ve precommitted to the bet, but they’re absolutely relevant to whether I’d precommit to the bet. If murder is one of the options, count me out.

My reason for carrying the deal through is (partially) that it promotes virtue. I do not see any arguments that it cannot be so.

• My reason for carrying the deal through is (partially) that it promotes virtue. I do not see any arguments that it cannot be so.

Too vague.

• What’s vague? Let me try to spell this out in excruciating detail:

Making good on one’s commitments promotes the virtue of integrity.
Integrity is constitutive of good character.
One cannot consistently act as a person of good character without having it.
To act ethically is to act as a person of good character does.
Ethics specifies what one has most reason to do or want.

So, if you ask me what I have most reason to do in a circumstance where I’ve made a commitment, ceteris paribus, I’ll respond that I’ll make good on my commitments.

• There is a caveat: if you are an agent who is constructed to live in the world where Omega’s coin came out tails, so that the state space for which your utility function and prior are defined doesn’t contain the areas corresponding to the coin coming up heads, you don’t need to give up \$100. You only give up \$100 as a tribute to the part of your morality specified on the counterfactual area of the state space.

• I would one-box on Newcomb, and I believe I would give the \$100 here as well (assuming I believed Omega).

With Newcomb, if I want to win, my optimal strategy is to mimic as closely as possible the type of person Omega would predict would take one box. However, I have no way of knowing what would fool Omega: indeed, if it is a sufficiently good predictor there may be no such way. Clearly, then, the way to be “as close as possible” to a one-boxer is to be a one-boxer. A person seeking to optimise their returns will be a person who wants their response to such stimuli to be “take one box”. I do want to win, so I do want my response to be that, so it is: I’m capable of locking in my decisions (making promises) in ways that forgo short-term gain for longer-term benefit.

The situation here is the same, even though I have already lost. It is beneficial for me to be that type of person in general (obscured by the fact that the situation is so unlikely to occur). Were I not the type of person who made the decision to pay out on a loss, I would be the type of person that lost \$10000 in an equally unlikely circumstance. Locking that response in now as a general response to such occurrences means I’m more likely to benefit than those who don’t.

• I would one-box on Newcomb, and I believe I would give the \$100 here as well (assuming I believed Omega).

With Newcomb, if I want to win, my optimal strategy is to mimic as closely as possible the type of person Omega would predict would take one box.

Well, the other way to look at it is “What action leads me to win?” In the Newcomb problem, one-boxing wins, so you and I are in agreement there.

But in this problem, not-giving-away-\$100 wins. Sure, I want to be the “type of person who one-boxes”, but why do I want to be that person? Because I want to win. Being that type of person in this problem actually makes you lose.

The problem states that this is a one-shot bet, and that after you do or don’t give Omega the \$100, he flies away from this galaxy and will never interact with you again. So why give him the \$100? It won’t make you win in the long term.

• Yes, but Omega isn’t really here yet, and you, Nebu, deciding right now that you will give him \$100 does make you win, since it gives you a shot at \$10000.

• Right, so if a normal person offered me the bet (and assuming I could somehow know it was a fair coin), then yes, I would accept the bet.

If it was Omega instead of a normal person offering the bet, we run into some problems...

But if Omega doesn’t actually offer the bet, and just does what is described by Vladimir Nesov, then I wouldn’t give him the \$100. [1]

In other words, I do different things in different situations.

Edit 1: (Or maybe I would. I haven’t figured it out yet.)

• The problem only asks what you would do in the failure case, and I think this obscures the fact that the relevant decision point is right now. If you would refuse to pay, that means that you are the type of person who would not have won had the coin flip turned out differently, either because you haven’t considered the matter (and luckily turn out to be in the situation where your choice worked out better), or because you would renege on such a commitment when it occurred in reality.

However, at this point the coin flip hasn’t been made. The globally optimal person to be right now is one that does precommit and doesn’t renege. This person will come out behind in the hypothetical case, as it requires we lock ourselves into the bad choice for that situation, but by being a person who would act “irrationally” at that point, they will outperform a non-committer/reneger on average.

• What if there is no “on average”, if the choice to give away the \$100 is the only choice you are given in your life? There is no value in being the kind of person who globally optimizes because of the expectation to win on average. You only make this choice because it’s what you are, not because you expect the reality on average to be the way you want it to be.

• From my perspective now, I expect the reality to be the winning case 50% of the time because we are told this as part of the question: Omega is trustworthy and said it tossed a fair coin. In the possible futures where such an event could happen, 50% of the time my strategy would have paid off, to a greater degree than it would lose the other 50% of the time. If Omega did not toss a fair coin, then the situation is different, and my choice would be too.

There is no value in being the kind of person who globally optimizes because of the expectation to win on average.

There is no value in being such a person if they happen to lose, but that’s like saying there’s no value in being a person who avoids bets that lose on average, by pointing only at the one-in-several-million time they would have won the lottery. On average they’ll come out ahead, just not in the specific situation that was described.

• I’m way late to this party, but aren’t we ignoring something obvious? Such as imperfect knowledge of how likely Omega is to be right about its prediction of what you would do? If you live in a universe where Omega is a known fact and nobody thinks themselves insane when they meet him, well, then it’s the degenerate case where you are 100% certain that Omega predicts correctly. If you lived in such a universe, presumably you would know it, and everyone in that world would pre-commit to giving Omega \$100, just like in ours pizza-deliverers pre-commit to not carrying more than a small amount of cash with them.

There may be other universes where Omega is known to be right and to do what he says he will do 80% of the time. Or ones where there are rumors of an omniscient Omega that always makes good on his word, but you assign them 80% probability of being true. And so on.

Given the \$5000 expected payoff and the \$50 expected cost of pre-committing, you should do it if the probability of Omega being both right and trustworthy is greater than or equal to 0.01.
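The 0.01 threshold works out as follows: conditional on Omega being right and trustworthy, precommitting costs \$50 in expectation (half the time you pay \$100) and yields \$5000 in expectation (half the time you get \$10000).

```python
# Break-even probability that Omega is both right and trustworthy.
expected_cost = 0.5 * 100       # 50.0: half the time you pay $100
expected_payoff = 0.5 * 10_000  # 5000.0: half the time you get $10000
threshold = expected_cost / expected_payoff
print(threshold)  # 0.01
```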

But if you, knowing what you know about THIS universe, suddenly found yourself in the presence of some alien entity making the claim Omega makes in the above scenario, what kind of evidence would you demand for this claim before assigning it a probability greater than 0.01?

It occurs to me that the dude in the robe and mask pretending to be Omega could up the ante to \$1000000, and if I wouldn’t believe him more than 0.01% given a \$10000 payoff, it probably wouldn’t matter to me what he offered as a payoff, because if he has enough delusions and/or chutzpah to make this claim in this universe, there’s no reason for him to balk at adding on a few extra decimal places. I’m not sure how to formalize that mathematically, though.

• Under my syntacticist cosmology, which is a kind of Tegmarkian/Almondian crossover (with measure flowing along the seemingly ‘backward’ causal relations), the answer becomes trivially “yes, give Omega the \$100”, because counterfactual-me exists. In fact, since this-Omega simulates counterfactual-me and counterfactual-Omega simulates this-me, the (backwards) flow of measure ensures that the subjective probabilities of finding myself in real-me and counterfactual-me must be fairly close together; consequently this remains my decision even in the Almondian variety. The purer and more elegant version of syntacticism doesn’t place a measure on Tegmark-space at all, but that makes it difficult to explain the regularity of our universe—without a probability distribution on Tegmark-space, you can’t even mathematically approach anthropics. However, in that version counterfactual-me ‘exists to the same extent that I do’, and so again the answer is trivially “give Omega the \$100”.

Counterfactual problems can be solved in general by taking one’s utilitarian summation over all of syntax-space rather than merely one’s own Universe/Hubble bubble/Everett branch. The outstanding problem is whether syntax-space should have a measure and, if so, what its nature is (and whether this measure can be computed).

• since this-Omega simulates counterfactual-me and counterfactual-Omega simulates this-me

Does syntacticism work if you know Omega likes simulating poor-you, and each simulated rich-you is counterbalanced by many simulated poor-yous? Or only in special cases like the one you mentioned?

• Yes, it still works, because of the way the subjective-probability flow on Tegmark-space works. (Think of it like PageRank, and remember that the s.p. flows from the simulated to the simulator.)

It is technically possible that the differences between how much the two Universes simulate each other, combined with differences in how much they are simulated by other Universes, can cause the coupling between the two not to be strong enough to override some other couplings, with the result that the s.p. expectation of “giving Omega the \$100” is negative. However, under my current state of logical uncertainty about the couplings, that outcome is rather unlikely, so taking a further expectation over my guesses of how likely various couplings are, the deal is still a good one.

Actually, in my own thinking I no longer call it “Tegmark-space”; instead I call it the “Causality Manifold”, and I’m working on finding a formal mathematical expression of how causal-loop unfolding can work in a continuous context. Also, I’m no longer worried about the “purer and more elegant version” of syntacticism, because today I worked out how to explain the subjective favouring of regular universes (over irregular ones, which are much more numerous). One thing that does worry me, though, is that every possible Causality Manifold is also an element of the CM, which means either stupidly large cardinal axioms or some kind of variant of the “No Gödels” argument from Syntacticism (the article).

• If I found myself in this kind of scenario, it would imply that I was very wrong about how I reason about anthropics in an ensemble universe (as with Pascal’s mugging, or any sort of situation where an agent has enough computing power to take control of so much of my measure that I find myself in a contrived philosophical experiment). In fact, I would be so surprised to find myself in such a situation that I would question the reasoning that led me to think one-boxing was the best course of action in the first place, because somewhere along the way my model became very confused. (I’d still one-box, but it would seem less obvious after taking into account the huge amount of previously unexpected structural uncertainty my model of the world suddenly has to deal with.)

• If I found myself in this kind of scenario, it would imply that I was very wrong about how I reason about anthropics in an ensemble universe (as with Pascal’s mugging, or any sort of situation where an agent has enough computing power to take control of so much of my measure that I find myself in a contrived philosophical experiment).

I see some reasons for this perspective, but I’m not sure.

On the one hand, I don’t know much about the distribution of agent preferences in an ensemble universe. But there may be enough long towers of nested simulations of agents like us to compensate for this.

• This is just the one-shot Prisoner’s Dilemma. You being split into two different possible worlds is just like the two prisoners being taken into two different cells.

Therefore, you should give Omega \$100 if and only if you would cooperate in the one-shot PD.

• I don’t see the difficulty. No, you don’t win by giving Omega \$100. Yes, it would have been a winning bet before the flip if, as you specify, the coin is fair. Your PS, in which you say to “assume that in the overwhelming measure of the MWI worlds it gives the same outcome”, contradicts the assertion that the coin is fair, and so you have asked us for an answer to an incoherent question.

• Your PS, in which you say to “assume that in the overwhelming measure of the MWI worlds it gives the same outcome”, contradicts the assertion that the coin is fair, and so you have asked us for an answer to an incoherent question.

This doesn’t sound right to me. The coin doesn’t need to be quantum-mechanical to be fair. Here is a fair but perfectly deterministic coin: the 1098374928th digit of pi, mod 2. I have no idea whether it’s a zero or a one. I could figure it out if you gave me enough time, as could Omega. If both of us agree not to take the time to figure it out in advance, we can use it as a fair coin. But in all Everett branches, it comes out the same way.

• I don’t see the difficulty. No, you don’t win by giving Omega \$100. Yes, it would have been a winning bet before the flip if, as you specify, the coin is fair.

The difficulty comes from projecting the ideal decision theory onto people. Look how many people are ready to pay up the \$100, so it must be a real difficulty.

The fairness of a coin is a property of your mind, not of the coin itself. The coin can be fair in a deterministic world, the same way you can have free will in a deterministic world.

• Better to say that your state of knowledge about the coin, prior to Omega appearing, is that it has probability 1/2 of being heads and 1/2 of being tails. The MWI clause is supposed to make the problem harder by preventing you from assigning utility (once Omega appears) to your ‘other selves’ in other Everett branches. The problem is then just: “how, knowing that Omega might appear, but not knowing what the coin flip will be, can I maximise my utility?” If Omega appears in front of you right now, then that’s a different question.

• My state of knowledge about the coin prior to Omega appearing is that I don’t even know that the coin is going to be flipped, actually.

• Normally, you can assume your thought processes are uncorrelated with what’s out there. Newcomb-like problems, however, do have the state of the outside universe correlated with your actual thoughts, and this is what throws people off.

If you are unsure whether the state of the universe is X or Y (say with p = 1/2 for simplicity), and we can choose either option A or B, we can calculate the expected utility of choosing A vs. B by taking 1/2·u(A,X) + 1/2·u(A,Y) and comparing it to 1/2·u(B,X) + 1/2·u(B,Y).

In a Newcomb-like problem, where the state of the experiment actually depends on your choice, the expected-utility comparison should instead be ~1·u(A,X) + ~0·u(A,Y) vs. ~0·u(B,X) + ~1·u(B,Y).

In this case, it boils down to “Is u(A,X) > u(B,Y)?”.
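The two comparisons above can be sketched numerically; the utility values below are invented purely for illustration, and `corr` is the assumed degree to which the state tracks the choice:

```python
# Toy expected-utility comparison. corr = 0.5 is the ordinary
# uncorrelated case; corr near 1 is the Newcomb-like case where
# choosing A makes state X (and B makes Y) almost certain.
# The utility numbers are made up for illustration only.

def eu(choice, corr, u):
    """Expected utility when the state tracks the choice with prob corr."""
    matched, unmatched = ("X", "Y") if choice == "A" else ("Y", "X")
    return corr * u[(choice, matched)] + (1 - corr) * u[(choice, unmatched)]

u = {("A", "X"): 10, ("A", "Y"): 0, ("B", "X"): 15, ("B", "Y"): 1}

print(eu("A", 0.5, u), eu("B", 0.5, u))    # 5.0 8.0 -> B looks better
print(eu("A", 0.99, u), eu("B", 0.99, u))  # reduces to u(A,X) vs u(B,Y)
```

With the state uncorrelated, B dominates; once the state tracks the choice, the comparison collapses to u(A,X) against u(B,Y), and A wins.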

It is not enough for Omega to have a decent record of getting it right, since you could probably do pretty well by reading people’s comments and guessing based on those.

If Omega made its prediction solely based on a comment you made on LessWrong, you should expect that if you choose A the universe will be in the same state as if you choose B; knowing your ultimate decision doesn’t tell you anything, since the only relevant evidence is what you said a month ago.

If, however, Omega actually simulates your thought process in sufficient detail to know for sure which choice you made, knowing that you ultimately decide to pick A is strong evidence that Omega has set up X, and if you choose B, you had better expect to see Y.

The reason the answer changes is that the state of the box actually does depend on the thoughts themselves; it’s just that you thought the same thoughts when Omega was simulating you before filling the boxes/flipping the coin.

If you aren’t sure whether you’re just Omega’s simulation, you had better one-box/pay Omega. If we’re talking about a wannabe Omega that just makes decent predictions based off comments, then you defect (though if you actually expect a situation like this to come up, you argue that you won’t).

• Omega’s actions depend only on your decision (action), or in this case your counterfactual decision, not on your thoughts or the algorithm you use to reach the decision. The action of course depends on your thoughts, but that’s the usual case. You may move several steps back, seeking the ultimate cause, but that’s pretty futile.

• I realise I’m coming to this a little late, but I’m a little unclear about this case. This is my understanding:

When you ask me if I should give Omega the \$100, I commit to “yes”, because I am the agent who might meet Omega one day, and since I am in fact at the time before the coin has been flipped right now, by the usual expected-value calculations the rational choice is to so commit.

So does that mean that if I commit now (e.g. by giving myself a monetary incentive to give the \$100), and my friend John meets Omega tomorrow, who has flipped the coin and it has landed tails, I should tell him that the rational choice is not to give the \$100, since he is deciding after the coin toss?

Would anyone be so kind as to tell me if that seems right?

• Well, this comes out different ways under different interpretations. If there is a chance that I am being simulated, that is, that this is part of his determining my choice, then I give him \$100. If the coin is quantum, that is, there will exist other mes getting the money, I give him \$100. If there is a chance that I will encounter similar situations again, I give him \$100. If I were informed of the deal beforehand, I give him \$100. Given that I am not simulated, given that the coin is deterministic, and given that I will never again encounter Omega, I don’t think I give him \$100. Seeing as I can treat this entirely in isolation due to these conditions, I have the choice between -\$100 and \$0, of which two options the second is better. Now, this runs into some problems. If I were informed of it beforehand, I should have precommitted. Seeing as my choices given all the information shouldn’t change, this presents a difficulty. However, due to the uniqueness of this deal, there really does seem to be no benefit to any mes from giving him the money, and so it is purely a loss.

• No pre­com­mitt­ment, no deal.

• Sup­pose Omega gives you the same choice, but says that if a head had come up, it would have kil­led you, but only if you {would have re­fused|will re­fuse} to give it your lousy \$100 {if the coin had come up heads|given that the coin has come up heads}. Not sure what the cor­rect tense is, here.

I be­lieve that I would keep the \$100 in your prob­lem, but give it up in mine.

ETA: Can you clar­ify your postscript? Pre­sum­ably you don’t want the knowl­edge about the dis­tri­bu­tion of coin-flip states across fu­ture Everett branches to be available for the pur­poses of the ex­pected util­ity calcu­la­tion?

• I’m trying to set up a sufficiently inconvenient possible world by introducing additional assumptions. The one about MWI stops the excuse that there are other real yous in the other MWI branches who do receive the \$10000. Not allowed.

How do you pick the threshold, decide that [\$10000] < [decision threshold] < [your life]?

• You’ve actually made it an easier problem for me, though, because I regard my alternate selves as other people.

How do you pick the threshold, decide that [\$10000] < [decision threshold] < [your life]?

If it were possible for me to make a deal with my alternate self by which I get a few thousand dollars, I would obviously surrender my \$100. As it isn’t possible, I see little reason to give someone otherwise destined to be forever causally isolated from me \$10000 at the cost of \$100. I wouldn’t keep \$100 if it meant he lost \$10000, either. I probably would keep the \$100 if they lost less than \$100. If my alternate self stood to gain, say, a million dollars, but nothing if I kept my \$100, then I probably would give it up. But that would be as a whimsy, something to think about and feel good. And the benefit to me of that whimsy would have to be worth more than \$100.

The pattern behind my choices is that the pain experienced by my alternate self (who, recall, I consider a different person) in any of these cases is never more than \$100. I think this is the most we can expect, on average, of other intelligent beings: that they will not inflict a large loss for a small gain. Why not steal, in that case? Because there is, in fact, no such thing as total future causal isolation.

• There is no alternate self. None at all. The alternative may be impossible according to the laws of physics. It is only present in your imperfect model of the world. You can’t trade with a fiction, and you shouldn’t empathize with a fiction. What you decide, you decide in this, our real world. You decide that it is right to make a sacrifice, according to your preferences, which live only in your model of the world but speak about the reality.

• I think that this is a critical point, worthy of a blog post of its own. Impossible possible worlds are a confusion.
The inclination to trade with fiction seems like a serious problem within this community.

• I’ve misunderstood you to an extent, then.

My preferences don’t involve me sacrificing unless someone can get hurt. It doesn’t matter whether that person exists in another Everett branch, within Omega, or in another part of the Tegmark ensemble, but there must be a someone. I’ll play symmetrist with everyone else (which is, in a nutshell, what I said in my comment above) but not with myself. You seem to want a person that is me, but minus the “existence” property. I don’t think that is a coherent concept.

OK, suppose that Omega came along right now and said to me, “I have determined that if you could be persuaded that your actions would have no consequence, and then given the problem you are currently discussing, you would in every case keep \$100. Therefore I will torture you endlessly.” I would not see this as proof of my irrationality (in the sense of hopelessly failing to achieve my preferences). I don’t think that such a sequence of events is germane to the problem as you see it, but I also don’t see how it is not germane.

• How much do you know about many worlds, anyways? My alternate self very much does exist, the technical term is possibility-cloud which will eventually diverge noticeably but which for now is just barely distinguishable from me.

There you go.

• How much do you know about many worlds, anyways?

Vladimir_Nesov!2009 knew more than enough about Many Worlds to know how to exclude it as a consideration. Vladimir_Nesov!2013 probably hasn’t forgotten.

My alternate self very much does exist, the technical term is possibility-cloud which will eventually diverge noticeably but which for now is just barely distinguishable from me.

No. It doesn’t exist. Not all uncertainty represents knowledge about quantum events which will have significant macroscopic relevance. Some represents mere ignorance. This ignorance can be about events that are close to deterministic; that means the ‘alternate selves’ have negligible measure and even less decision-theoretic relevance. Other uncertainty represents logical uncertainty. That is, where the alternate selves don’t even exist in the trivial irrelevant sense. It was just that the participant didn’t know that “2+2=4” yet.

• My alternate self very much does exist

Given that many-worlds is true, yes. Invoking it kind of defeats the purpose of the decision theory problem, though, as it is meant as a test of reflective consistency (i.e. you are supposed to assume you prefer \$100 > \$0 in this world regardless of any other worlds).

• OK, so there’s a good chance I’m just being an idiot here, but I feel like a multiple-worlds kind of interpretation serves well here. If, as you say, “the coin is deterministic, [and] in the overwhelming measure of the MWI worlds it gives the same outcome,” then I don’t believe the coin is fair. And if the coin isn’t fair, then of course I’m not giving Omega any money. If, on the other hand, the coin is fair, and so I have reason to believe that in roughly half of the worlds the coin landed on the other side and Omega posed the opposite question, then by giving Omega the \$100 I’m giving the me in those other worlds \$10000, and I’m perfectly happy to do that.

• is the decision to give up \$100 when you have no real benefit from it, only counterfactual benefit, an example of winning?

No, it’s a clear loss.

The only winning scenario is, “the coin comes down heads and you have an effective commitment to have paid if it came down tails.”

By making a binding precommitment, you effectively gamble that the coin will come down heads. If it comes down tails instead, clearly you have lost the gamble. Giving the \$100 when you didn’t even make the precommitment would just be pointlessly giving away money.
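The gamble framing can be checked with the expected-value arithmetic from the post itself; the payoffs and probabilities below are the ones from the original problem, and only the variable names are mine:

```python
# Expected value of binding yourself to pay, computed before the coin toss.
P_HEADS = 0.5
PRIZE = 10_000   # paid on heads, but only to predicted payers
COST = 100       # paid on tails

ev_precommit = P_HEADS * PRIZE + (1 - P_HEADS) * (-COST)
ev_refuse = 0.0  # no exchange of money on either branch

print(ev_precommit)  # 4950.0
print(ev_precommit > ev_refuse)  # True: the gamble is worth taking ex ante
```

Ex ante the precommitment wins; the whole dispute is about what that implies once the tails branch is the only one left.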

• I think that what really does my head in about this problem is that, although I may right now be motivated to make a commitment, because of the hope of winning the 10K, nonetheless my commitment cannot rely on that motivation, because when it comes to the crunch, that possibility has evaporated and the associated motivation is gone. I can only make an effective commitment if I have something more persistent, like the suggested \$1000 contract with a third party. Without that, I cannot trust my future self to follow through, because the reasons that I would currently like it to follow through will no longer apply.

MBlume stated that if you want to be known as the sort of person who’ll do X given Y, then when Y turns up, you’d better do X. That’s a good principle, but it too can’t apply unless, at the point of being presented with the request for \$100, you still care about being known as that sort of person; in other words, you expect a later repetition of the scenario in some form or another. This applies as well to Eliezer’s reasoning about how to design a self-modifying decision agent, which will have to make many future decisions of the same kind.

Just wanting the 10K isn’t enough to make an effective precommitment. You need some motivation that will persist in the face of no longer having the possibility of the 10K.

• It seems to me the answer becomes more obvious when you stop imagining the counterfactual you who would have won the \$10000, and start imagining the 50% of superpositions of you who are currently winning the \$10000 in their respective worlds.

Every implementation of you is you, and half of them are winning \$10000 as the other half lose \$100. Take one for the team.

• Sorry, but I’m not in the habit of taking one for the quantum superteam. And I don’t think that it really helps to solve the problem; it just means that you don’t necessarily care so much about winning any more. Not exactly the point.

Plus we are explicitly told that the coin is deterministic and comes down tails in the majority of worlds.

• Sorry, but I’m not in the habit of taking one for the quantum superteam.

If you’re not willing to “take one for the team” of superyous, I’m not sure you understand the implications of “every implementation of you is you.”

And I don’t think that it really helps to solve the problem;

It does solve the problem, though, because it’s a consistent way to formalize the decision so that on average, for things like this, you are winning.

it just means that you don’t necessarily care so much about winning any more. Not exactly the point.

I think you’re missing the point here. Winning in this case is doing the thing that on average nets you the most success for problems of this class, one single instance of it notwithstanding.

Plus we are explicitly told that the coin is deterministic and comes down tails in the majority of worlds.

And this explains why you’re missing the point. We are told no such thing. We are told it’s a fair coin and that can only mean that if you divide up worlds by their probability density, you win in half of them. This is defined.

What seems to be confusing you is that you’re told “in this particular problem, for the sake of argument, assume you’re in one of the worlds where you lose.” It states nothing about those worlds being overrepresented.

• We are told no such thing. We are told it’s a fair coin and that can only mean that if you divide up worlds by their probability density, you win in half of them. This is defined.

No, take another look:

in the overwhelming measure of the MWI worlds it gives the same outcome. You don’t care about a fraction that sees a different result; in all reality the result is that Omega won’t even consider giving you \$10000, it only asks for your \$100.

• The only mechanism I know of by which Omega can accurately predict me without introducing paradoxes is by running something like a simulation, as others have suggested. But I really, truly, only care about the universe I happen to know about, and for the life of me, I can’t figure out why I should care about any other. So even if the universe I perceive really is just simulated so that Omega can figure out what I would do in this situation, I don’t understand why I should care about “my” utility in some other universe. So: two-box, keep my \$100.

Edit: I should add that my not caring about other universes is conditional on my having no reason to believe they exist.

• Ah. But under mild assumptions about how Omega’s simulation works, I can expect that with some probability p bounded away from zero, I am in a simulation. So with probability at least p, there is another universe I care about, and I can increase utility there.

So, I guess I do pay \$100, but only because my utility function values the utility of others. I remain unconvinced that paying is winning for someone with a different utility function.

• 14 Sep 2011 13:36 UTC

The Omega is also known to be absolutely honest and trustworthy, no word-twisting, so the facts are really as it says, it really tossed a coin and really would’ve given you \$10000.

How do I know that? I would assign a lower prior probability to that than to me waking up tomorrow with a blue tentacle instead of my right arm; so, in such a situation, I would just believe Omega is bullshitting me.

• See Least convenient possible world. These technical difficulties are irrelevant to the problem itself.

• It does seem like a legitimate issue, though, that a decision theory that deals with the least convenient possible world manifestation of the Counterfactual Mugging scenario is not necessarily well adapted in general.

• When to believe what claims is a completely separate issue. We are looking at a thought experiment to get a better idea about what kinds of considerations should be taken into account in general, not to build a particular agent that does well in this situation (and possibly worse in others).

• Is the scenario really isomorphic to any sort of real-life dilemma, though? An agent which commits to paying out the \$100 could end up being screwed over by an anti-Omega, which would pay out \$10,000 only to a person who wouldn’t give Omega the \$100. I’m not clear on what sort of general principles the thought experiment is supposed to illustrate.

• Start from assuming that the agent justifiably knows that the thought experiment is set up as it’s described.

• Do they know before being confronted by Omega, or only once confronted?

If they did not know in advance that it’s more likely for Omega to appear and conduct the counterfactual mugging than it is for anti-Omega to appear and reward those who wouldn’t cooperate on the counterfactual mugging, then I can’t see that there’s any point in time where the agent should expect greater utility by committing to cooperate on the counterfactual mugging. If they do know in advance, then it’s better to precommit.

• It’s an assumption of the thought experiment that the player justifiably learns about the situation after the coin is tossed, and that they are dealing with Omega and not “anti-Omega”, and somehow learn that to be the case.

• In that case, it doesn’t seem like there’s any point in time where a decision to cooperate should have a positive expected utility.

• Correctness of decisions doesn’t depend on current time or current knowledge.

• I have one minor question about this problem: would I be allowed to, say, offer Omega \$50 instead of the \$100 he asked for, in exchange for \$5000 and the promise that, had the coin landed heads, he would give me \$5000 and ask me for \$50? (I’m going to refer to all sentients as he, so that I don’t have to waste time figuring out whether the person I’m talking about is he, she, or it.) He would know to do this, since Omega would simulate the me from the tails branch, and thus the simulated me would offer him this proposition. This should not be too difficult to accept, given that the cost to Omega is basically zero across all possibilities, unless part of the point of the exercise is to mess with me.

In the event that he rejects this offer, I’m going to give him \$100, and then mug him. (Assuming, of course, that the probability of my succeeding in the mugging is not 0. If he were to, say, kill me in response to the mugging, I’d simply have the copy that succeeded in mugging him force him to use some of his powers to resurrect my less fortunate copies. That assumes resurrection is part of his omnipotence; otherwise I wouldn’t mug him, there being no point in betting my life on something that does not generate more of my life (given that, if I succeeded, I could have him make all copies of me immortal). Of course, if he had this power, there might be a chance that his retaliation would be to simply wipe out all copies of me in all existences; in that case, the probability of success should be computed as a negative value, given that I CAN fail more times than I try.) In the cases where I succeed in the mugging, I’d get at least my \$100 back, and in the cases where I fail, I doubt I’d care about \$100.

In the case that neither of the above is possible, I would not give him the \$100, given that the diminishing returns on increasing amounts of money might well make the \$10000 worth less utility than 2x instances of \$100. (The 2x instances of \$100 scale linearly, whereas each additional \$100 within the \$10000 diminishes in value. That is, each instance of \$100 would be worth just as much as a prior instance, since it’s being distributed among different copies of me, so diminishing returns do not kick in; the \$10000, by contrast, all goes to one instance. It should be obvious that I prefer \$50 to a 50% chance of \$100.)

Of course, due to the above, there’s a fourth possibility: one where the iteration of me being offered the choice is itself very much affected by the diminishing returns on the value of money. In that case, I would give the \$100 to Omega, since this action would partially smooth out the differing amounts of wealth among multiple copies of me across worlds. Or rather, it would diminish the number of mes who are “poorer”, since the copies that are in need of money do not give up \$100 but will receive some regardless; unless that doesn’t work out because Omega simulates the exact version of me, including current financial assets, which rather nullifies his capabilities as an interdimensional arbitrageur among copies of me. But at that point, the diminishing returns on money should be such that each additional \$100 is roughly equal in value, since diminishing returns ALSO suffer from diminishing returns, with larger amounts of diminishment diminishing less.

In short, the options are: I offer him \$50 for a constant \$5000 across all outcomes of the coin flip, and he accepts; I give him \$100 and then mug him; I do not give him \$100, if diminishing returns are not yet themselves much diminished; or I give him \$100, if they are.
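The diminishing-returns comparison in that comment can be sketched numerically. This is only an illustration, assuming (hypothetically) logarithmic utility of wealth and a baseline wealth of \$1000 per copy; neither assumption comes from the thread:

```python
import math

# Sketch of the diminishing-returns argument, assuming log utility of
# wealth and a hypothetical baseline wealth of $1,000 per copy.
def utility(wealth: float) -> float:
    return math.log(wealth)

BASE = 1_000

# One copy receiving a single $10,000 prize:
gain_concentrated = utility(BASE + 10_000) - utility(BASE)

# 100 copies each receiving $100 (each evaluated at the same baseline):
gain_distributed = 100 * (utility(BASE + 100) - utility(BASE))

# With concave utility, spreading the money across copies yields more
# total utility than concentrating it in one copy.
print(gain_distributed > gain_concentrated)  # True
```

Any concave utility function gives the same qualitative result, which is the comparison the comment is gesturing at.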

• We have to presume you can’t just mug Omega. (He is omniscient, may as well make him omnipotent too.) Otherwise the problem is totally different.

• He is omniscient, may as well make him omnipotent too.

Given what you can do with omniscience, that’s not much of a stretch!

• I am unable to see how this boils down to anything but a moral problem (and therefore one with no objective solution).

Compare this to a simple lost bet. Omega tells you about the deal, you agree, and then he flips a coin, which comes out tails. Why exactly would you pay the \$100 in this example?

Because someone will punish/ostracise me if I renege (or other external consequences)? Then in the CM case all that matters is what the consequences are for your payment/refusal.

Because I have an absolute/irrational/moral desire to hold to my word? Then the only question is whether your definition of “my word” (or, more generally, your self-imposed moral obligations) includes counterfactual promises. But this is only a matter of choosing the boundaries of your arbitrary moral guidelines. It is hardly more solvable or more interesting than asking if you would consider yourself morally beholden to a promise you made when you were four years old.

• Does this particular thought experiment really have any practical application?

I can think of plenty of similar scenarios that are genuinely useful and worth considering, but all of them can be expressed with much simpler and more intuitive scenarios; e.g. when the offer will or might be repeated, or when you get to choose in advance whether to flip the coin and win 10000/lose 100. But with the scenario as stated, what real phenomenon is there that would reward you for being willing to counterfactually take an otherwise-detrimental action for no reason other than qualifying for the counterfactual reward? Even if we decide the best course of action in this contrived scenario, therefore what?

• Precommitments are used in decision-theoretic problems. Some people have proposed that a good decision theory should take the action that it would have precommitted to, if it had known in advance to do such a thing. This is an attempt to examine the consequences of that.

• This is an attempt to examine the consequences of that.

Yes, but if the artificial scenario doesn’t reflect anything in the real world, then even if we get the right answer, therefore what? It’s like being vaccinated against a fictitious disease; even if you successfully develop the antibodies, what good do they do?

It seems to me that the “beggars and gods” variant mentioned earlier in the comments, where the opportunity repeats itself each day, is actually a more useful study. Sure, it’s much more intuitive; it doesn’t tie our brains up in knots, trying to work out a way to intend to do something at a point when all our motivation to do so has evaporated. But reality doesn’t have to be complicated. Sometimes you just have to learn to throw in the pebble.

• Decision theory is an attempt to formalize the human decision process. The point isn’t that we really are unsure whether you should leave people to die of thirst, but how we can encode that in an actual decision theory. Like so many discussions on Less Wrong, this implicitly comes back to AI design: an AI needs a decision theory, and that decision theory needs to not have major failure modes, or at least the failure modes should be well understood.

If your AI somehow assigns a nonzero probability to “I will face a massive penalty unless I do this really weird action”, that ideally shouldn’t derail its entire decision process.

The beggars-and-gods formulation is the same problem. “Omega” is just a handy abstraction for “don’t focus on how you got into this decision-theoretic situation”. Admittedly, this abstraction sometimes obscures the issue.

• The beggars-and-gods formulation is the same problem.

I don’t think so; I think the element of repetition substantially alters it, but in a good way, one that makes it more useful in designing a real-world agent. Because in reality, we want to design decision theories that will solve problems multiple times.

At the point of meeting a beggar, although my prospects of obtaining a gold coin this time around are gone, nonetheless my overall commitment is not meaningless. I can still think, “I want to be the kind of person who gives pennies to beggars, because overall I will come out ahead”, and this thought remains applicable. I know that I can average out my losses with greater wins, and so I still want to stick to the algorithm.

In the single-shot scenario, however, my commitment becomes worthless once the coin comes down tails. There will never be any more 10K; there is no motivation any more to give 100. Following my precommitment, unless it is externally enforced, no longer makes any sense.

So the scenarios are significantly different.

• There will never be any more 10K; there is no motivation any more to give 100. Following my precommitment, unless it is externally enforced, no longer makes any sense.

This is the point of the thought experiment.

Omega is a predictor. His actions aren’t just based on what you decide, but on what he predicts that you will decide.

If your decision theory says “nah, I’m not paying you” when you aren’t given advance warning or repeated trials, then that is a fact about your decision theory even before Omega flips his coin. He flips his coin, gets heads, examines your decision theory, and gives you no money.

But if your decision theory pays up, then if he flips tails, you pay \$100 for no possible benefit.

Neither of these seems entirely satisfactory. Is this a reasonable feature for a decision theory to have? Or is it pathological? If it’s pathological, how do we fix it without creating other pathologies?
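The two unsatisfactory options can be laid side by side in a toy model. This is only a sketch, not a formal decision theory; the policy/prediction structure below is an illustration of the comment, not taken from any particular formalism:

```python
# A "policy" fixes, ahead of time, whether the agent pays given tails.
# Omega's heads-branch reward depends on his prediction of that policy.
def expected_value(pays_on_tails: bool) -> float:
    heads_payoff = 10_000 if pays_on_tails else 0   # Omega's prediction
    tails_payoff = -100 if pays_on_tails else 0     # actual payment
    return 0.5 * heads_payoff + 0.5 * tails_payoff

print(expected_value(True))   # 4950.0: the paying policy wins on average...
print(expected_value(False))  # 0.0
# ...yet conditional on tails, the payer is out $100 and the refuser is not.
```

Both prints are correct at once, which is exactly the tension: the policy comparison favors paying, the post-tails comparison favors refusing.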

• if your decision theory pays up, then if he flips tails, you pay \$100 for no possible benefit.

But in the single-shot scenario, after it comes down tails, what motivation does an ideal game theorist have to stick to the decision theory?

Like Parfit’s hitchhiker, although in advance you might agree that it’s a worthwhile deal, when it comes to the point of actually paying up, your motivation is gone, unless you have bound yourself in some other way.

• But in the single-shot scenario, after it comes down tails, what motivation does an ideal game theorist have to stick to the decision theory?

That’s what the problem is asking!

This is a decision-theoretical problem. Nobody cares about it for immediate practical purposes. “Stick to your decision theory, except when you non-rigorously decide not to” isn’t a resolution to the problem, any more than “ignore the calculations, since they’re wrong” was a resolution to the ultraviolet catastrophe.

Again, the point of this experiment is that we want a rigorous, formal explanation of exactly how, when, and why you should or should not stick to your precommitment. The original motivation is almost certainly in the context of AI design, where you don’t HAVE a human homunculus implementing a decision theory; the agent just is its decision theory.

• we want a rigorous, formal explanation of exactly how, when, and why you should or should not stick to your precommitment

Well, if we’re designing an AI now, then we have the capability to make a binding precommitment, simply by writing code. And we are still in a position where we can hope for the coin to come down heads. So yes, in that privileged position, we should bind the AI to pay up.

However, to the question as stated, “is the decision to give up \$100 when you have no real benefit from it, only counterfactual benefit, an example of winning?” I would still answer, “No, you don’t achieve your goals/utility by paying up.” We’re specifically told that the coin has already been flipped. Losing \$100 has negative utility, and positive utility isn’t on the table.

Alternatively, since it’s asking specifically about the decision, I would answer: if you haven’t made the decision until after the coin comes down tails, then paying is the wrong decision. Only if you’re deciding in advance (when you still hope for heads) can a decision to pay have the best expected value.

Even if deciding in advance, though, it’s still not a guaranteed win, but rather a gamble. So I don’t see any inconsistency in saying, on the one hand, “You should make a binding precommitment to pay”, and on the other hand, “If the coin has already come down tails without a precommitment, you shouldn’t pay.”

If there were a lottery where the expected value of a ticket was actually positive, and someone came to you offering to sell you their ticket (at cost price), then it would make sense in advance to buy it; but if you didn’t, and then the winners were announced and that ticket didn’t win, then buying it no longer makes sense.
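The lottery analogy can be made concrete with made-up numbers; the ticket price, prize, and win probability below are hypothetical, chosen only so that the ex-ante expected value is positive:

```python
# A hypothetical positive-expected-value lottery ticket.
TICKET_PRICE = 100
PRIZE = 1_000_000
WIN_PROB = 0.0002  # hypothetical win probability

ev_before_draw = WIN_PROB * PRIZE - TICKET_PRICE
print(round(ev_before_draw, 2))  # 100.0: buying is rational before the draw

# After the winners are announced and this ticket has lost, its value is
# simply the price paid; both judgments are correct at their own times.
value_after_losing = -TICKET_PRICE
print(ev_before_draw > 0 and value_after_losing < 0)  # True
```

The point of the analogy is that "buy before the draw" and "don't buy a known loser" are consistent, which is the same structure the commenter claims for the precommitment.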

• You’re fundamentally failing to address the problem.

For one, your examples just plain omit the “Omega is a predictor” part, which is key to the situation. Since Omega is a predictor, there is no distinction between making the decision ahead of time or not.

For another, unless you can prove that your proposed alternative doesn’t have pathologies just as bad as the Counterfactual Mugging, you’re at best back to square one.

It’s very easy to say “look, just don’t do the pathological thing”. It’s very hard to formalize that into an actual decision theory without creating new pathologies. I feel obnoxious to keep repeating this, but that is the entire problem in the first place.

• there is no distinction between making the decision ahead of time or not

Except that even if you make the decision, what would motivate you to stick to it once it can no longer pay off?

Your only motivation to pay is the hope of obtaining the \$10000. If that hope does not exist, what reason would you have to abide by the decision that you make now?

• Your decision is a result of your decision theory, and your decision theory is a fact about you, not just something that happens in that moment.

You can say: I’m not making the decision ahead of time, I’m waiting until after I see that Omega has flipped tails. In which case, when Omega predicts your behavior ahead of time, he predicts that you won’t decide until after the coin flip, resulting in hypothetically refusing to pay given tails; so, although the coin flip hasn’t happened yet and could still come up heads, your yet-unmade decision has the same effect as if you had loudly precommitted to it.

You’re trying to reason in temporal order, but that doesn’t work in the presence of predictors.

• I get that that could work for a computer, because a computer can be bound by an overall decision theory without attempting to think about whether that decision theory still makes sense in the current situation.

I don’t mind predictors in e.g. Newcomb’s problem. Effectively, there is a backward causal arrow, because whatever you choose causes the predictor to have already acted differently. Unusual, but reasonable.

However, in this case, yes, your choice affects the predictor’s earlier decision; but since the coin never came down heads, who cares any more how the predictor would have acted? Why care about being the kind of person who will pay the counterfactual mugger, if there will never again be any opportunity for it to pay off?

• Yes, that is the problem in question!

If you want the payoff, you have to be the kind of person who will pay the counterfactual mugger, even once you no longer can benefit from doing so. Is that a reasonable feature for a decision theory to have? It’s not clear that it is; it seems strange to pay out, even though the expected value of becoming that kind of person is clearly positive before you see the coin. That’s what the counterfactual mugging is about.

If you’re asking “why care” rhetorically, and you believe the answer is “you shouldn’t be that kind of person”, then your decision theory prefers lower expected values, which is also pathological. How do you resolve that tension? This is, once again, literally the entire problem.

• How do you resolve that tension?

Well, as previously stated, my view is that the scenario as stated (single-shot with no precommitment) is not the most helpful hypothetical for designing a decision theory. An iterated version would actually be more relevant, since we want to design an AI that can make more than one decision. And in the iterated version, the tension is largely resolved, because there is a clear motivation to stick with the decision: we still hope for the next coin to come down heads.

• Are you actually trying to understand? At some point you’ll predictably approach death, and predictably assign a vanishing probability to another offer or coin flip coming after a certain point. Your present self should know this. Omega knows it by assumption.

• At some point you’ll pre­dictably ap­proach death

I’m pretty sure that de­ci­sion the­o­ries are not de­signed on that ba­sis. We don’t want an AI to start mak­ing differ­ent de­ci­sions based on the prob­a­bil­ity of an up­com­ing de­com­mis­sion. We don’t want it to be­come nihilis­tic and stop mak­ing de­ci­sions be­cause it pre­dicted the heat death of the uni­verse and de­cided that all paths have zero value. If death is ac­tu­ally tied to the de­ci­sion in some way, then sure, take that into ac­count, but oth­er­wise, I don’t think a de­ci­sion the­ory should have “death is in­evitably com­ing for us all” as a fac­tor.

• I’m pretty sure that decision theories are not designed on that basis.

You are wrong. In fact, this is a totally standard thing to consider, and “avoid back-chaining defection in games of fixed length” is a known problem, with various known strategies.

• So say it’s repeated. Since our observable universe will end someday, there will come a time when the probability of future flips is too low to justify paying if the coin lands tails. Your argument suggests you won’t pay, and by assumption Omega knows you won’t pay. But then on the previous trial you have no incentive to pay, since you can’t fool Omega about your future behavior. This makes it seem like non-payment propagates backward, and you miss out on the whole sequence.
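The unraveling argument above is standard backward induction over a game of known length. A toy formalization (names and framing are illustrative, assuming an agent that only values future flips and an Omega that reads each round’s disposition perfectly):

```python
def will_pay(round_no, last_round):
    """Backward-induction sketch of the unraveling argument.

    A purely forward-looking agent pays at a round only if doing so
    improves its prospects in *later* rounds. At the last round there
    are no later rounds, so it refuses; knowing the later rounds are
    non-paying, it has nothing to gain at the round before, either,
    and the refusal recurs all the way back to round 1.
    """
    if round_no == last_round:
        return False                            # no future flips left
    return will_pay(round_no + 1, last_round)   # same incentive recurs
```

Under these assumptions `will_pay(1, n)` is `False` for every horizon `n`, which is exactly the “propagates backward” claim.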

• I wouldn’t trust myself to accurately predict the odds of another repetition, so I don’t think it would unravel for me. But this comes back to my earlier point that you really need some external motivation, some precommitment, because “I want the 10K” loses its power as soon as the coin comes down tails.

• Uniqueness raises all sorts of problems for decision theory, because expected utility implicitly assumes many trials. This may just be another example of that general phenomenon.

• Precommitting should be, as someone already said, signing a paper with a third party agreeing to give them \$1000 in case you fail to give the \$100 to Omega. Precommitment means you have no other option. You can’t say that you both precommitted to give the \$100 AND refused to do it when presented with the case.

Which means, if Omega presents you with the scenario before the coin toss, you precommit (by signing the contract with the third party). If Omega presents you with the scenario after the coin toss AND also tells you it has already come up tails—you haven’t precommitted, therefore you shouldn’t give it \$100.

EDIT: Also, some people objected to not giving the \$100, because they might be the emulation which Omega uses to predict whether you’d really give money. If you were an emulation, then you would remember precommitting in the expectation of getting \$10,000 with a 50% chance. It makes no sense for Omega to emulate you in a scenario where you don’t get a chance to precommit.

• That level of precommitting is only necessary if you are unable to trust yourself to carry through with a self-imposed precommitment. If you are capable of this, you can decide now to act irrationally in certain future decisions in order to benefit to a greater degree than someone who can’t. If the temptation to go back on your self-promise is too great in the failure case, then you would have lost in the win case—you are simply a fortunate loser who found out the flaw in his promise in the case where being flawed was beneficial. It doesn’t change the fact that being capable of this decision would be a better strategy on average. Making yourself conditionally less rational can actually be a rational decision, and so the ability to do so can be a strength worth acquiring.

Ultimately the problem is the same as that of an ultimatum (e.g., MAD). We want the other party to believe we will carry through even if it would be clearly irrational to do so at that point. As your opponent becomes better and better at predicting, you must become closer and closer to being someone who would make the irrational decision. When your opponent is sufficiently good (or you have insufficient knowledge as to how they are predicting), the only way to be sure is to be someone who would actually do it.

• Okay, I agree that this level of precommitting is not necessary. But if the deal is really a one-time offer, then, when presented with the case of the coin already having come up tails, you can no longer ever benefit from being the sort of person who would precommit. Since you will never again be presented with a Newcomb-like scenario, you will have no benefit from being the precommitting type. Therefore you shouldn’t give the \$100.

If, on the other hand, you still expect that you can encounter some other Omega-like thing which will present you with such a scenario, doesn’t this make the deal repeatable, which is not how the question was formulated?

• If, on the other hand, you still expect that you can encounter some other Omega-like thing which will present you with such a scenario, doesn’t this make the deal repeatable, which is not how the question was formulated?

In a repeatable deal your action influences the conditions in the next rounds. Even if you defect in this round, you may still cooperate in the next rounds; Omegas aren’t looking back at how you decided in the past, and don’t punish you by not offering the deals. Your success in the following rounds (from your current point of view) depends on whether you manage to precommit to the future encounters, not on what you do now.

• This is a self-deception technique. If you think it’s morally OK to deceive your future self for your current selfish ends, then by all means go ahead. Also, it looks like violent means of precommitment should actually be considered immoral, on par with forcing some other person to do your bidding by hiring a killer to kill them if they don’t comply.

In Newcomb’s problem, it actually is in your self-interest to one-box. Not so in this problem.

• This is a self-deception technique.

I am fairly sure that it isn’t, but demonstrating so would require another maths-laden article, which I anticipate would be received similarly to my last. I will however email you my entire reasoning if you so wish (you will have to wait several days while I brush up on the logical concept of common knowledge). (I don’t know how to encode a ) in a link, so please add one to the end.)

• Common knowledge (I used the %29 URL escape code for ”)”).

I’m going to write up my new position on this topic. Nonetheless I think it should be possible to discuss the question in a more concise form, since I think the problem is one of communication, not rigor. You deceive your future self—that’s the whole point of the comment above—making it believe that it wants to take an action that it actually doesn’t. The only disagreement I expect is the position that no, the future self actually wants to follow that action.

I think the problem with your article wasn’t that it was math-laden, but that you didn’t introduce things in sufficient detail to follow along, and to see the motivation behind the math.

• To be perfectly honest, your last sentence is also my feeling. I should at the least have talked more about the key equation. But the article was already long, I was unsure as to how it would be received, and I spent too little time revising it (this is a persistent problem for me). If I were to write it again now, it would have been closer in style to the thread between you and me there.

If you intend to write another post, then I am happy to wait until then to introduce the ideas I have in mind, and I will try hard to do so in a manner that won’t alienate everyone.

• If you think that through and decide that way, then your precommitting method didn’t work. The idea is that you must somehow now prevent your future self from behaving rationally in that situation—if they do, they will perform exactly the thought process you describe. The method of doing so, whether making a public promise (and valuing your spoken word more than \$100), hiring a hitman to kill you if you renege, or just having the capability of reliably convincing yourself to do so (effectively valuing keeping faith with your self-promise more than \$100), doesn’t matter so long as it is effective. If merely deciding now is effective, then that is all that’s needed.

If you do then decide to take the rational course in the losing coin-flip case, it just means you were wrong by definition about your commitment being effective. Luckily, in this one case, you found that out in the loss case rather than the win case. Had you won the coin flip, though, you would have found yourself with nothing.

• How do you verify that “Omega” really is Omega and not a drunk in a bar? I can’t think of a way of doing it—so it sounds like a fraud to me.

Why is Omega asking me, when it already knows my answer? And what happens to Omega/the universe when I say no?

If he asks me the question, I have already answered the question, so I don’t need to post this comment. I acted as I did. But I didn’t act as I did (Omega hasn’t shown up in my part of the universe), so we all know my answer.

• If some guy walked up to you and gave you this spiel, you’d be fully justified in telling him to get lost, or even seeking mental help for him.

The problem assumes Omega to be genuine and trustworthy.

• Wow, this reddit software is pretty neat for a blog.

I’d love to see a post on the best introductory books to logic, and also epistemology. Epistemology, especially, seems to lack good introductory texts.

• I know this is off-topic, but I feel duty-bound to respond (in the absence of profile pages or a really working direct-message functionality).

“Epistemology: The Big Questions” from Blackwell Publishing is awesome.

Introductory logic texts are easy to find, but Hurley’s “A Concise Introduction to Logic” comes recommended, depending on what sort of intro you were looking for.

• This doesn’t go here. I’m not sure where it goes—we don’t have open threads yet.

You might want to try Jaynes though.

If you want to respond to this, please make it a private message—this thread should be for discussing the post.

• This is actually a parable on the boundaries of self (think a bit Buddhist here). Let me pose this another way: late last night in the pub, my past self committed to the drunken bet of \$100 vs. \$200 on the flip of a coin (the other guy was even more drunk than I was). My past self lost, but didn’t have the money. This morning, my present self gets a phone call from the person it lost to. Does it honor the bet? Assuming, as is typical in these hypothetical problems, that we can ignore the consequences (else we’d have to assign a cost to them that might well offset the gains, so we’ll just assign 0 and not consider them), a utilitarian approach is that I should default on the bet if I can get away with it. Why should I be responsible for what I said yesterday?

However, as usual in utilitarian dilemmas, the effect that we get in real life is that we have a conscience—can I live with myself being the kind of person that doesn’t honor past commitments? So most people will, out of one consideration or another, not think twice about paying up the \$100.

Of Omega it is said that I can trust it more than I would myself. It knows more about me than I do myself. It would be part of myself if I didn’t consider it separate from myself. If I consider my ego and Omega part of the same all-encompassing self, then honoring the commitment that Omega committed itself to on my behalf should draw the same response as if I had done it myself. Only if I perceive Omega as a separate entity to whom I am not morally obligated can I justify not paying the \$100. Only with this individualist viewpoint will I see someone whom I am not obligated to in any way demanding \$100 of me.

If you manage to instill your AI with a sense of the “common good”, a sense of brotherhood of all intelligent creatures, then it will, given the premises of trust etc., cooperate in this brotherhood—in fact, that is what I believe would be one of the meanings of “friendly”.

• Your version of the story discards the most important ingredient: the fact that when you win the coin toss, you only receive money if you would have paid had you lost.

As for Omega, all we know about it is that somehow it can accurately predict your actions. For the purposes of Counterfactual Mugging we may as well regard Omega as a mindless robot which will burn the money you give to it and then self-destruct immediately after the game. (This makes it impossible to pay because you feel obligated to Omega. In fact, the idea is that you pay up because you feel obligated to your counterfactual self.)

• I don’t see how your points apply: I would have paid had I lost. Except if my hypothetical self is so much in debt that it can’t reasonably spend \$100 on an investment such as this—in which case Omega would have known in advance, and understands my nonpayment.

I do not consider the future existence of Omega as a factor at all, so it doesn’t matter whether it self-destructs or not. And it is also a given that Omega is absolutely trustworthy (more than I could say for myself).

My view is that this may well be one of the undecidable theorems that Gödel has shown must exist in any reasonably complex formal system. The only way to make it decidable is to think outside the box, and in this case that means I consider that someone else is somehow still “me” (at least under ethical aspects)—there are other threads on here that involve splitting myself and still remaining the same person somehow, so it’s not intrinsically irrational or anything. My reference to Buddhism was merely meant to show that the concept is mainstream enough to be part of a major world religion, though most other religions and the UN charter of human rights have it as well, though not as pronounced, as “brotherhood”—not a factual, but an ethical identity.

• After a good night’s sleep, here are some more thoughts:

the idea is that you pay up because you feel obligated to your counterfactual self.

To feel obligated to my counterfactual self, which exists only in the “mind” of Omega, yet not feel obligated to Omega, doesn’t make any sense to me.

Your additional assumptions about Omega destroy the utility that the \$100 had—in the original version, \$100 is \$100 to both me and Omega, but in your version it is nothing to Omega. Your amended version of the problem amounts to “would I throw \$100 into an incinerator on the basis of some thought experiment”, and that is clearly not even a zero-sum game if you consider the whole system—the original problem is zero-sum, and that gives me more freedom of choice.