Outlawing Anthropics: An Updateless Dilemma

Let us start with a (non-quantum) logical coinflip—say, look at the heretofore-unknown-to-us-personally 256th binary digit of pi, where the choice of binary digit is itself intended not to be random.

If the result of this logical coinflip is 1 (aka “heads”), we’ll create 18 of you in green rooms and 2 of you in red rooms, and if the result is “tails” (0), we’ll create 2 of you in green rooms and 18 of you in red rooms.

After going to sleep at the start of the experiment, you wake up in a green room.

With what degree of credence do you believe—what is your posterior probability—that the logical coin came up “heads”?

There are exactly two tenable answers that I can see, “50%” and “90%”.

Suppose you reply 90%.

And suppose you also happen to be “altruistic” enough to care about what happens to all the copies of yourself. (If your current system cares about yourself and your future, but doesn’t care about very similar xerox-siblings, then you will tend to self-modify to have future copies of yourself care about each other, as this maximizes your expectation of pleasant experience over future selves.)

Then I attempt to force a reflective inconsistency in your decision system, as follows:

I inform you that, after I look at the unknown binary digit of pi, I will ask all the copies of you in green rooms whether to pay $1 to every version of you in a green room and steal $3 from every version of you in a red room. If they all reply “Yes”, I will do so.

(It will be understood, of course, that $1 represents 1 utilon, with actual monetary amounts rescaled as necessary to make this happen. Very little rescaling should be necessary.)

(Timeless decision agents reply as if controlling all similar decision processes, including all copies of themselves. Classical causal decision agents, to reply “Yes” as a group, will need to somehow work out that other copies of themselves reply “Yes”, and then reply “Yes” themselves. We can try to help out the causal decision agents on their coordination problem by supplying rules such as “If conflicting answers are delivered, everyone loses $50”. If causal decision agents can win on the problem “If everyone says ‘Yes’ you all get $10, if everyone says ‘No’ you all lose $5, if there are conflicting answers you all lose $50” then they can presumably handle this. If not, then ultimately, I decline to be responsible for the stupidity of causal decision agents.)

Suppose that you wake up in a green room. You reason, “With 90% probability, there are 18 of me in green rooms and 2 of me in red rooms; with 10% probability, there are 2 of me in green rooms and 18 of me in red rooms. Since I’m altruistic enough to at least care about my xerox-siblings, I calculate the expected utility of replying ‘Yes’ as (90% * ((18 * +$1) + (2 * -$3))) + (10% * ((18 * -$3) + (2 * +$1))) = +$5.60.” You reply yes.

However, before the experiment, you calculate the general utility of the conditional strategy “Reply ‘Yes’ to the question if you wake up in a green room” as (50% * ((18 * +$1) + (2 * -$3))) + (50% * ((18 * -$3) + (2 * +$1))) = -$20. You want your future selves to reply ‘No’ under these conditions.
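Both expected-utility calculations can be checked mechanically. A minimal sketch in Python, with the payoffs and probability weightings exactly as in the two paragraphs above:

```python
# Group payoff if every copy in a green room answers "Yes":
# heads: 18 greens gain $1 each, 2 reds lose $3 each
# tails: 2 greens gain $1 each, 18 reds lose $3 each
payoff_heads = 18 * 1 + 2 * (-3)   # +12
payoff_tails = 2 * 1 + 18 * (-3)   # -52

# After waking in a green room and updating to 90% heads:
ev_updated = 0.9 * payoff_heads + 0.1 * payoff_tails   # +$5.60

# Before the experiment, with the logical coin still at 50/50:
ev_advance = 0.5 * payoff_heads + 0.5 * payoff_tails   # -$20
```

The same payoff table under the two weightings gives opposite signs, which is the whole dilemma.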

This is a dynamic inconsistency—different answers at different times—which argues that decision systems which update on anthropic evidence will self-modify not to update probabilities on anthropic evidence.

I originally thought, on first formulating this problem, that it had to do with double-counting the utilons gained by your variable numbers of green friends, and the probability of being one of your green friends.

However, the problem also works if we care about paperclips. No selfishness, no altruism, just paperclips.

Let the dilemma be, “I will ask all people who wake up in green rooms if they are willing to take the bet ‘Create 1 paperclip if the logical coinflip came up heads, destroy 3 paperclips if the logical coinflip came up tails’. (Should they disagree on their answers, I will destroy 5 paperclips.)” Then a paperclip maximizer, before the experiment, wants the paperclip maximizers who wake up in green rooms to refuse the bet. But a conscious paperclip maximizer who updates on anthropic evidence, who wakes up in a green room, will want to take the bet, with expected utility ((90% * +1 paperclip) + (10% * −3 paperclips)) = +0.6 paperclips.
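The paperclip version runs on the same arithmetic. The −1 paperclip figure for the advance evaluation is implied by the post rather than stated in it, so treat that line as my arithmetic:

```python
# Bet taken by the green-room paperclip maximizers:
# heads: +1 paperclip, tails: -3 paperclips
ev_after_update = 0.9 * 1 + 0.1 * (-3)   # +0.6 paperclips, so take the bet

# Evaluated before the experiment at 50/50, the conditional strategy
# "greens take the bet" loses paperclips in expectation:
ev_in_advance = 0.5 * 1 + 0.5 * (-3)     # -1 paperclip, so refuse
```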

This argues that, in general, decision systems—whether they start out selfish, or start out caring about paperclips—will not want their future versions to update on anthropic “evidence”.

Well, that’s not too disturbing, is it? I mean, the whole anthropic thing seemed very confused to begin with—full of notions about “consciousness” and “reality” and “identity” and “reference classes” and other poorly defined terms. Just throw out anthropic reasoning, and you won’t have to bother.

When I explained this problem to Marcello, he said, “Well, we don’t want to build conscious AIs, so of course we don’t want them to use anthropic reasoning”, which is a fascinating sort of reply. And I responded, “But when you have a problem this confusing, and you find yourself wanting to build an AI that just doesn’t use anthropic reasoning to begin with, maybe that implies that the correct resolution involves us not using anthropic reasoning either.”

So we can just throw out anthropic reasoning, and relax, and conclude that we are Boltzmann brains. QED.

In general, I find the sort of argument given here—that a certain type of decision system is not reflectively consistent—to be pretty damned compelling. But I also find the Boltzmann conclusion to be, ahem, more than ordinarily unpalatable.

In personal conversation, Nick Bostrom suggested that a division-of-responsibility principle might cancel out the anthropic update—i.e., the paperclip maximizer would have to reason, “If the logical coin came up heads then I am 1/18th responsible for adding +1 paperclip, if the logical coin came up tails then I am 1/2 responsible for destroying 3 paperclips.” I confess that my initial reaction to this suggestion was “Ewwww”, but I’m not exactly comfortable concluding I’m a Boltzmann brain, either.
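Bostrom’s suggestion can be checked against the numbers in the post: weighting each green-room agent’s payoff by its share of responsibility does flip the recommendation back to refusal. The sign check below is my arithmetic, not Bostrom’s:

```python
from fractions import Fraction

# Anthropic posterior after waking in a green room
p_heads = Fraction(9, 10)

# Responsibility-weighted payoff per green-room agent:
# heads: 1/18th responsible for creating 1 paperclip
# tails: 1/2 responsible for destroying 3 paperclips
ev_per_agent = (p_heads * Fraction(1, 18) * 1
                + (1 - p_heads) * Fraction(1, 2) * (-3))

# ev_per_agent works out to -1/10: negative, so each agent refuses
# the bet, matching the pre-experiment 50/50 evaluation
```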

EDIT: On further reflection, I also wouldn’t want to build an AI that concluded it was a Boltzmann brain! Is there a form of inference which rejects this conclusion without relying on any reasoning about subjectivity?

EDIT2: Psy-Kosh has converted this into a non-anthropic problem!

• Ac­tu­ally… how is this an an­thropic situ­a­tion AT ALL?

I mean, wouldn’t it be equivalent to, say, gathering 20 rational people (who understand PD, etc etc etc, and can certainly manage to agree to coordinate with each other) who are allowed to meet with each other in advance and discuss the situation...

I show up and tell them that I have two buckets of marbles, some of which are green and some of which are red.

One bucket has 18 green and 2 red, and the other bucket has 18 red and 2 green.

I will (already have) flipped a log­i­cal coin. Depend­ing on the out­come, I will use ei­ther one bucket or the other.

After having an opportunity to discuss strategy, they will be allowed to reach into the bucket without looking, pull out a marble, look at it, and then, if it’s green, choose whether to pay and steal, etc etc etc. (in case it’s not obvious, the payout rules being equivalent to the OP)

As near as I can determine, this situation is entirely equivalent to the OP and is in no way an anthropic one. If the OP actually is an argument against anthropic updates in the presence of logical uncertainty… then it’s actually an argument against the general case of Bayesian updating in the presence of logical uncertainty, even when there’s no anthropic stuff going on at all!

EDIT: oh, in case it’s not obvious, marbles are not replaced after being drawn from the bucket.
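The claimed equivalence can be made concrete with exact bookkeeping rather than simulation. A sketch assuming the payout rules from the OP ($1 to greens, $3 from reds, greens deciding unanimously):

```python
# Exact bookkeeping for the marble version: all 20 marbles are drawn,
# so in the heads world exactly 18 people hold green, in tails exactly 2.
buckets = {"heads": {"green": 18, "red": 2},
           "tails": {"green": 2, "red": 18}}
p_coin = 0.5  # the logical coin's prior

# Among all green draws across the two equally likely worlds, the
# fraction occurring in the heads world is an individual drawer's posterior:
green_heads = p_coin * buckets["heads"]["green"]
green_total = green_heads + p_coin * buckets["tails"]["green"]
posterior_heads = green_heads / green_total   # 0.9

# Yet the group strategy "greens say Yes" is scored at the coin's 50/50:
payoff = {w: buckets[w]["green"] * 1 + buckets[w]["red"] * (-3) for w in buckets}
ev_strategy = sum(p_coin * payoff[w] for w in buckets)   # -20.0
```

Exactly the 0.9 posterior and the −$20 strategy value from the original rooms version, with no copying of anyone involved.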

• That un­cer­tainty is log­i­cal seems to be ir­rele­vant here.

• Agreed. But I seem to re­call see­ing some com­ments about dis­t­in­guish­ing be­tween quan­tum and log­i­cal un­cer­tainty, etc etc, so figured may as well say that it at least is equiv­a­lent given that it’s the same type of un­cer­tainty as in the origi­nal prob­lem and so on...

• Right, and this is a per­spec­tive very close to in­tu­ition for UDT: you con­sider differ­ent in­stances of your­self at differ­ent times as sep­a­rate de­ci­sion-mak­ers that all share the com­mon agenda (“global strat­egy”), co­or­di­nated “off-stage”, and im­ple­ment it with­out change de­pend­ing on cir­cum­stances they en­counter in each par­tic­u­lar situ­a­tion. The “off-sta­ge­ness” of co­or­di­na­tion is more nat­u­rally de­scribed by TDT, which al­lows con­sid­er­ing differ­ent agents as UDT-in­stances of the same strat­egy, but the pre­cise way in which it hap­pens re­mains magic.

• Nesov, the rea­son why I re­gard Dai’s for­mu­la­tion of UDT as such a sig­nifi­cant im­prove­ment over your own is that it does not re­quire offstage co­or­di­na­tion. Offstage co­or­di­na­tion re­quires a base the­ory and a priv­ileged van­tage point and, as you say, magic.

• Nesov, the rea­son why I re­gard Dai’s for­mu­la­tion of UDT as such a sig­nifi­cant im­prove­ment over your own is that it does not re­quire offstage co­or­di­na­tion. Offstage co­or­di­na­tion re­quires a base the­ory and a priv­ileged van­tage point and, as you say, magic.

I still don’t un­der­stand this em­pha­sis. Here I sketched in what sense I mean the global solu­tion—it’s more about defi­ni­tion of prefer­ence than the ac­tual com­pu­ta­tions and ac­tions that the agents make (lo­cally). There is an ab­stract con­cept of global strat­egy that can be char­ac­ter­ized as be­ing “offstage”, but there is no offstage com­pu­ta­tion or offstage co­or­di­na­tion, and in gen­eral com­plete com­pu­ta­tion of global strat­egy isn’t performed even lo­cally—only ap­prox­i­ma­tions, of­ten ap­prox­i­ma­tions that make it im­pos­si­ble to im­ple­ment the globally best solu­tion.

In the above com­ment, by “magic” I referred to ex­act mechanism that says in what way and to what ex­tent differ­ent agents are run­ning the same al­gorithm, which is more in the do­main of TDT, UDT gen­er­ally not talk­ing about sep­a­rate agents, only differ­ent pos­si­ble states of the same agent. Which is why nei­ther con­cept solves the bar­gain­ing prob­lem: it’s out of UDT’s do­main, and TDT takes the rele­vant pieces of the puz­zle as given, in its causal graphs.

For fur­ther dis­am­bigua­tion, see for ex­am­ple this com­ment you made:

We’re tak­ing apart your “math­e­mat­i­cal in­tu­ition” into some­thing that in­vents a causal graph (this part is still magic) and a part that up­dates a causal graph “given that your out­put is Y” (Pearl says how to do this).

• Again, if we ran­domly se­lected some­one to ask, rather than hav­ing speci­fied in ad­vance that we’re go­ing to make the de­ci­sion de­pend on the unan­i­mous re­sponse of all peo­ple in green rooms, then there would be no para­dox. What you’re talk­ing about here, pul­ling out a ran­dom mar­ble, is the equiv­a­lent of ask­ing a ran­dom sin­gle per­son from ei­ther green or red rooms. But this is not what we’re do­ing!

• Either I’m mi­s­un­der­stand­ing some­thing, or I wasn’t clear.

To make it explicit: EVERYONE who gets a green marble gets asked, and the outcome depends on their consent being unanimous, just like everyone who wakes up in a green room gets asked. ie, all twenty rationalists draw a marble from the bucket, so that by the end, the bucket is empty.

Everyone who got a green marble gets asked for their decision, and the final outcome depends on all the answers. The bit about them drawing marbles individually is just to keep them from seeing what marbles the others got or being able to talk to each other once the marble drawing starts.

Unless I completely failed to comprehend some aspect of what’s going on here, this is effectively equivalent to the problem you described.

• Oh, okay, that wasn’t clear actually. (Because I’m used to “they” being a genderless singular pronoun.) In that case these problems do indeed look equivalent.

Hm. Hm hm hm. I shall have to think about this. It is an extremely good point. The more so as anyone who draws a green marble should indeed be assigning a 90% probability to there being a mostly-green bucket.

• Sorry about the unclar­ity then. I prob­a­bly should have ex­plic­itly stated a step by step “mar­ble game pro­ce­dure”.

My personal suggestion if you want an “anthropic reasoning is confoozing” situation would be the whole anthropic updating vs Aumann agreement thing, since the disagreement would seem to be predictable in advance, and everyone involved would appear to be able to be expected to agree that the disagreement is right and proper. (ie, a mad scientist sets up a quantum suicide experiment. The test subject survives. The test subject seems to have Bayesian evidence in favor of MWI vs single world, while the external observer, the mad scientist who sees the test subject/victim survive, would seem to not have any particular new evidence favoring MWI over a single world.)

(Yes, I know I’ve brought up that sub­ject sev­eral times, but it does seem, to me, to be a rather more blatant “some­thing funny is go­ing on here”)

(EDIT: okay, I guess this would count as quan­tum mur­der rather than quan­tum suicide, but you know what I mean.)

• I don’t see how be­ing as­signed a green or red room is “an­thropic” while be­ing as­signed a green or red mar­ble is not an­thropic.

I thought the an­thropic part came from up­dat­ing on your own in­di­vi­d­ual ex­pe­rience in the ab­sence of ob­serv­ing what ob­ser­va­tions oth­ers are mak­ing.

• The differ­ence wasn’t mar­ble vs room but “copies of one be­ing, so num­ber of be­ings changed” vs “just gather 20 ra­tio­nal­ists...”

But my whole point was “the origi­nal wasn’t re­ally an an­thropic situ­a­tion, let me con­struct this al­ter­nate yet equiv­a­lent ver­sion to make that clear”

• Do you think that the Sleep­ing Beauty prob­lem is an an­thropic one?

• It prob­a­bly counts as an in­stance of the gen­eral class of prob­lems one would think of as an “an­thropic prob­lem”.

• I see. I had always thought of the prob­lem as in­volv­ing 20 (or some­times 40) differ­ent peo­ple. The rea­son for this is that I am an in­tu­itive rather than literal reader, and when Eliezer men­tioned stuff about copies of me, I just in­ter­preted this as mean­ing to em­pha­size that each per­son has their own in­de­pen­dent ‘sub­jec­tive re­al­ity’. Really only mean­ing that each per­son doesn’t share ob­ser­va­tions with the oth­ers.

So all along, I thought this prob­lem was about challeng­ing the sound­ness of up­dat­ing on a sin­gle in­de­pen­dent ob­ser­va­tion in­volv­ing your­self as though you are some kind of spe­cial refer­ence frame.

… there­fore, I don’t think you took this el­e­ment out, but I’m glad you are re­solv­ing the mean­ing of “an­thropic” be­cause there are prob­a­bly quite a few differ­ent “sub­jec­tive re­al­ities” cir­cu­lat­ing about what the essence of this prob­lem is.

• Sorry for de­lay.

Copies as in “up­load your mind. then run 20 copies of the up­loaded mind”.

And yes, I know there’s still tricky bits left in the prob­lem, I merely es­tab­lished that those tricky bits didn’t de­rive from effects like mind copy­ing or quan­tum suicide or any­thing like that and could in­stead show up in or­di­nary sim­ple stuff, with no need to ap­peal to an­thropic prin­ci­ples to pro­duce the con­fu­sion. (sorry if that came out bab­bly, am get­ting tired)

• That’s funny: when Eliezer said “imag­ine there are two of you”, etc., I had as­sumed he meant two of us ra­tio­nal­ists, etc.

• any­one who draws a green mar­ble should in­deed be as­sign­ing a 90% prob­a­bil­ity to there be­ing a mostly-green bucket.

I don’t think so. I think the an­swer to both these prob­lems is that if you up­date cor­rectly, you get 0.5.

• *blinks* mind ex­pand­ing on that?

P(green|mostly green bucket) = 18/20

P(green|mostly red bucket) = 2/20

likelihood ratio = 9

if one started with no particular expectation of it being one bucket vs the other, ie, assigned 1:1 odds, then after updating upon seeing a green marble, one ought assign 9:1 odds, ie, probability 9/10, right?
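The update above, written out in odds form with exact fractions:

```python
from fractions import Fraction

p_green_given_mostly_green = Fraction(18, 20)
p_green_given_mostly_red = Fraction(2, 20)

likelihood_ratio = p_green_given_mostly_green / p_green_given_mostly_red  # 9

# Starting from 1:1 odds on the two buckets:
posterior_odds = Fraction(1, 1) * likelihood_ratio   # 9:1 for mostly-green
posterior = posterior_odds / (posterior_odds + 1)    # 9/10
```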

• I guess that does need a lot of ex­plain­ing.

I would say:

P(green|mostly green bucket) = 1

P(green|mostly red bucket) = 1

P(green) = 1

be­cause P(green) is not the prob­a­bil­ity that you will get a green mar­ble, it’s the prob­a­bil­ity that some­one will get a green mar­ble. From the per­spec­tive of the pri­ors, all the mar­bles are drawn, and no one draw is differ­ent from any other. If you don’t draw a green mar­ble, you’re dis­carded and the peo­ple who did get a green vote. For the pur­poses of figur­ing out the pri­ors for a group strat­egy, your draw be­ing green is not an event.

Of course, you know that you’ve drawn green. But the only thing you can trans­late it into that has a prior is “some­one got green.”

That prob­a­bly sounds con­trived. Maybe it is. But con­sider a slightly differ­ent ex­am­ple:

• Two mar­bles and two peo­ple in­stead of twenty.

• One mar­ble is green, the other will be red or green based on a coin flip (green on heads, red on tails).

I like this ex­am­ple be­cause it com­bines the two con­flict­ing in­tu­itions in the same prob­lem. Only a fool would draw a red mar­ble and re­main un­cer­tain about the coin flip. But some­one who draws a green mar­ble is in a situ­a­tion similar to the twenty mar­ble sce­nario.

If you were to plan ahead of time how the greens should vote, you would tell them to assume 50%. But a person holding a green marble might think it’s 2/3 in favor of double green.

To avoid em­bar­rass­ing para­doxes, you can base ev­ery­thing on the four events “heads,” “tails,” “some­one gets green,” and “some­one gets red.” Up­date as nor­mal.
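In the two-marble example, the individual holder’s update works out as follows (assuming each of the two people is equally likely to receive either marble):

```python
from fractions import Fraction

p_heads = Fraction(1, 2)
# Heads: both marbles are green. Tails: one green, one red,
# so a given person's marble is green with probability 1/2.
p_green_given_heads = Fraction(1, 1)
p_green_given_tails = Fraction(1, 2)

posterior_heads = (p_heads * p_green_given_heads) / (
    p_heads * p_green_given_heads + (1 - p_heads) * p_green_given_tails
)
# 2/3 for whoever holds a green marble; the group plan is still made at 1/2
```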

• yes, the prob­a­bil­ity that some­one will get a green mar­ble is rather differ­ent than the prob­a­bil­ity that I, per­son­ally, will get a green mar­ble. But if I do per­son­ally get a green mar­ble, that’s ev­i­dence in fa­vor of green bucket.

The de­ci­sion al­gorithm for how to re­spond to that though in this case is skewed due to the rules for the pay­out.

And in your example, if I drew green, I’d consider the 2/3 probability the correct one for whoever drew green.

Now, if there’s a pay­out scheme in­volved with funny busi­ness, that may al­ter some de­ci­sions, but not mag­i­cally change my episte­mol­ogy.

• What kind of funny busi­ness?

• Let’s just say that you don’t draw blue.

• OK, but I think Psy-Kosh was talk­ing about some­thing to do with the pay­offs. I’m just not sure if he means the vot­ing or the dol­lar amounts or what.

• Sorry for de­lay. And yeah, I meant stuff like “only greens get to de­cide, and the de­ci­sion needs to be unan­i­mous” and so on

I agree that changes the answer. I was assuming a scheme like that in my two marble example. In a more typical situation, I would also say 2/3.

To me, it’s not a dras­tic (or mag­i­cal) change, just get­ting a differ­ent an­swer to a differ­ent ques­tion.

• Um… okay… I’m not sure what we’re dis­agree­ing about here, if any­thing:

my position is “given that I found myself with a green marble, it is right and proper for me to assign a 2/3 probability to both being green. However, the correct choice to make, given the peculiarities of this specific problem, may require one to make a decision that seems, on the surface, as if one didn’t update like that at all.”

• Well, we might be say­ing the same thing but com­ing from differ­ent points of view about what it means. I’m not ac­tu­ally a bayesian, so when I talk about as­sign­ing prob­a­bil­ities and up­dat­ing them, I just mean do­ing equa­tions.

What I’m say­ing here is that you should set up the equa­tions in a way that re­flects the group’s point of view be­cause you’re tel­ling the group what to do. That in­volves plug­ging some prob­a­bil­ities of one into Bayes’ Law and get­ting a fi­nal an­swer equal to one of the start­ing num­bers.
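The “probabilities of one” described here can be plugged into Bayes’ Law directly; conditioning on an event that is certain in both worlds returns the prior unchanged:

```python
# "Someone gets a green marble" is certain whichever bucket is used,
# so conditioning on it cannot move the 50/50 prior.
p_heads = 0.5
p_someone_green_given_heads = 1.0
p_someone_green_given_tails = 1.0

p_someone_green = (p_heads * p_someone_green_given_heads
                   + (1 - p_heads) * p_someone_green_given_tails)  # 1.0
posterior_heads = (p_heads * p_someone_green_given_heads
                   / p_someone_green)                              # 0.5
```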

• OK, but I think Psy-Kosh was talk­ing about some­thing to do with the pay­offs.

So was I. But for­tu­nately I was re­strained enough to tem­per my un­couth hu­mour with ob­scu­rity.

• Very en­light­en­ing!

It just shows that the OP was an over­com­pli­cated ex­am­ple gen­er­at­ing con­fu­sion about the up­date.

[EDIT] Deleted rest of the comment due to revised opinion here: http://lesswrong.com/lw/17c/outlawing_anthropics_an_updateless_dilemma/13hk

• Good point. After think­ing about this for a while, I feel com­fortable si­mul­ta­neously hold­ing these views:

1) You shouldn’t do an­thropic up­dates. (i.e. up­date on the fact that you ex­ist)

2) The ex­am­ple posed in the top-level post is not an ex­am­ple of an­thropic rea­son­ing, but rea­son­ing on spe­cific givens and ob­ser­va­tions, as are most sup­posed ex­am­ples of an­thropic rea­son­ing.

3) Any ev­i­dence aris­ing from the fact that you ex­ist is im­plic­itly con­tained by your ob­ser­va­tions by virtue of their ex­is­tence.

Wikipe­dia gives one ex­am­ple of a pro­duc­tive use of the an­thropic prin­ci­ple, but it ap­pears to be rea­son­ing based on ob­ser­va­tions of the type of life-form we are, as well as other hard-won bio­chem­i­cal knowl­edge, well above and be­yond the ob­ser­va­tion that we ex­ist.

• Thanks.

I don’t THINK I agree with your point 1. ie, I favor saying yes to anthropic updates, but I admit that there’s definitely confusing issues here.

Mind expanding on point 3? I think I get what you’re saying, but in general we filter out that part of our observations, that is, the fact that observations are occurring at all. Getting that back is the point of anthropic updating. Actually… IIRC, Nick Bostrom’s way of talking about anthropic updates more or less is exactly your point 3 in reverse… ie, near as I can determine and recall, his position explicitly advocates talking about the significance that observations are occurring at all as part of the usual update based on observation. Maybe I’m misremembering though.

Also, sep­a­rat­ing it out into a sin­gle an­thropic up­date and then treat­ing all ob­ser­va­tions as con­di­tional on your ex­is­tence or such helps avoid dou­ble count­ing that as­pect, right?

Also, here’s an­other physics ex­am­ple, a bit more re­cent that was dis­cussed on OB a while back.

• Read­ing the link, the sec­ond pa­per’s ab­stract, and most of Scott Aaron­son’s post, it looks to me like they’re not us­ing an­thropic rea­son­ing at all. Robin Han­son sum­ma­rizes their “en­tropic prin­ci­ple” (and the ab­stract and all dis­cus­sion agree with his sum­mary) as

since ob­servers need en­tropy gains to func­tion phys­i­cally, we can es­ti­mate the prob­a­bil­ity that any small space­time vol­ume con­tains an ob­server to be pro­por­tional to the en­tropy gain in that vol­ume.

The prob­lem is that “ob­server” is not the same as “an­throp-” (hu­man). This prin­ci­ple is just a sub­tle restate­ment of ei­ther a tau­tol­ogy or known phys­i­cal law. Be­cause it’s not that “ob­servers need en­tropy gains”. Rather, ob­ser­va­tion is en­tropy gain. To ob­serve some­thing is to in­crease one’s mu­tual in­for­ma­tion with it. But since phase space is con­served, all gains in mu­tual in­for­ma­tion must be offset by an in­crease in en­tropy.

But since “ob­servers” are sim­ply any­thing that forms mu­tual in­for­ma­tion with some­thing else, it doesn’t mean a con­scious ob­server, let alone a hu­man one. For that, you’d need to go be­yond P(en­tropy gain|ob­server) to P(con­scious­ness|en­tropy gain).

(I’m a bit dis­tressed no one else made this point.)

Now, this idea could lead to an in­sight if you en­dorsed some neo-an­i­mistic view that con­scious­ness is pro­por­tional to nor­mal­ized rate of mu­tual in­for­ma­tion in­crease, and so hu­mans are (as) con­scious (as we are) be­cause we’re above some thresh­old … but again, you’d be us­ing noth­ing from your ex­is­tence as such.

• The ar­gu­ment was “higher rate of en­tropy pro­duc­tion is cor­re­lated with more ob­servers, prob­a­bly. So we should ex­pect to find our­selves in chunks of re­al­ity that have high rates of en­tropy pro­duc­tion”

I guess it wasn’t just observers, but (non-reversible) computations.

ie, an­thropic rea­son­ing was the jus­tifi­ca­tion for us­ing the en­tropy pro­duc­tion crite­ria in the first place. Yes, there is a ques­tion of frac­tions of ob­servers that are con­scious, etc… but a uni­verse that can’t sup­port much in the way of ob­servers at all prob­a­bly can’t sup­port much in the way of con­scious ob­servers, while a uni­verse that can sup­port lots of ob­servers can prob­a­bly sup­port more con­scious ob­servers than the other, right?

Or did I mi­s­un­der­stand your point?

• Now I’m not un­der­stand­ing how your re­sponse ap­plies.

My point was: the en­tropic prin­ci­ple es­ti­mates the prob­a­bil­ity of ob­servers per unit vol­ume by us­ing the en­tropy per unit vol­ume. But this fol­lows im­me­di­ately from the sec­ond law and con­ser­va­tion of phase space; it’s nec­es­sar­ily true.

To the ex­tent that it as­signs a prob­a­bil­ity to a class that in­cludes us, it does a poor job, be­cause we make up a tiny frac­tion of the “ob­servers” (ap­pro­pri­ately defined) in the uni­verse.

• The situation is not identical in the non-anthropic case in that there are equal numbers of rooms but differing numbers of marbles.

There’s only one green room (so observing it is evidence for heads-green with p=0.5) whereas there are 18 green marbles, so p(heads|green) = ((18/20)/0.5)*0.5 = 0.9.

• Sorry for de­layed re­sponse.

Any­ways, how so? 20 rooms in the origi­nal prob­lem, 20 mar­bles in mine.

what frac­tion are green vs red de­rives from ex­am­in­ing a log­i­cal coin, etc etc etc… I’m not sure where you’re get­ting the only one green room thing.

• An AI that runs UDT wouldn’t con­clude that it was a Boltz­mann or non-Boltz­mann brain. For such an AI, the state­ment has no mean­ing, since it’s always both. The clos­est equiv­a­lent would be “Most of the value I can cre­ate by mak­ing the right de­ci­sion is con­cen­trated in the vicinity of non-Boltz­mann brains.”

BTW, does my in­dex­i­cal un­cer­tainty and the Ax­iom of In­de­pen­dence post make any more sense now?

• This was my take af­ter go­ing through a similar anal­y­sis (with ap­ples, not pa­per­clips) at the SIAI sum­mer in­tern pro­gram.

• It seems promising that several people are converging on the same “updateless” idea. But sometimes I wonder why it took so long, if it’s really the right idea, given the amount of brainpower spent on this issue. (Take a look at http://www.anthropic-principle.com/profiles.html and consider that Nick Bostrom wrote “Investigations into the Doomsday Argument” in 1996 and then did his whole Ph.D. on anthropic reasoning, culminating in a book published in 2002.)

BTW, weren’t the SIAI sum­mer in­terns sup­posed to try to write one LessWrong post a week (or was it a month)? What hap­pened to that plan?

• But some­times I won­der why it took so long, if it’s re­ally the right idea, given the amount of brain­power spent on this is­sue.

Peo­ple are crazy, the world is mad. Also in­vent­ing ba­sic math is a hell of a lot harder than read­ing it in a text­book af­ter­ward.

• Peo­ple are crazy, the world is mad.

I sup­pose you’re refer­ring to the fact that we are “de­signed” by evolu­tion. But why did evolu­tion cre­ate a species that in­vented the num­ber field sieve (to give a ran­dom piece of non-ba­sic math) be­fore UDT? It doesn’t make any sense.

Also in­vent­ing ba­sic math is a hell of a lot harder than read­ing it in a text­book af­ter­ward.

In what sense is it “hard”? I don’t think it’s hard in a com­pu­ta­tional sense, like NP-hard. Or is it? I guess it goes back to the ques­tion of “what al­gorithm are we us­ing to solve these types of prob­lems?”

• No, I’m refer­ring to the fact that peo­ple are crazy and the world is mad. You don’t need to reach so hard for an ex­pla­na­tion of why no one’s in­vented UDT yet when many-wor­lds wasn’t in­vented for thirty years.

• I also don’t think gen­eral mad­ness is enough of an ex­pla­na­tion. Both are coun­ter­in­tu­itive ideas in ar­eas with­out well-es­tab­lished meth­ods to ver­ify progress, e.g. build­ing a work­ing ma­chine or stan­dard math­e­mat­i­cal proof tech­niques.

• The OB/​LW/​SL4/​TOElist/​poly­math­list group is one in­tel­lec­tual com­mu­nity draw­ing on similar prior work that hasn’t been broadly dis­sem­i­nated.

The same arguments apply with much greater force to the causal decision theory vs evidential decision theory debate.

The in­terns wound up more fo­cused on their group pro­jects. As it hap­pens, I had told Katja Grace that I was go­ing to write up a post show­ing the differ­ence be­tween UDT and SIA (us­ing my ap­ples ex­am­ple which is iso­mor­phic with the ex­am­ple above), but in light of this post it seems need­less.

• UDT is ba­si­cally the bare defi­ni­tion of re­flec­tive con­sis­tency: it is a non-solu­tion, just state­ment of the prob­lem in con­struc­tive form. UDT says that you should think ex­actly the same way as the “origi­nal” you thinks, which guaran­tees that the origi­nal you won’t be dis­ap­pointed in your de­ci­sions (re­flec­tive con­sis­tency). It only looks good in com­par­i­son to other the­o­ries that fail this par­tic­u­lar re­quire­ment, but oth­er­wise are much more mean­ingful in their do­mains of ap­pli­ca­tion.

TDT fails re­flec­tive con­sis­tency in gen­eral, but offers a cor­rect solu­tion in a do­main that is larger than those of other prac­ti­cally use­ful de­ci­sion the­o­ries, while re­tain­ing their ex­pres­sivity/​effi­ciency (i.e. up­dat­ing on graph­i­cal mod­els).

• The OB/​LW/​SL4/​TOElist/​poly­math­list group is one in­tel­lec­tual com­mu­nity draw­ing on similar prior work that hasn’t been broadly dis­sem­i­nated.

What prior work are you refer­ring to, that hasn’t been broadly dis­sem­i­nated?

The same ar­gu­ments ap­ply with much greater force to the the causal de­ci­sion the­ory vs ev­i­den­tial de­ci­sion the­ory de­bate.

I think much less brain­power has been spent on CDT vs EDT, since that’s thought of as more of a tech­ni­cal is­sue that only pro­fes­sional de­ci­sion the­o­rists are in­ter­ested in. Like­wise, New­comb’s prob­lem is usu­ally seen as an in­tel­lec­tual cu­ri­os­ity of lit­tle prac­ti­cal use. (At least that’s what I thought un­til I saw Eliezer’s posts about the po­ten­tial link be­tween it and AI co­op­er­a­tion.)

An­thropic rea­son­ing, on the other hand, is widely known and dis­cussed (I re­mem­ber the Dooms­day Ar­gu­ment brought up dur­ing a ca­sual lunch-time con­ver­sa­tion at Microsoft), and thought to be both in­ter­est­ing in it­self and hav­ing im­por­tant ap­pli­ca­tions in physics.

The in­terns wound up more fo­cused on their group pro­jects.

I miss the ar­ti­cles they would have writ­ten. :) Maybe post the topic ideas here and let oth­ers have a shot at them?

• “What prior work are you refer­ring to, that hasn’t been broadly dis­sem­i­nated?”

I’m think­ing of the cor­pus of past posts on those lists, which bring cer­tain tools and con­cepts (Solomonoff In­duc­tion, an­thropic rea­son­ing, Pearl, etc) jointly to read­ers’ at­ten­tion. When those tools are com­bined and fo­cused on the same prob­lem, differ­ent fo­rum par­ti­ci­pants will tend to use them in similar ways.

• You might think that more top-notch economists and game the­o­rists would have ad­dressed New­comb/​TDT/​Hofs­tadter su­per­ra­tional­ity given their in­ter­est in the Pri­soner’s Dilemma.

Look­ing at the ac­tual liter­a­ture on the Dooms­day ar­gu­ment, there are some physi­cists in­volved (just as some economists and oth­ers have tried their hands at New­comb), but it seems like more philoso­phers. And an­throp­ics doesn’t seem core to pro­fes­sional suc­cess, e.g. Teg­mark can in­dulge in it a bit thanks to show­ing his stuff in ‘hard’ ar­eas of cos­mol­ogy.

• I just re­al­ized/​re­mem­bered that one rea­son that oth­ers haven’t found the TDT/​UDT solu­tions to New­comb/​an­thropic rea­son­ing may be that they were as­sum­ing a fixed hu­man na­ture, whereas we’re as­sum­ing an AI ca­pa­ble of self-mod­ifi­ca­tion. For ex­am­ple, economists are cer­tainly more in­ter­ested in an­swer­ing “What would hu­man be­ings do in PD?” than “What should AIs do in PD as­sum­ing they know each oth­ers’ source code?” And per­haps some of the an­thropic thinkers (in the list I linked to ear­lier) did in­vent some­thing like UDT, but then thought “Hu­man be­ings can never prac­tice this, I need to keep look­ing.”

• This post is an ar­gu­ment against vot­ing on your up­dated prob­a­bil­ity when there is a se­lec­tion effect such as this. It ap­plies to any ev­i­dence (mar­bles, ex­is­tence etc), but only in a spe­cific situ­a­tion, so has lit­tle to do with SIA, which is about whether you up­date on your own ex­is­tence to be­gin with in any situ­a­tion. Do you have ar­gu­ments against that?

• It’s for situ­a­tions in which differ­ent hy­pothe­ses all pre­dict that there will be be­ings sub­jec­tively in­dis­t­in­guish­able from you, which cov­ers the most in­ter­est­ing an­thropic prob­lems in my view. I’ll make some posts dis­t­in­guish­ing SIA, SSA, UDT, and ex­plor­ing their re­la­tion­ships when I’m a bit less busy.

• Are you say­ing this prob­lem arises in all situ­a­tions where mul­ti­ple be­ings in mul­ti­ple hy­pothe­ses make the same ob­ser­va­tions? That would sug­gest we can’t up­date on ev­i­dence most of the time. I think I must be mi­s­un­der­stand­ing you. Sub­jec­tively in­dis­t­in­guish­able be­ings arise in vir­tu­ally all prob­a­bil­is­tic rea­son­ing. If there were only one hy­poth­e­sis with one crea­ture like you, then all would be cer­tain.

The only in­ter­est­ing prob­lem in an­throp­ics I know of is whether to up­date on your own ex­is­tence or not. I haven’t heard a good ar­gu­ment for not (though I still have a few promis­ing pa­pers to read), so I am very in­ter­ested if you have one. Will ‘ex­plor­ing their re­la­tion­ships’ in­clude this?

• You can judge for your­self at the time.

• Well, we don’t want to build con­scious AIs, so of course we don’t want them to use an­thropic rea­son­ing.

Why is an­thropic rea­son­ing re­lated to con­scious­ness at all? Couldn’t any kind of Bayesian rea­son­ing sys­tem up­date on the ob­ser­va­tion of its own ex­is­tence (as­sum­ing such up­dates are a good idea in the first place)?

• Why do I think an­thropic rea­son­ing and con­scious­ness are re­lated?

In a nut­shell, I think sub­jec­tive an­ti­ci­pa­tion re­quires sub­jec­tivity. We hu­mans feel dis­satis­fied with a de­scrip­tion like “well, one sys­tem run­ning a con­tinu­a­tion of the com­pu­ta­tion in your brain ends up in a red room and two such sys­tems end up in green rooms” be­cause we feel that there’s this ex­tra “me” thing, whose fu­ture we need to ac­count for. We bother to ask how the “me” gets split up, what “I” should an­ti­ci­pate, be­cause we feel that there’s “some­thing it’s like to be me”, and that (un­less we die) there will be in fu­ture “some­thing it will be like to be me”. I sus­pect that the things I said in the pre­vi­ous sen­tence are at best con­fused and at worst non­sense. But the ques­tion of why peo­ple in­tuit crazy things like that is the philo­soph­i­cal ques­tion we la­bel “con­scious­ness”.

However, the feeling that there will in future be “something it will be like to be me”, and in particular that there will be one “something it will be like to be me”, if taken seriously, forces us to have subjective anticipation, that is, to write a probability distribution summing to one over which copy we end up as. Once you do that, if you wake up in a green room in Eliezer’s example, you are forced to update to 90% probability that the coin came up heads (provided you distributed your subjective anticipation evenly between all twenty copies in both the heads and tails scenarios, which really seems like the only sane thing to do).

Or, at least, the same amount of “some­thing it is like to be me”-ness as we started with, in some ill-defined sense.
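That 90% update can be checked with a direct Bayes computation (a minimal sketch; the 18/2 room counts are from the original post):

```python
# Posterior on the logical coin given "I woke up in a green room",
# spreading subjective anticipation evenly over all twenty copies.
prior_heads = 0.5
p_green_given_heads = 18 / 20  # heads: 18 green rooms, 2 red
p_green_given_tails = 2 / 20   # tails: 2 green rooms, 18 red

posterior_heads = (prior_heads * p_green_given_heads) / (
    prior_heads * p_green_given_heads + prior_heads * p_green_given_tails
)
print(posterior_heads)  # 0.9
```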

On the other hand, if you do not feel that there is any fact of the mat­ter as to which copy you be­come, then you just want all your copies to ex­e­cute what­ever strat­egy is most likely to get all of them the most money from your ini­tial per­spec­tive of ig­no­rance of the coin­flip.

Incidentally, the optimal strategy looks like a policy selected by updateless decision theory and not like any probability of the coin having been heads or tails. PlaidX beat me to the counter-example for p=50%. Counter-examples like PlaidX’s will work for any p<90%, and counter-examples like Eliezer’s will work for any p>50%, so that pretty much covers it. So, unless we want to include ugly hacks like responsibility, or unless we let the copies reason Goldenly (using Eliezer’s original TDT) about each other’s actions as transposed versions of their own actions (which does correctly handle PlaidX’s counter-example, but might break in more complicated cases where no isomorphism is apparent), there simply isn’t a probability-of-heads that represents the right thing for the copies to do no matter the deal offered to them.

• Con­scious­ness is re­ally just a name for hav­ing a model of your­self which you can re­flect on and act on—plus a whole bunch of other con­fused in­ter­pre­ta­tions which don’t re­ally add much.

To do an­thropic rea­son­ing you have to have a sim­ple model of your­self which you can rea­son about.

Machines can do this too, of course, with­out too much difficulty. That typ­i­cally makes them con­scious, though. Per­haps we can imag­ine a ma­chine perform­ing an­thropic rea­son­ing while dream­ing—i.e. when most of its ac­tu­a­tors are dis­abled, and it would not nor­mally be re­garded as be­ing con­scious. How­ever, then, how would we know about its con­clu­sions?

• I think I’m with Bostrom.

The prob­lem seems to come about be­cause the good effects of 18 peo­ple be­ing cor­rect are more than wiped out by the bad effects of 2 peo­ple be­ing wrong.

I’m sure this im­bal­ance in the power of the agents has some­thing to do with it.

• What if, in­stead of re­quiring agree­ment of all copies in a green room, one copy in a green room was cho­sen at ran­dom to make the choice?

• In this case the chosen copy in the green room should update on the anthropic evidence of being chosen to make the choice. That copy had a 1/18 probability of being chosen if the coin flip came up heads, and a 1/2 probability of being chosen if the coin flip came up tails, so the odds of heads:tails should be updated from 9:1 to 1:1. This exactly cancels the anthropic evidence of being in a green room.
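The cancellation can be made explicit with exact arithmetic (a quick sketch; the likelihoods come from the comment above):

```python
from fractions import Fraction as F

# Joint update on "I am in a green room AND I was the copy chosen to decide".
prior = F(1, 2)
like_heads = F(18, 20) * F(1, 18)  # P(green and chosen | heads) = 1/20
like_tails = F(2, 20) * F(1, 2)    # P(green and chosen | tails) = 1/20
posterior = prior * like_heads / (prior * like_heads + prior * like_tails)
print(posterior)  # 1/2
```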

• … or equiv­a­lently: you play a sep­a­rate game with ev­ery sin­gle copy in each green room...

In both cases, the anthropic update gives the right solution, as I mentioned in an earlier post. (And consequently, this demonstrates that the crux of the problem was in fact the collective nature of the decision.)

• They are not equivalent. If one green room copy is chosen at random, then the game will be played exactly once whether the coin flip resulted in heads or tails. But if every green room copy plays, then the game will be played 18 times if the coin came up heads and 2 times if the coin came up tails.

• Good point.

However, being chosen for the game (since the agent knows that in both cases exactly one copy will be chosen) also carries information, the same way as being in the green room. Therefore, by the same logic, it would imply an additional anthropic update: “Although I am in a green room, the fact that I am chosen to play the game makes it much less probable that the coin is heads.” So, by calculating the correct chances, he can deduce:

I am in a green room + I am chosen ⇒ P(heads) = 0.5

OTOH:

I am in a green room (not knowing whether chosen) ⇒ P(heads) = 0.9

[EDIT]: I just noted that you already ar­gued the same way, I have plainly over­looked it.

• Curses on this problem; I spent the whole day worrying about it, and am now so much of a wreck that the following may or may not make sense. For better or worse, I came to a conclusion similar to Psy-Kosh’s: that this could work in less anthropic problems. Here’s the equivalent I was using:

Imagine Omega has a coin biased so that it comes up the same way nine out of ten times. You know this, but you don’t know which way it’s biased. Omega allows you to flip the coin once, and asks for your probability that it’s biased in favor of heads. The coin comes up heads. You give your probability as 9/10.

Now Omega takes 20 peo­ple and puts them in the same situ­a­tion as in the origi­nal prob­lem. It lets each of them flip their coins. Then it goes to each of the peo­ple who got tails, and offers \$1 to char­ity for each coin that came up tails, but threat­ens to steal \$3 from char­ity for each coin that came up heads.

This nonanthropic problem works the same way as the original anthropic problem. If the coin is really biased heads, 18 people will get heads and 2 people will get tails. In this case, the correct subjective probability to assign is definitely 9/10 in favor of whatever result you got; after all, this is the correct probability when you’re the only person in the experiment, and just knowing that 19 other people are also participating in the experiment shouldn’t change matters.
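Both halves of this setup can be sketched numerically (my reading of the deal’s payoffs: $1 per tails coin, −$3 per heads coin, totalled per world before anyone flips):

```python
# Subjective posterior after your single flip comes up heads:
prior = 0.5
posterior_heads_bias = (prior * 0.9) / (prior * 0.9 + prior * 0.1)
print(posterior_heads_bias)  # 0.9

# Group-level expected value of the charity deal, taken over worlds:
# heads-biased world: 18 heads / 2 tails; tails-biased world: 2 heads / 18 tails.
ev = 0.5 * (2 * 1 - 18 * 3) + 0.5 * (18 * 1 - 2 * 3)
print(ev)  # -20.0
```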

I don’t have a for­mal an­swer for why this hap­pens, but I can think of one more ex­am­ple that might throw a lit­tle light on it. In an­other thread, some­one men­tioned that lot­tery win­ners have ex­cel­lent ev­i­dence that they are brains-in-a-vat and that the rest of the world is an illu­sion be­ing put on by the Dark Lord of the Ma­trix for their en­ter­tain­ment. After all, if this was true, it wouldn’t be too un­likely for them to win the lot­tery, so for a suffi­ciently large lot­tery, the chance of win­ning it this way ex­ceeds the chance of win­ning it through luck.

Sup­pose Bob has won the lot­tery and so be­lieves him­self to be a brain in a vat. And sup­pose that the ev­i­dence for the simu­la­tion ar­gu­ment is poor enough that there is no other good rea­son to be­lieve your­self to be a brain in a vat. Omega goes up to Bob and asks him to take a bet on whether he is a brain in a vat. Bob says he is, he loses, and Omega laughs at him. What did he do wrong? Noth­ing. Omega was just be­ing mean by speci­fi­cally ask­ing the one per­son whom ve knew would get the an­swer wrong.

Omega’s lit­tle prank would still work if ve an­nounced ver in­ten­tion to perform it be­fore­hand. Ve would say “When one of you wins the lot­tery, I will be ask­ing this per­son to take a bet whether they are a brain in a vat or not!” Every­one would say “That lot­tery win­ner shouldn’t ac­cept Omega’s bet. We know we’re not brains in vats.” Then some­one wins the lot­tery, Omega asks if they’re a brain in a vat, and they say yes, and Omega laughs at them (note that this also works if we con­sider a coin with a bias such that it lands the same way 999999 out of a mil­lion times, let a mil­lion peo­ple flip it once, and ask peo­ple what they think the coin’s bias is, ask­ing the peo­ple who get the counter-to-ex­pec­ta­tions re­sult more of­ten than chance.)

Omega’s be­ing equally mean in the origi­nal prob­lem. There’s a 50% chance ve will go and ask the two out of twenty peo­ple who are speci­fi­cally most likely to be wrong and can’t do any­thing about it. The best course I can think of would be for ev­ery­one to swear an oath not to take the offer be­fore they got as­signed into rooms.

• Then some­one wins the lot­tery, Omega asks if they’re a brain in a vat, and they say yes, and Omega laughs at them

By as­sump­tion, if the per­son is right to be­lieve they’re in a sim, then most of the lot­tery win­ners are in sims, so while Omega laughs at them in our world, they win the bet with Omega in most of their wor­lds.

wrong and can’t do any­thing about it

should have been your clue to check fur­ther.

• This is a fea­ture of the origi­nal prob­lem, isn’t it?

Let’s say there are 1000 brains in vats, each in their own lit­tle world, and a “real” world of a billion peo­ple. The chance of a vat-brain win­ning the lot­tery is 1, and the chance of a real per­son win­ning the lot­tery is 1 in a mil­lion. There are 1000 real lot­tery win­ners and 1000 vat lot­tery win­ners, so if you win the lot­tery your chance of be­ing in a vat is 50-50. How­ever, if you look at any par­tic­u­lar world, the chances of this week’s sin­gle lot­tery win­ner be­ing a brain in a vat is 1000/​1001.

Assume the original problem is run multiple times in multiple worlds, and that the value of pi somehow differs in those worlds (probably you used pi precisely so people couldn’t do this, but bear with me). Of all the people who wake up in green rooms, 18/20 of them will be right to take your bet. However, in each particular world, the chances of the green room people being right to take the bet is 1/2.

In this situ­a­tion there is no para­dox. Most of the peo­ple in the green rooms come out happy that they took the bet. It’s only when you limit it to one uni­verse that it be­comes a prob­lem. The same is true of the lot­tery ex­am­ple. When re­stricted to a sin­gle (real, non-vat) uni­verse, it be­comes more trou­ble­some.
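The per-person vs. per-world split can be seen in a small simulation of the many-worlds version (a sketch; room counts as in the original problem):

```python
import random

random.seed(0)
person_wins = person_total = world_wins = worlds = 0
for _ in range(100_000):
    heads = random.random() < 0.5  # fair logical coin per world
    greens = 18 if heads else 2    # green-roomers in that world
    # every green-roomer bets "heads" (i.e. that there are 18 greens)
    if heads:
        person_wins += greens
        world_wins += 1
    person_total += greens
    worlds += 1

print(person_wins / person_total)  # ~0.9: most green-roomers win their bet
print(world_wins / worlds)         # ~0.5: per world, it's still a coin flip
```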

• Now Omega takes 20 peo­ple and puts them in the same situ­a­tion as in the origi­nal prob­lem. It lets each of them flip their coins. Then it goes to each of the peo­ple who got tails, and offers \$1 to char­ity for each coin that came up tails, but threat­ens to steal \$3 from char­ity for each coin that came up heads.

It’s worth not­ing that if ev­ery­one got to make this choice sep­a­rately—Omega do­ing it once for each per­son who re­sponds—then it would in­deed be wise for ev­ery­one to take the bet! This is ev­i­dence in fa­vor of ei­ther Bostrom’s di­vi­sion-of-re­spon­si­bil­ity prin­ci­ple, or byrnema’s poin­ter-based view­point, if in­deed those two views are nonequiv­a­lent.

• EDIT: Never mind

• Bostrom’s calcu­la­tion is cor­rect, but I be­lieve it is an ex­am­ple of mul­ti­ply­ing by the right co­effi­cients for the wrong rea­sons.

I did exactly the same thing (multiplied by the right coefficients for the wrong reasons) in my deleted comment. I realized that the justification of these coefficients required a quite different problem (in my case, I modeled that all the green roomers decided to evenly divide the spoils of the whole group), and the only reason it worked was that multiplying the first term by 1/18 and the next term by 1/2 meant you were effectively canceling away the factors that represented your initial 90% posterior, and thus ultimately just applying the 50/50 probability of the non-anthropic solution.

An­thropic calcu­la­tion:

18/20 × (12) + 2/20 × (−52) = 5.6

Bostrom-mod­ified calcu­la­tion for re­spon­si­bil­ity per per­son:

[18/20 × (12)/18 + 2/20 × (−52)/2] / 2 = −1

Non-an­thropic calcu­la­tion for EV per per­son:

[1/2 × (12) + 1/2 × (−52)] / 20 = −1
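The three calculations can be reproduced side by side with exact arithmetic (a sketch; the per-world totals +12 and −52 are the payoffs from the original post):

```python
from fractions import Fraction as F

heads_total = 18 * 1 - 2 * 3  # +12: pay 18 greens $1, take $3 from 2 reds
tails_total = 2 * 1 - 18 * 3  # -52: pay 2 greens $1, take $3 from 18 reds

anthropic = F(18, 20) * heads_total + F(2, 20) * tails_total
bostrom = (F(18, 20) * heads_total / 18 + F(2, 20) * tails_total / 2) / 2
nonanthropic = (F(1, 2) * heads_total + F(1, 2) * tails_total) / 20

print(anthropic)     # 28/5  (= 5.6)
print(bostrom)       # -1
print(nonanthropic)  # -1
```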

My pointer-based viewpoint, in contrast, is not a calculation but a rationale for why you must use the 50/50 probability rather than the 90/10 one. The argument is that each green roomer cannot use the information that they were in a green room, because this information was preselected (a biased sample). With effectively no information about what color room they’re in, each green roomer must resort to the non-anthropic calculation that the probability of flipping heads is 50%.

• I can very much re­late to Eliezer’s origi­nal gut re­ac­tion: I agree that Nick’s calcu­la­tion is very ad hoc and hardly jus­tifi­able.

How­ever, I also think that, al­though you are right about the poin­ter bias, your ex­pla­na­tion is still in­com­plete.

I think Psy-Kosh made an important step with his reformulation. Especially eliminating the copying procedure for the agents was essential. If you follow through the math from the point of view of one of the agents, the nature of the problem becomes clear:

Trying to write down the payoff matrix from the viewpoint of one of the agents, it becomes clear that you can’t fill out any of the reward entries, since the outcome never depends on that agent’s decision alone. If he got a green marble, it still depends on other agents’ decisions, and if he drew a red one, it depends only on other agents’ decisions.

This makes it completely clear that the only solution is for the agents to agree on a predetermined protocol, and therefore the second calculation of the OP is the only correct one so far.

However, this protocol does not imply anything about P(heads | being in a green room). It is simply irrelevant to the expected value of any agreed-upon protocol. One could create a protocol that depends on P(heads | being in a green room) for some of the agents, but you would have to analyze the expected value of the protocol from a global point of view, not just from the point of view of the agent, since you can’t complete the decision matrix if the outcome depends on other agents’ decisions as well.

Of course a pre­de­ter­mined pro­to­col does not mean that the agents must ex­plic­itly agree on a nar­row pro­to­col be­fore the ac­tion. If we as­sume that the agents get all the in­for­ma­tion once they find them­selves in the room, they could still cre­ate a men­tal model of the whole global situ­a­tion and base their de­ci­sion on the sec­ond calcu­la­tion of the OP.

• I agree with you that the reason you can’t use the 90/10 prior is that the decision never depends on a person in a red room.

In Eliezer’s de­scrip­tion of the prob­lem above, he tells each green roomer that he asks all the green roomers if they want him to go ahead with a money dis­tri­bu­tion scheme, and they must be unan­i­mous or there is a penalty.

I think this is a nice pedagogical component that helps a person understand the dilemma, but I would like to emphasize here (even if you’re aware of it) that it is completely superfluous to the mechanics of the problem. It doesn’t make any difference whether Eliezer bases his action on the answer of one green roomer or all of them.

For one thing, all green roomer an­swers will be unan­i­mous be­cause they all have the same in­for­ma­tion and are asked the same com­pli­cated ques­tion.

And, more to the point, even if just one green roomer is asked, the dilemma still ex­ists that he can’t use his prior that heads was prob­a­bly flipped.

• Agreed 100%.

[EDIT:] Although I would be a bit more general, regardless of red rooms: if you have several actors, even if they necessarily make the same decision, they have to analyze the global picture. The only situation where the agent should be allowed to make the simplified subjective Bayesian decision-table analysis is if he is the only actor (no copies, etc.). It is easy to construct simple decision problems without “red rooms”, where each of the actors has some control over the outcome and none of them can make the analysis for itself alone, but each has to build a model of the whole situation to make the globally optimal decision.

However, I did not imply in any way that the penalty matters. (At least, as long as the agents are sane and don’t start to flip non-logical coins.) The global analysis of the payoff may clearly disregard the penalty case if it’s impossible for that specific protocol. The only requirement is that the expected value calculation must be made on a protocol-by-protocol basis.

• “I’ve made sac­ri­fices! You don’t know what it cost me to climb into that ma­chine ev­ery night, not know­ing if I’d be the man in the box or in the pres­tige!”

sorry- couldn’t help my­self.

• “I’ve made sac­ri­fices! You don’t know what it cost me to climb into that ma­chine ev­ery night, not know­ing if I’d be the man in the box or in the pres­tige!”

You know, I never could make sense out of that line. If you as­sume the ma­chine cre­ates “copies” (and that’s strongly im­plied by the story up to that point), then that means ev­ery time he gets on stage, he’s go­ing to wind up in the box. (And even if the copies are er­ror-free and ab­solutely in­ter­change­able, one copy will still end up in the box.)

(Edit to add: of course, if you view it from the quan­tum suicide POV, “he” never ends up in the box, since oth­er­wise “he” would not be there to try again the next night.)

• More think­ing out loud:

It really is in your best interest to accept the offer after you’re in a green room. It really is in your best interest to accept the offer conditional on being in a green room, before you’re assigned. Maybe part of the problem arises because you think your decision will influence the decision of others, i.e. because you’re acting like a timeless decision agent.

Replace “me” with “anyone with my platonic computation”, and “I should accept the offer conditional on being in a green room” with “anyone with my platonic computation should accept the offer, conditional on anyone with my platonic computation being in a green room.” But the chance of someone with my platonic computation being in a green room is 100%. Or, to put it another way, the Platonic Computation is wondering “Should I accept the offer conditional on any one of my instantiations being in a green room?”. But the Platonic Computation knows that at least one of its instantiations will be in a green room, so it declines the offer.

If the Platonic Computation were really a single organism, its best option would be to single out one of its instantiations beforehand and decide “I will accept the offer, given that Instantiation 6 is in a green room”. But since most instantiations of the computation can’t know the status of Instantiation 6 when they decide, it doesn’t have this option.

• Yes, ex­actly.

If you are in a green room and some­one asks you if you will bet that a head was flipped, you should say “yes”.

How­ever, if that same per­son asks you if they should bet that heads was flipped, you should an­swer no if you as­cer­tain that they asked you on the pre­con­di­tion that you were in a green room.

• the probability of heads | you are in a green room = 90%

• the probability of you betting on heads | you are in a green room = 100% = no information about the coin flip

• Your first claim needs qual­ifi­ca­tions: You should only bet if you’re be­ing drawn ran­domly from ev­ery­one. If it is known that one ran­dom per­son in a green room will be asked to bet, then if you wake up in a green room and are asked to bet you should re­fuse.

P(Heads | you are in a green room) = 0.9. P(Being asked | Heads and Green) = 1/18, P(Being asked | Tails and Green) = 1/2. Hence P(Heads | you are asked in a green room) = 0.5.

Of course the OP doesn’t choose a ran­dom in­di­vi­d­ual to ask, or even a ran­dom in­di­vi­d­ual in a green room. The OP asks all peo­ple in green rooms in this world.

If there is con­fu­sion about when your de­ci­sion al­gorithm “chooses”, then TDT/​UDT can try to make the lat­ter two cases equiv­a­lent, by think­ing about the “other choices I force”. Of course the fact that this as­serts some va­ri­ety of choice for a spe­cial in­di­vi­d­ual and not for oth­ers, when the situ­a­tion is sym­met­ric, sug­gests some­thing is be­ing missed.

What is be­ing missed, to my mind, is a dis­tinc­tion be­tween the dis­tri­bu­tion of (ran­dom in­di­vi­d­u­als | data is ob­served), and the dis­tri­bu­tion of (ran­dom wor­lds | data is ob­served).

In the OP, the latter distribution isn’t altered by the update, as the observed data occurs somewhere with probability 1 in both cases. The former is altered, because it cares about the number of copies in the two cases.

• I’ve been watch­ing for a while, but have never com­mented, so this may be hor­ribly flawed, opaque or oth­er­wise un­helpful.

I think the prob­lem is en­tirely caused by the use of the wrong sets of be­lief, and that any­thing hold­ing to Eliezer’s 1-line sum­mary of TDT or al­ter­na­tively UDT should get this right.

Sup­pose that you’re a ra­tio­nal agent. Since you are in­stan­ti­ated in mul­ti­ple iden­ti­cal cir­cum­stances (green rooms) and asked iden­ti­cal ques­tions, your an­swers should be iden­ti­cal. Hence if you wake up in a green room and you’re asked to steal from the red rooms and give to the green rooms, you ei­ther com­mit a group of 2 of you to a loss of 52 or com­mit a group of 18 of you to a gain of 12.

This committal is what you wish to optimise over from TDT/UDT, and clearly this requires knowledge about the likelihood of different decision-making groups. The distribution of sizes of random groups is not the same as the distribution of sizes of groups that a random individual is in. The probabilities of being in a group are upweighted by the size of the group and normalised. This is why Bostrom’s suggested 1/n split of responsibility works; it reverses the belief about where a random individual is in a set of decision-making groups to a belief about the size of a random decision-making group.

By the construction of the problem, the probability that a random (group of all the people in green rooms) has size 18 is 0.5, and similarly for 2 the probability is 0.5. Hence the expected utility is (0.5 × 12) + (0.5 × −52) = −20.

If you’re asked to ac­cept a bet on there be­ing 18 peo­ple in green rooms, and you’re told that only you’re be­ing offered it, then the de­ci­sion com­mits ex­actly one in­stance of you to a spe­cific loss or gain, re­gard­less of the group you’re in. Hence you can’t do bet­ter than the 0.9 and 0.1 be­liefs.

If you’re told that the bet is be­ing offered to ev­ery­one in a green room, then you are com­mit­ting to n times the out­come in any group of n peo­ple. In this case gains are con­di­tional on group size, and so you have to use the 0.5-0.5 be­lief about the dis­tri­bu­tion of groups. It doesn’t mat­ter be­cause the larger groups have the larger mul­ti­plier and thus shut­ting up and mul­ti­ply­ing yields the same an­swers as a sin­gle-shot bet.

ETA: At some level this is just choos­ing an op­ti­mal out­put for your calcu­la­tion of what to do, given that the re­sult is used vari­ably widely.
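The size-weighting described above can be sketched directly (group sizes and group payoffs as in this thread):

```python
p_world = {18: 0.5, 2: 0.5}  # P(the green-room group has this size)

# A random *individual* lands in a group with probability proportional
# to p_world[size] * size (size-weighted), recovering the 0.9/0.1 split:
weights = {n: p_world[n] * n for n in p_world}
z = sum(weights.values())
p_individual = {n: w / z for n, w in weights.items()}
print(p_individual)  # {18: 0.9, 2: 0.1}

# A group bet commits all n members at once, so its EV uses p_world:
group_gain = {18: 12, 2: -52}
ev = sum(p_world[n] * group_gain[n] for n in p_world)
print(ev)  # -20.0
```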

• This com­mit­tal is what you wish to op­ti­mise over from TDT/​UDT, and clearly this re­quires knowl­edge about the likely­hood of differ­ent de­ci­sion mak­ing groups.

I was in­fluenced by the OP and used to think that way. How­ever I think now, that this is not the root prob­lem.

What if the agents get more com­pli­cated de­ci­sion prob­lems: for ex­am­ple, re­wards de­pend­ing on the par­ity of the agents vot­ing cer­tain way, etc.?

I think, what es­sen­tial is that the agents have to think globally (cat­e­gor­i­cal im­per­a­tive, hmmm?)

Practically: if the agent recognizes that there is a collective decision, then it should model all available conceivable protocols (while making a priori sure that all cooperating agents perform the same or compatible analysis, if they can’t communicate), and then they should choose the protocol with the best overall total gain. In the case of the OP: the second calculation in the OP. (Not messing around with correction factors based on responsibilities, etc.)

Spe­cial con­sid­er­a­tions based on group sizes etc. may be in­ci­den­tally cor­rect in cer­tain situ­a­tions, but this is just not gen­eral enough. The crux is that the ul­ti­mate test is sim­ply the ex­pected value com­pu­ta­tion for the pro­to­col of the whole group.

• Between non-communicating copies of your decision algorithm, it’s forced that every instance comes to the same answers/distributions to all questions, as otherwise Eliezer can make money betting between different instances of the algorithm. It’s not really a categorical imperative, beyond demanding consistency.

The crux of the OP is ask­ing for a prob­a­bil­ity as­sess­ment of the world, not whether the DT func­tions.

I’m not postulating 1/n allocation of responsibility; I’m stating that the source of the confusion is conflating P(a random individual is in a world of class A_i | Data) with P(a random world is of class A_i | Data), and that these are not equal if the number of individuals with access to Data differs between distinct classes of world.

Hence in this case, there are 2 classes of world, A_1 with 18 Green rooms and 2 Reds, and A_2 with 2 Green rooms and 18 Reds.

P(Random individual is in the A_1 class | Woke up in a green room) = 0.9, by anthropic update. P(Random world is in the A_1 class | Some individual woke up in a green room) = 0.5.

Why? Because in A_1, 18/20 of individuals fit the description “Woke up in a green room”, but in A_2 only 2/20 do.

The crux of the OP is that neither a 90/10 nor a 50/50 split seems acceptable, if betting on “which world-class an individual in a green room is in” and “which world-class the (set of all individuals in green rooms which contains this individual) is in” are identical. I assert that they are not. The first case is 0.9/0.1 A_1/A_2; the second is 0.5/0.5 A_1/A_2.

Consider a similar question where a random green room will be asked. If you’re in that room, you update both on (green walls) and (I’m being asked), and recover the 0.5/0.5, correctly. This is close to the OP if we wildly assert that you and only you have free will and force the others; then you are special. Equally, in cases where everyone is asked and plays separately, you have 18 or 2 times the benefits depending on whether you’re in A_1 or A_2.

If each in­di­vi­d­ual Green room played sep­a­rately, then you up­date on (Green walls), but P(I’m be­ing asked|Green) = 1 in ei­ther case. This is bet­ting on whether there are 18 peo­ple in green rooms or 2, and you get the cor­rect 0.9/​0.1 split. To re­pro­duce the OP the offers would need to be +1/​18 to Greens and −3/​18 from Reds in A_1, and +1/​2 to Greens and −3/​2 from Reds in A_2, and then you’d re­fuse to play, cor­rectly.

• And how would they de­cide which pro­to­col had the best over­all to­tal gain? For in­stance, could you define a pro­to­col com­plex­ity mea­sure, and then use this com­plex­ity mea­sure to de­cide? And are you even deal­ing with or­di­nary Bayesian rea­son­ing any more, or is this the first hint of some new more gen­eral type of ra­tio­nal­ity?

MJG—The Black Swan is Near!

• It’s not about com­plex­ity, it is just ex­pected to­tal gain. Sim­ply the sec­ond calcu­la­tion of the OP.

I just argued that the second calculation is right, and that it is what the agents should do in general (unless they are completely egoistic about their particular copies).

• This was a sim­ple situ­a­tion. I’m sug­gest­ing a ‘big pic­ture’ idea for the gen­eral case.

According to Wei Dai and Nesov above, the anthropic-like puzzles can be re-interpreted as ‘agent co-ordination’ problems (multiple agents trying to coordinate their decision making). And you seemed to have a similar interpretation. Am I right?

If Dai and Nesov’s interpretation is right, it seems the puzzles could be reinterpreted as being about groups of agents trying to agree in advance on a ‘decision making protocol’.

But now I ask is this not equiv­a­lent to try­ing to find a ‘com­mu­ni­ca­tion pro­to­col’ which en­ables them to best co­or­di­nate their de­ci­sion mak­ing? And rather than try­ing to di­rectly calcu­late the re­sults of ev­ery pos­si­ble pro­to­col (which would be im­prac­ti­cal for all but sim­ple prob­lems), I was sug­gest­ing try­ing to use in­for­ma­tion the­ory to ap­ply a com­plex­ity mea­sure to pro­to­cols, in or­der to rank them.

In­deed I ask whether this is ac­tu­ally the cor­rect way to in­ter­pret Oc­cam’s Ra­zor/​Com­plex­ity Pri­ors? i.e, My sug­ges­tion is to re-in­ter­pret Oc­cam/​Pri­ors as refer­ring to copies of agents try­ing to co-or­di­nate their de­ci­sion mak­ing us­ing some com­mu­ni­ca­tion pro­to­col, such that they seek to min­i­mize the com­plex­ity of this pro­to­col.

• “Hence if you wake up in a green room and you’re asked to steal from the red rooms and give to the green rooms, you ei­ther com­mit a group of 2 of you to a loss of 52 or com­mit a group of 18 of you to a gain of 12.”

In the ex­am­ple you care equally about the red room and green room dwellers.

• Hence if there are 2 instances of your decision algorithm in Green rooms, there are 2 runs of your decision algorithm, and if they vote to steal there is a loss of 3 from each red and a gain of 1 for each green, for a total gain of 1×2 − 3×18 = −52.

If there are 18 instances in Green rooms, there are 18 runs of your decision algorithm, and if they vote to steal there is a loss of 3 from each red and a gain of 1 for each green, for a total gain of 1×18 − 3×2 = 12.

The “committal of a group of” is noting that there are 2 or 18 runs of your decision algorithm that are logically forced by the decision made by this specific instance of the decision algorithm in a green room.
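The two totals can be sketched in a couple of lines (using the +\$1/−\$3 amounts from the post):

```python
# Total group gain if the green rooms vote to steal:
# +$1 to each green room, -$3 from each red room.
def total_gain(n_green, n_red, gain=1, loss=3):
    return n_green * gain - n_red * loss

print(total_gain(18, 2))  # 12   (world A_1: 18 greens, 2 reds)
print(total_gain(2, 18))  # -52  (world A_2: 2 greens, 18 reds)
```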

• You can’t re­ject the con­clu­sion that you are a Boltz­mann brain—but if you are, it doesn’t mat­ter what you do, so the idea doesn’t seem to have much im­pact on de­ci­sion the­ory.

• Again: how can you talk about con­clud­ing that you are a Boltz­mann brain? To con­clude means to up­date, and here you re­fuse up­dat­ing.

• I read this and told my­self that it only takes five min­utes to have an in­sight. Five min­utes later, here’s what I’m think­ing:

Anthropic reasoning is confusing because it treats consciousness as a primitive. By doing so, we’re committing LW’s ultimate no-no: assuming an ontologically fundamental mental state. We need to find a way to reformulate anthropic reasoning in terms of Solomonoff induction. If we can successfully do so, the paradox will dissolve.

• An­thropic rea­son­ing is con­fus­ing—prob­a­bly be­cause we are not used to do­ing it much in our an­ces­tral en­vi­ron­ment.

I don’t think you can ar­gue it treats con­scious­ness as a prim­i­tive, though. An­thropic rea­son­ing is challeng­ing—but not so tricky that ma­chines can’t do it.

• It in­volves calcu­lat­ing a ‘cor­rect mea­sure’ of how many par­tial du­pli­cates of a com­pu­ta­tion ex­ist:

www.nickbostrom.com/papers/experience.pdf

An­throp­ics does in­volve mag­i­cal cat­e­gories.

• Right—but that’s “Arthur C. Clarke-style magic”—stuff that is complicated and difficult—not the type of magic associated with mystical mumbo-jumbo.

We can live with some of the former type of magic—and it might even spice things up a bit.

• need to find a way to reformulate anthropic reasoning in terms of Solomonoff induction

• I fail to see how Solomonoff can reduce ontologically basic mental states.

• There are lots of or­di­nary ex­am­ples in game the­ory of time in­con­sis­tent choices. Once you know how to re­solve them, then if you can’t use those ap­proaches to re­solve this I might be con­vinced that an­thropic up­dat­ing is at fault. But un­til then I think you are mak­ing a huge leap to blame an­thropic up­dat­ing for the time in­con­sis­tent choices.

• Robin, you’re jump­ing into the mid­dle of a big ex­tended dis­cus­sion. We’re not only blam­ing an­thropic up­dat­ing, we’re blam­ing Bayesian up­dat­ing in gen­eral, and propos­ing a de­ci­sion the­ory with­out it (Up­date­less De­ci­sion The­ory, or UDT). The ap­pli­ca­tion to an­thropic rea­son­ing is just that, an ap­pli­ca­tion.

UDT seems to solve all cases of time in­con­sis­tency in de­ci­sion prob­lems with one agent. What UDT agents do in multi-player games is still an open prob­lem that we’re work­ing on. There was an ex­ten­sive dis­cus­sion about it in the pre­vi­ous threads if you want to see some of the is­sues in­volved. But the key in­gre­di­ent that is miss­ing is a the­ory of log­i­cal un­cer­tainty, that tells us how differ­ent agents (or more gen­er­ally, com­pu­ta­tional pro­cesses) are log­i­cally cor­re­lated to each other.

• The or­di­nary time in­con­sis­ten­cies in game the­ory are all re­gard­ing mul­ti­ple agents. Seems odd to sug­gest you’ve solved the prob­lem ex­cept for those cases.

• Not ex­actly the way I would phrase it, but Time­less De­ci­sion The­ory and Up­date­less De­ci­sion The­ory be­tween them have already kil­led off a suffi­ciently large num­ber of time in­con­sis­ten­cies that treat­ing any re­main­ing ones as a Prob­lem seems well jus­tified. Yes, we have solved all or­di­nary dy­namic in­con­sis­ten­cies of con­ven­tional game the­ory already!

• Let’s take the sim­ple case of time in­con­sis­tency re­gard­ing pun­ish­ment. There is a two stage game with two play­ers. First A de­cides if to cheat B for some gain. Then B de­cides if to pun­ish A at some cost. Be­fore the game B would like to com­mit to pun­ish­ing A if A cheats, but once A has already cheated, B would rather not pun­ish.

• In UDT, we blame this time in­con­sis­tency on B’s up­dat­ing on A hav­ing cheated (i.e. treat­ing it as a fact that can no longer be al­tered). Sup­pose it’s com­mon knowl­edge that A can simu­late or ac­cu­rately pre­dict B, then B should rea­son that by de­cid­ing to pun­ish, it in­creases the prob­a­bil­ity that A would have pre­dicted that B would pun­ish and thus de­creases the prob­a­bil­ity that A would have cheated.

But the prob­lem is not fully solved, be­cause A could rea­son the same way, and de­cide to cheat no mat­ter what it pre­dicts that B does, in the ex­pec­ta­tion that B would pre­dict this and see that it’s pointless to pun­ish.

So UDT seems to elimi­nate time-in­con­sis­tency, but at the cost of in­creas­ing the num­ber of pos­si­ble out­comes, es­sen­tially turn­ing games with se­quen­tial moves into games with si­mul­ta­neous moves, with the at­ten­dant in­crease in the num­ber of Nash equil­ibria. We’re try­ing to work out what to do about this.

• So UDT seems to elimi­nate time-in­con­sis­tency, but at the cost of in­creas­ing the num­ber of pos­si­ble out­comes, es­sen­tially turn­ing games with se­quen­tial moves into games with si­mul­ta­neous moves, with the at­ten­dant in­crease in the num­ber of Nash equil­ibria. We’re try­ing to work out what to do about this.

Er, turn­ing games with se­quen­tial moves into games with si­mul­ta­neous moves is stan­dard in game the­ory, and “never cheat, always pun­ish cheat­ing” and “always cheat, never pun­ish” are what are con­sid­ered the Nash equil­ibria of that game in stan­dard par­lance. [ETA: Well, “never cheat, pun­ish x% of the time” will also be a NE for large enough x.] It is sub­game perfect equil­ibrium that rules out “never cheat, always pun­ish cheat­ing” (the set of all SPE of a se­quen­tial game is a sub­set of the set of all NE of that game).

• Yeah, I used the wrong ter­minol­ogy in the grand­par­ent com­ment. I guess the right way to put it is that SPE/​back­wards in­duc­tion no longer seems rea­son­able un­der UDT and it’s un­clear what can take its place, as far as re­duc­ing the num­ber of pos­si­ble solu­tions to a given game.

• It is sub­game perfect equil­ibrium that rules out “never cheat, always pun­ish cheat­ing” (the set of all SPE of a se­quen­tial game is a sub­set of the set of all NE of that game).

How strictly do you (or the stan­dard ap­proach) mean to rule out op­tions that aren’t good on all parts of the game? It seems like some­times you do want to do things that are sub­game sub­op­ti­mal.

Edit: or at least be known to do things, which un­for­tu­nately can re­quire ac­tu­ally be­ing pre­pared to do the things.

• Well, the clas­si­cal game the­o­rist would re­ply that they’re study­ing one-off games, in which the game you’re cur­rently play­ing doesn’t af­fect any pay­off you get out­side that game (oth­er­wise that should be made part of the game), so you can’t be do­ing the pun­ish­ment be­cause you want to be known to be a pun­isher, or the game that Robin speci­fied doesn’t model the situ­a­tion you’re in. The clas­si­cal game the­o­rist as­sumes you can’t look into peo­ple’s heads, so what­ever you say or do be­fore the cheat­ing, you’re always free to not pun­ish dur­ing the pun­ish­ment round (as you’re un­doubt­edly aware, mu­tual check­ing of source code is pro­hibited by an­titrust laws in over 185 coun­tries).

The classical game theorist would further point out that if you do want to model the fact that punishment helps you be known as a punisher, then you should use their theory of repeated games, where they have some folk theorems for you saying that lots and lots of things can be Nash equilibria, e.g. in a game where after each round there is a fixed probability of another round; for example, cooperation in the prisoner’s dilemma, but also all sorts of suboptimal outcomes (which become Nash equilibria because any deviator gets punished as badly as the other players can punish them).

I should point out that not all clas­si­cal game the­o­rists think that SPE makes par­tic­u­larly good pre­dic­tions, though; I’ve read some­one say, I think Bin­more, that you ex­pect to vir­tu­ally always see a NE in the lab­o­ra­tory af­ter a learn­ing pe­riod, but not an SPE, and that the origi­nal in­ven­tor of SPE ac­tu­ally came up with it as an ex­am­ple of what you would not ex­pect to see in the lab, or some­thing to that tune. (Sorry, I should re­ally chase down that refer­ence, but I don’t have time right now. I’ll try to re­mem­ber to do that later. ETA: Ok, Bin­more and Shaked, 2010: Ex­per­i­men­tal Eco­nomics: Where Next? Jour­nal of Eco­nomic Be­hav­ior & Or­ga­ni­za­tion, 73: 87-100. See the stuff about back­ward in­duc­tion, start­ing at the bot­tom on p.88. The in­ven­tor of SPE is Rein­hard Selten, and the claim is that he didn’t be­lieve it would pre­dict what you see it in the lab and “[i]t was to demon­strate this fact that he en­couraged Werner Güth (...) to carry out the very first ex­per­i­ment on the Ul­ti­ma­tum game”, not that he in­vented SPE for this pur­pose.)

• so what­ever you say or do be­fore the cheat­ing, you’re always free to not pun­ish dur­ing the pun­ish­ment round

In­ter­est­ing. This idea, used as an ar­gu­ment for SPE, seems to be the free will de­bate in­trud­ing into de­ci­sion the­ory. “Only some of these al­gorithms have free­dom, and oth­ers don’t, and hu­mans are free, so they should be­have like the free al­gorithms.” This ei­ther ig­nores, or ac­cepts, the fact that the “free” al­gorithms are just as de­ter­minis­tic as the “un­free” al­gorithms. (And it de­pends on other stuff, but that’s not the fun bit)

(as you’re un­doubt­edly aware, mu­tual check­ing of source code is pro­hibited by an­titrust laws in over 185 coun­tries).

:D

• Hm, I may not quite have gotten the point across: I think you may be thinking of the argument that humans have free will, so they can’t force future versions of themselves to do something that would be against that future version’s interests given its information, but that isn’t the argument I was trying to explain. The idea I was referring to works precisely the same way with deterministic algorithms, as long as the players only get to observe each others’ actions, not each others’ source (though of course its proponents don’t think in those terms). The point is that if the other player looks at you severely and suggestively taps their baseball bat and tells you about how they’ve beaten up people who have defected in the past, that still doesn’t mean that they’re actually going to beat you up—since if such threats were effective on you, then making them would be the smart thing to do even if the other player has no intention of actually beating you up (and risking going to jail) if for some reason you end up defecting. (Compare AI-in-the-box...) (Of course, this argument only works if you’re reasonably sure that the other player is a classical game theorist; if you think you might be playing against someone who will, “irrationally”, actually punish you, like a timeless decision theorist, then you should not defect, and they won’t have to punish you...)

Now, if you had ac­tual in­for­ma­tion about what this player had done in similar situ­a­tions in the past, like po­lice re­ports of beaten-up defec­tors, this ar­gu­ment wouldn’t work, but then (the stan­dard ar­gu­ment con­tinues) you have the wrong game-the­o­ret­i­cal model; the cor­rect model in­cludes all of the pun­isher’s pre­vi­ous in­ter­ac­tions, and in that game, it might well be a SPE to pun­ish. (Though only if the ex­act num­ber of “rounds” is not cer­tain, for the same rea­son as in the finitely iter­ated Pri­soner’s Dilemma: in the last round the pun­isher has no more rea­son to pun­ish be­cause there are no fu­ture tar­gets to im­press, so you defect no mat­ter what they did in pre­vi­ous rounds, so they have no rea­son to pun­ish in the sec­ond-to-last round, etc.)

(BTW: refer­ence added to grand­par­ent.)

• I think you may be thinking of the argument that humans have free will, so they can’t force future versions of themselves to do something that would be against that future version’s interests given its information

That is not what I was think­ing of. Here, let me re-quote the whole sen­tence:

The clas­si­cal game the­o­rist as­sumes you can’t look into peo­ple’s heads, so what­ever you say or do be­fore the cheat­ing, you’re always free to not pun­ish dur­ing the pun­ish­ment round

The funny im­pli­ca­tion here is that if some­one did look into your head, you would no longer be “free.” Like a lightswitch :P And then if they erased their mem­ory of what they saw, you’re free again. Free­dom on, free­dom off.

And though that is a fine idea to define, to mix it up with an al­gorith­mic use of “free­dom” seems to just be used to ar­gue “by defi­ni­tion.”

• Ok, sorry I mis­read you. “Free” was just my word rather than part of the stan­dard ex­pla­na­tion, so alas we don’t have any­body we can at­tribute that be­lief to :-)

• (The difficulty arises if UDT B rea­sons log­i­cally that there should not log­i­cally ex­ist any copies of its cur­rent de­ci­sion pro­cess find­ing them­selves in wor­lds where A is de­pen­dent on its own de­ci­sion pro­cess, and yet A defects. I’m start­ing to think that this re­sem­bles the prob­lem I talked about ear­lier, where you have to use Omega’s prob­a­bil­ity dis­tri­bu­tion in or­der to agree to be Coun­ter­fac­tu­ally Mugged on prob­lems that Omega ex­pects to have a high pay­off. Namely, you may have to use A’s log­i­cal un­cer­tainty, rather than your own log­i­cal un­cer­tainty, in or­der to per­ceive a copy of your­self in­side A’s coun­ter­fac­tual. This is a com­pli­cated is­sue and I may have to post about it in or­der to ex­plain it prop­erly.)

• Drescher-Nesov-Dai UDT solves this (that is, goes ahead and pun­ishes the cheater, mak­ing the same de­ci­sion at both times).

TDT can han­dle Parfit’s Hitch­hiker—pay for the ride, make the same de­ci­sion at both times, be­cause it forms the coun­ter­fac­tual “If I did not pay, I would not have got­ten the ride”. But TDT has difficulty with this par­tic­u­lar case, since it im­plies that B’s origi­nal be­lief that A would not cheat if pun­ished, was wrong; and af­ter up­dat­ing on this new in­for­ma­tion, B may no longer have a mo­tive to pun­ish. (UDT of course does not up­date.) Since B’s pay­off can de­pend on B’s com­plete strat­egy tree in­clud­ing de­ci­sions that would be made un­der other con­di­tions, in­stead of just de­pend­ing on the ac­tual de­ci­sion made un­der real con­di­tions, this sce­nario is out­side the realm where TDT is guaran­teed to max­i­mize.

• The case is un­der­speci­fied:

• How trans­par­ent/​translu­cent are the agents? I.e. can A ex­am­ine B’s source­code, or use ob­ser­va­tional and other data to as­sess B’s de­ci­sion pro­ce­dure? If not, what is A’s prior prob­a­bil­ity dis­tri­bu­tion for de­ci­sion pro­ce­dures B might be us­ing?

• Are both A and B us­ing the same de­ci­sion the­ory, TDT/​UDT? Or is A us­ing CDT and B us­ing TDT/​UDT or vice versa?

• Clearly B has mis­taken be­liefs about ei­ther A or its own dis­po­si­tions; oth­er­wise B would not have dealt with A in the in­ter­ac­tion where A ended up cheat­ing. If B uses UDT (and hence will carry through pun­ish­ments), and A uses any DT that cor­rectly fore­casts B’s re­sponse to cheat­ing, then A should not in fact cheat. If A cheats any­way, though, B still pun­ishes.

Ac­tu­ally, on fur­ther re­flec­tion, it’s pos­si­ble that B would rea­son that it is log­i­cally im­pos­si­ble for A to have the speci­fied de­pen­dency on B’s de­ci­sion, and yet for A to still end up defect­ing, in which case even UDT might end up in trou­ble—it would be a trans­par­ent log­i­cal im­pos­si­bil­ity for A to defect if B’s be­liefs about A are true, so it’s not clear that B would han­dle the event cor­rectly. I’ll have to think about this.

• If there is some probability of A cheating even if B precommits to punishment, but with odds in B’s favor, the situation where B needs to implement punishment is quite possible (expected). Likewise, if B precommitting to punish A is predicted to lead to an even worse outcome than not punishing (because of punishment expenses), UDT B won’t punish A. Furthermore, a probability of cheating and non-punishment of cheating (mixed strategies, possibly on logical uncertainty to defy the laws of the game if pure strategies are required) is a mechanism through which the players can (consensually) bargain with each other in the resulting parallel game, an issue Wei Dai mentioned in the other reply. B doesn’t need absolute certainty at any stage, in both cases.

Also, in UDT there are no log­i­cal cer­tain­ties, as it doesn’t up­date on log­i­cal con­clu­sions as well.

• If there is some prob­a­bil­ity of A cheat­ing even if B pre­com­mits to punishment

Sure, but that’s the convenient setup. What if A cheating means that you were necessarily just mistaken about which algorithm A runs?

Also, in UDT there are no log­i­cal cer­tain­ties, as it doesn’t up­date on log­i­cal con­clu­sions as well.

UDT will be log­i­cally cer­tain about some things but not oth­ers. If UDT B “doesn’t up­date” on its com­pu­ta­tion about what A will do in re­sponse to B, it’s go­ing to be in trou­ble.

• What if A cheating means that you were necessarily just mistaken about which algorithm A runs?

A de­ci­sion al­gorithm should never be mis­taken, only un­cer­tain.

UDT will be log­i­cally cer­tain about some things but not oth­ers. If UDT B “doesn’t up­date” on its com­pu­ta­tion about what A will do in re­sponse to B, it’s go­ing to be in trou­ble.

“Doesn’t up­date” doesn’t mean that it doesn’t use the info (but you know that, so what do you mean?). A log­i­cal con­clu­sion can be a pa­ram­e­ter in a strat­egy, with­out mak­ing the al­gorithm un­able to rea­son about what it would be like if the con­clu­sion was differ­ent, that is ba­si­cally about un­cer­tainty of same al­gorithm in other states of knowl­edge.

• Am I cor­rect in as­sum­ing that if A cheats and is pun­ished, A suffers a net loss?

• Yes.

• What is the re­main­ing Prob­lem that you’re refer­ring to? Why can’t we ap­ply the for­mal­ism of UDT1 to the var­i­ous ex­am­ples peo­ple seem to be puz­zled about and just get the an­swers out? Or is cousin_it right about the fo­cus hav­ing shifted to how hu­man be­ings ought to rea­son about these prob­lems?

• The an­thropic prob­lem was a re­main­ing prob­lem for TDT, al­though not UDT.

UDT has its own prob­lems, pos­si­bly. For ex­am­ple, in the Coun­ter­fac­tual Mug­ging, it seems that you want to be coun­ter­fac­tu­ally mugged when­ever Omega has a well-cal­ibrated dis­tri­bu­tion and has a sys­tem­atic policy of offer­ing high-pay­off CMs ac­cord­ing to that dis­tri­bu­tion, even if your own prior has a differ­ent dis­tri­bu­tion. In other words, the key to the CM isn’t your own dis­tri­bu­tion, it’s Omega’s. And it’s not pos­si­ble to in­ter­pret UDT as epistemic ad­vice, which leaves an­thropic ques­tions open. So I haven’t yet shifted to UDT out­right.

(The rea­son I did not an­swer your ques­tion ear­lier was that it seemed to re­quire a re­sponse at greater length than the above.)

• Hi, this is the 2-week re­minder that you haven’t posted your longer re­sponse yet. :)

• Well, you’re right in the sense that I can’t un­der­stand the ex­am­ple you gave. (I waited a cou­ple of days to see if it would be­come clear, but it didn’t) But the rest of the re­sponse is helpful.

• Did he ever get around to ex­plain­ing this in more de­tail? I don’t re­mem­ber read­ing a re­ply to this, but I think I’ve just figured out the idea: Sup­pose you get word that Omega is com­ing to the neigh­bour­hood and go­ing to offer coun­ter­fac­tual mug­gings. What sort of al­gorithm do you want to self-mod­ify into? You don’t know what CMs Omega is go­ing to offer; all you know is that it will offer odds ac­cord­ing to its well-cal­ibrated prior. Thus, it has higher ex­pected util­ity to be a CM-ac­cepter than a CM-re­jecter, and even a CDT agent would want to self-mod­ify.

I don’t think that’s a prob­lem for UDT, though. What UDT will com­pute when asked to pay is the ex­pected util­ity un­der its prior of pay­ing up when Omega asks it to; thus, the con­di­tion for UDT to pay up is NOT

```
prior probability of heads * Omega’s offered payoff  >  prior of tails * Omega’s price
```

but

```
prior of (heads and Omega offers a CM for this coin) * payoff  >  prior of (tails and CM) * price.
```

In other words, UDT takes the qual­ity of Omega’s pre­dic­tions into ac­count and acts as if up­dat­ing on them (the same way you would up­date if Omega told you who it ex­pects to win the next elec­tion, at 98% prob­a­bil­ity).
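With made-up numbers (every figure below is an assumption for illustration only), the two conditions can come apart:

```python
from fractions import Fraction

# Hypothetical prior and stakes.
p_heads, p_tails = Fraction(1, 2), Fraction(1, 2)
payoff, price = 150, 100

# Suppose Omega mostly offers this CM when the coin lands tails:
p_cm_heads, p_cm_tails = Fraction(1, 10), Fraction(9, 10)

# First condition (ignores Omega's offering policy): 75 > 50, so pay.
naive = p_heads * payoff > p_tails * price

# Condition UDT actually uses: 7.5 > 45 is false, so refuse.
udt = p_heads * p_cm_heads * payoff > p_tails * p_cm_tails * price

print(naive, udt)  # True False
```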

CDT agents, as usual, will ac­tu­ally want to self-mod­ify into a UDT agent whose prior equals the CDT agent’s pos­te­rior [ETA: wait, sorry, no, they won’t act as if they can acausally con­trol other in­stances of the same pro­gram, but they will self-mod­ify so as to make fu­ture in­stances of them­selves (which ob­vi­ously they con­trol causally) act in a way that max­i­mizes EU ac­cord­ing to the agent’s pre­sent pos­te­rior, and that’s what we need here], and will use the sec­ond for­mula above ac­cord­ingly—they don’t want to be a gen­eral CM-re­jecter, but they think that they can do even bet­ter than be­ing a gen­eral CM-ac­cepter if they re­fuse to pay up if at the time of self-mod­ifi­ca­tion they as­signed low prob­a­bil­ity to tails, even con­di­tional on Omega offer­ing them a CM.

• He never ex­plained fur­ther, and ac­tu­ally I still don’t quite un­der­stand the ex­am­ple even given your ex­pla­na­tion. Maybe you can re­ply di­rectly to Eliezer’s com­ment so he can see it in his in­box, and let us know if he still thinks it’s a prob­lem for UDT?

• But the key in­gre­di­ent that is miss­ing is a the­ory of log­i­cal un­cer­tainty, that tells us how differ­ent agents (or more gen­er­ally, com­pu­ta­tional pro­cesses) are log­i­cally cor­re­lated to each other.

I’d look for it as log­i­cal the­ory of con­cur­rency and in­ter­ac­tion: “un­cer­tainty” fuzzifies the ques­tion.

• I’d look for it as log­i­cal the­ory of con­cur­rency and in­ter­ac­tion: “un­cer­tainty” fuzzifies the ques­tion.

Why? For me, how differ­ent agents are log­i­cally cor­re­lated to each other seems to be the same type of ques­tion as “what prob­a­bil­ity (if any) should I as­sign to P!=NP?” Wouldn’t the an­swer fall out of a gen­eral the­ory of log­i­cal un­cer­tainty? (ETA: Or at least be illu­mi­nated by such a the­ory?)

• Logic is already in some sense about un­cer­tainty (e.g. you could in­ter­pret pred­i­cates as states of knowl­edge). When you add one more “un­cer­tainty” of some breed, it leads to per­ver­sion of logic, usu­ally of ap­plied char­ac­ter and bar­ren mean­ing.

The con­cept of “prob­a­bil­ity” is sus­pect, I don’t ex­pect it to have foun­da­tional sig­nifi­cance.

• So what would you call a field that deals with how one ought to make bets in­volv­ing P!=NP (i.e., math­e­mat­i­cal state­ments that we can’t prove to be true or false), if not “log­i­cal un­cer­tainty”? Just “logic”? Wouldn’t that cause con­fu­sion in oth­ers, since to­day it’s usu­ally un­der­stood that such ques­tions are out­side the realm of logic?

• I don’t understand how to make such bets, except in the sense that it’s one of the kinds of human decision-making that can be explicated in terms of priors and utilities. The logic of this problem is in the process that works with the statement, which is in the domain of proof theory.

• I waited to com­ment on this, to see what oth­ers would say. Right now Psy-Kosh seems to be right about an­throp­ics; Wei Dai seems to be right about UDT; timtyler seems to be right about Boltz­mann brains; byrnema seems to be mostly right about poin­t­ers; but I don’t un­der­stand why no­body latched on to the “re­flec­tive con­sis­tency” part. Surely the kind of con­sis­tency un­der ob­server-split­ting that you de­scribe is too strong a re­quire­ment in gen­eral: if two copies of you play a game, the cor­rect be­hav­ior for both of them would be to try to win, re­gard­less of what over­all out­come you’d pre­fer be­fore the copy­ing. The pa­per­clip for­mu­la­tion works around this prob­lem, so the cor­rect way to an­a­lyze this would be in terms of mul­ti­player game the­ory with chance moves, as Psy-Kosh out­lined.

• if two copies of you play a game, the cor­rect be­hav­ior for both of them would be to try to win, re­gard­less of what over­all out­come you’d pre­fer be­fore the copying

That doesn’t make sense to me, un­less you’re as­sum­ing that the player isn’t ca­pa­ble of self-mod­ifi­ca­tion. If it was, wouldn’t it mod­ify it­self so that its copies won’t try to win in­di­vi­d­u­ally, but co­op­er­ate to ob­tain the out­come that it prefers be­fore the copy­ing?

• Yes, that’s right. I’ve shifted fo­cus from cor­rect pro­gram be­hav­ior to cor­rect hu­man be­hav­ior, be­cause that’s what ev­ery­one else here seems to be talk­ing about. If the prob­lem is about pro­grams, there’s no room for all this con­fu­sion in the first place. Just spec­ify the in­puts, out­puts and goal func­tion, then work out the op­ti­mal al­gorithm.

• ...wouldn’t it mod­ify it­self so that its copies...

Un­less the copies can mod­ify them­selves too.

• Huh. Read­ing this again, to­gether with byrnema’s poin­ter dis­cus­sion and Psy-Kosh’s non-an­thropic re­for­mu­la­tion...

It seems like the prob­lem is that whether each per­son gets to make a de­ci­sion de­pends on the ev­i­dence they think they have, in such a way to make that ev­i­dence mean­ingless. To con­struct an ex­treme ex­am­ple: The An­tecedent Mug­ger gath­ers a billion peo­ple in a room to­gether, and says:

“I challenge you to a game of wits! In this jar is a variable amount of coins, between \$0 and \$10,000. I will allow each of you to weigh the jar using this set of extremely imprecise scales. Then I will ask each of you whether to accept my offer: to buy the jar off me, as a group, for \$5000, the money to be distributed equally among you. Note: although I will ask all of you, the only response I will consider is the one given by the person with the greatest subjective expected utility from saying ‘yes’.”

In this case, even if the jar always con­tains \$0, there will always be some­one who re­ceives enough in­for­ma­tion from the scales to think the jar con­tains >\$5000 with high prob­a­bil­ity, and there­fore to say yes. Since that per­son’s re­sponse is the one that is taken for the whole group, the group always pays out \$5000, re­sult­ing in a money pump in favour of the Mug­ger.

The prob­lem is that, from an out­side per­spec­tive, the ob­ser­va­tions of the one who gets to make the choice are al­most com­pletely un­cor­re­lated from the ac­tual con­tents of the jar, due to the Mug­ger’s se­lec­tion pro­cess. For any gen­eral strat­egy `Ob­ser­va­tions → Re­sponse`, the Mug­ger can always sum­mon enough peo­ple to find some­one who has seen the ob­ser­va­tions that will pro­duce the re­sponse he wants, un­less the strat­egy is a con­stant func­tion.
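The money pump shows up in a small simulation (all numbers here are invented: 10,000 people stand in for the billion, and the scale error is an assumed Gaussian):

```python
import random

random.seed(0)
PEOPLE, TRIALS = 10_000, 200
SCALE_NOISE = 3000      # assumed std. dev. of the imprecise scales, in dollars
TRUE_VALUE = 0          # the jar always contains $0

def group_pays():
    # Only the most optimistic estimate matters, so the group pays whenever
    # *someone's* reading exceeds $5000.
    best = max(random.gauss(TRUE_VALUE, SCALE_NOISE) for _ in range(PEOPLE))
    return best > 5000

frequency = sum(group_pays() for _ in range(TRIALS)) / TRIALS
print(frequency)  # ~1.0: the Mugger collects $5000 essentially every time
```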

Similarly, in the prob­lem with the mar­bles, only the peo­ple with the ob­ser­va­tion `Green` get any in­fluence, so the ob­ser­va­tions of “peo­ple who get to make a de­ci­sion” are un­cor­re­lated with the ac­tual con­tents of the buck­ets (even though ob­ser­va­tions of the par­ti­ci­pants in gen­eral are cor­re­lated with the buck­ets).

• The prob­lem here is that your billion peo­ple are for some rea­son giv­ing the an­swer most likely to be cor­rect rather than the an­swer most likely to ac­tu­ally be prof­itable. If they were a lit­tle more savvy, they could rea­son as fol­lows:

“The scales tell me that there’s \$6000 worth of coins in the jar, so it seems like a good idea to buy the jar. How­ever, if I did not re­ceive the largest weight es­ti­mate from the scales, my de­ci­sion is ir­rele­vant; and if I did re­ceive the largest weight es­ti­mate, then con­di­tioned on that it seems over­whelm­ingly likely that there are many fewer coins in the jar than I’d think based on that es­ti­mate—and in that case, I ought to say no.”

• Ooh, and we can ap­ply similar rea­son­ing to the mar­ble prob­lem if we change it, in a seem­ingly iso­mor­phic way, so that in­stead of mak­ing the trade based on all the re­sponses of the peo­ple who saw a green mar­ble, Psy-Kosh se­lects one of the green-mar­ble-ob­servers at ran­dom and con­sid­ers that per­son’s re­sponse (this should make no differ­ence to the out­comes, as­sum­ing that the green-mar­blers can’t give differ­ent re­sponses due to no-spon­ta­neous-sym­me­try-break­ing and all that).

Then, conditioning on drawing a green marble, person A infers a 9/10 probability that the bucket contained 18 green and 2 red marbles. However, if the bucket contains 18 green marbles, person A has a 1/18 chance of being randomly selected given that she drew a green marble, whereas if the bucket contains 2 green marbles, she has a 1/2 chance of being selected. So, conditioning on her response being the one that matters as well as the green marble itself, she infers a (9:1) * (1/18)/(1/2) = (9:9) odds ratio, that is, probability 1/2 that the bucket contains 18 green marbles.
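The odds arithmetic in that last step checks out:

```python
from fractions import Fraction

# Anthropic update on drawing a green marble: 9:1 odds for the 18-green bucket.
odds_anthropic = Fraction(9, 1)

# Likelihood ratio for "my response is the one that matters":
# P(selected | 18 green) / P(selected | 2 green).
likelihood_ratio = Fraction(1, 18) / Fraction(1, 2)

odds_final = odds_anthropic * likelihood_ratio
print(odds_final)  # 1, i.e. 1:1 odds, probability 1/2
```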

Which leaves us back at a kind of an­thropic up­dat­ing, ex­cept that this time it re­solves the prob­lem in­stead of in­tro­duc­ing it!

• Isn’t this a problem with the frequency at which you are presented with the opportunity to take the wager? [No, see edit.]

The equation (50% × ((18 × +\$1) + (2 × −\$3))) + (50% × ((18 × −\$3) + (2 × +\$1))) = −\$20 neglects to take into account that you will be offered this wager nine times more often in conditions where you win than in conditions where you lose.

For example, the wager “I will flip a fair coin and pay you \$1 when it is heads and take \$2 from you when it is tails” is −EV in nature. However, if a conditional is added where you will be asked if you want to take the bet 90% of the time given the coin is heads (10% of the time you are ‘in a red room’) and 10% of the time given the coin is tails (90% of the time you are ‘in a red room’), your EV changes from (0.5)(\$1) + (0.5)(−\$2) = −\$0.50 to (0.5)(0.9)(\$1) + (0.5)(0.1)(−\$2) = \$0.35, representing the shift from “odds the coin comes up heads” to “odds the coin comes up heads and I am asked if I want to take the bet”.
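The coin example can be spelled out numerically, including an equivalent per-bet view:

```python
# Plain wager: fair coin, +$1 on heads, -$2 on tails.
ev_plain = 0.5 * 1 + 0.5 * (-2)                  # -$0.50 per flip

# Conditional offer: asked 90% of the time given heads, 10% given tails.
ev_offered = 0.5 * 0.9 * 1 + 0.5 * 0.1 * (-2)    # +$0.35 per flip

# Equivalently: given you are asked, P(heads) = 0.45 / 0.50 = 0.9,
# so each accepted bet is worth 0.9*1 + 0.1*(-2) = $0.70,
# and you are asked on half of all flips: 0.70 * 0.5 = $0.35.
per_bet = 0.9 * 1 + 0.1 * (-2)
```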

It seems like the same principle would apply to the green room scenario, and your pre-copied self would have to conclude that though the two outcomes are +\$12 or −\$52, they do not occur with 50-50 frequency: given that you are offered the bet, you have a 90% chance of winning. (0.9)(\$12) + (0.1)(−\$52) = \$5.60.

EDIT: Okay, after thinking about it, I am wrong. The reason I was having trouble with this was that when the coin comes up tails and 90% of the time I am in a red room, even though “I” am not being specifically asked to wager, my two copies in the green rooms are, and they are making the wrong choice because of my precommitment to taking the wager given I am in a green room. This makes my final EV calculation wrong, as it ignores trials where “I” appear in a red room even though the wager still takes place.

It’s interesting that this paradox exists because of entities other than yourself (copies of you, paperclip maximizers, etc.) making the “incorrect” choice the 90% of the time you are stuck in a red room with no say.

• Some other thoughts: the paradox exists because you cannot precommit yourself to taking the wager given you are in a green room, as this commits you to taking the wager on 100% of coinflips, which is terrible for you.

When you find yourself in a green room, the right play IS to take the wager. However, you can’t make the right play without committing yourself to making the wrong play in every universe where the coin comes up tails. You are basically screwing your parallel selves over because half of them exist in a ‘tails’ reality. It seems like factoring in your parallel expectation cancels out the EV shift of adjusting your prior (50%) probability to 90%.

And if you don’t care about your parallel selves, you can just think of them as the components that average to your true expectation in any given situation. If the overall effect across all possible universes was negative, it was a bad play even if it helped you in this universe. Metaphysical hindsight.

• I think I’ll have to sit and reread this a couple times, but my INITIAL thought is “Isn’t the apparent inconsistency here qualitatively similar to the situation with a counterfactual mugging?”

• This is my re­ac­tion too. This is a de­ci­sion in­volv­ing Omega in which the right thing to do is not up­date based on new in­for­ma­tion. In de­ci­sions not in­volv­ing Omega, you do want to up­date. It doesn’t mat­ter whether the new in­for­ma­tion is of an an­thropic na­ture or not.

• Yeah, thought about it a bit more, and still seems to be more akin to “para­dox of coun­ter­fac­tual mug­ging” than “para­dox of an­thropic rea­son­ing”

To me, confusing bits of anthropic reasoning would come into play more via stuff like “Aumann agreement theorem vs. anthropic reasoning”.

• If the many worlds interpretation of quantum mechanics is true, isn’t anthropic reasoning involved in making predictions about the future of quantum systems? There exists some world in which, from the moment this comment is posted onward, all attempts to detect quantum indeterminacy fail, all two-slit experiments yield two distinct lines instead of a wave pattern, etc. Without anthropic reasoning we have no reason to find this result at all surprising. So either we need to reject anthropic reasoning or we need to reject the predictive value of quantum mechanics under the many worlds interpretation. Right?

(Apolo­gies if this has been cov­ered, I’m play­ing catch-up and just try­ing to hash things out for my­self. Also should I ex­pect to be de­clared a prophet in the world in which quan­tum in­de­ter­mi­nacy dis­ap­pears from here on out?)

• If the many worlds interpretation of quantum mechanics is true, isn’t anthropic reasoning involved in making predictions about the future of quantum systems?

Ba­sic QM seems to say that prob­a­bil­ity is on­tolog­i­cally ba­sic. In a col­lapse point of view, it’s what we usu­ally think of as prob­a­bil­ity that shows up in de­ci­sion the­ory. In MWI, both events hap­pen. But you could talk about usual prob­a­bil­ity ei­ther way. (“clas­si­cal prob­a­bil­ity is a de­gen­er­ate form of quan­tum prob­a­bil­ity” with or with­out col­lapse)

An­throp­ics is about the in­ter­ac­tion of prob­a­bil­ity with the num­ber of ob­servers.

Replacing usual probability with QM doesn’t seem to me to make a difference. Quantum suicide is a kind of anthropics, but it’s not clear to me in what sense it’s really quantum. It’s mainly about rejecting the claim that the Born probabilities are ontologically basic, that they measure how real an outcome is.

• But in MWI isn’t the ob­served prob­a­bil­ity of some quan­tum state just the frac­tion of wor­lds in which an ob­server would de­tect that quan­tum state? As such, doesn’t keep­ing the prob­a­bil­ities of quan­tum events as QM pre­dicts re­quire that “one should rea­son as if one were a ran­dom sam­ple from the set of all ob­servers in one’s refer­ence class” (from a Nick Bostrom piece). The rea­son we think our the­ory of QM is right is that we think our branch in the multi-verse didn’t get cursed with an un­rep­re­sen­ta­tive set of ob­served phe­nom­ena.

Wouldn’t a branch in the multi-verse that ob­served quan­tum events in which val­ues were sys­tem­at­i­cally dis­torted (by ran­dom chance) come up with slightly differ­ent equa­tions to de­scribe quan­tum me­chan­ics? If so, what rea­son do we have to think that our equa­tions are cor­rect if we don’t con­sider our ob­ser­va­tions to be similar to the ob­ser­va­tions made in other pos­si­ble wor­lds?

• It’s not just world counting… (Although Robin Hanson’s Mangled Worlds idea does suggest a way that it may turn out to amount to world counting after all.)

Essentially, one has to integrate the squared modulus of quantum amplitude over a world. This is proportional to the subjective probability of experiencing that world.

Yes… that it isn’t simple world counting does seem to be a problem. This is something that we, or at least I, remain confused about.

• Thanks. Good to know. I don’t sup­pose you can ex­plain why it works that way?

• As I said, that’s some­thing I’m con­fused about, and ap­par­ently oth­ers are as well.

We’ve got the lin­ear rules for how quan­tum am­pli­tude flows over con­figu­ra­tion space, then we’ve got this “oh, by the way, the sub­jec­tive prob­a­bil­ity of ex­pe­rienc­ing any chunk of re­al­ity is pro­por­tional to the square of the ab­solute value” rule.

There’re a few ideas out there, but...

• Would you ex­pand and sharpen your point? Woit comes to mind.

At one point you claim, pos­si­bly based on MWI, that “there is some world in which …”. As far as I can tell, the speci­fics of the sce­nario shouldn’t have any­thing to do with the cor­rect­ness of your ar­gu­ment.

This is how I would para­phrase your com­ment:

1. Ac­cord­ing to MWI, there ex­ists some world in which un­likely things hap­pen.

2. We find this sur­pris­ing.

3. An­thropic rea­son­ing is nec­es­sary to con­clude 2.

4. An­thropic rea­son­ing is in­volved in mak­ing pre­dic­tions about quan­tum sys­tems.

In step 2: Who is the “we”? What is the “this”? Why do we find it sur­pris­ing? In step 3: What do you mean by “an­thropic rea­son­ing”? In gen­eral, it is pretty hard metar­ea­son­ing to con­clude that a rea­son­ing step or ma­neu­ver is nec­es­sary for a con­clu­sion.

• We don’t need an­thropic rea­son­ing un­der MWI in or­der to be sur­prised when find­ing our­selves in wor­lds in which un­likely things hap­pen so much as we need an­thropic rea­son­ing to con­clude that an un­likely thing has hap­pened. And our abil­ity to con­clude that an un­likely thing has hap­pened is needed to ac­cept quan­tum me­chan­ics as a suc­cess­ful sci­en­tific the­ory.

“We” is the set of ob­servers in the wor­lds where events, de­clared to be un­likely by quan­tum me­chan­ics ac­tu­ally hap­pen. An ob­server is any phys­i­cal sys­tem with a par­tic­u­lar kind of causal re­la­tion to quan­tum states such that the phys­i­cal sys­tem can record in­for­ma­tion about quan­tum states and use the in­for­ma­tion to come up with meth­ods of pre­dict­ing the prob­a­bil­ity of pre­vi­ously un­ob­served quan­tum pro­cesses (or some­thing, but if we can’t come up with a defi­ni­tion of ob­server then we shouldn’t be talk­ing about an­thropic rea­son­ing any­way).

1. Ac­cord­ing to MWI, the (quan­tum) prob­a­bil­ity of a quan­tum state is defined as the frac­tion of wor­lds in which that state oc­curs.

2. The only way an observer somewhere in the multi-verse can trust the observations used to confirm quantum mechanics’ probabilistic interpretations is if they reason as if they were a random sample from the set of all observers in the multi-verse (one articulation of anthropic reasoning), because if they can’t do that, then they have no reason to think their observations aren’t wrong in a systematic way.

3. An observer’s reason for believing the standard model of QM to be true in the first place is that they can predict atomic and subatomic particles behaving according to a probabilistic wave-function.

4. Observers lose their reason for trusting QM in the first place if they accept MWI AND are prohibited from reasoning anthropically.

In other words: if MWI is likely, then QM is likely iff AR is acceptable.

I think one could write a differ­ent ver­sion of this ar­gu­ment by refer­enc­ing ex­pected sur­prise at dis­cov­er­ing sud­den changes in quan­tum prob­a­bil­ities (which I was con­flat­ing with the first ar­gu­ment in my first com­ment) but the above ver­sion is prob­a­bly eas­ier to fol­low.

• Can I para­phrase what you just said as:

“If many-wor­lds is true, then all ev­i­dence is an­thropic ev­i­dence”

• I hadn’t come to that con­clu­sion un­til you said it… but yes, that is about right. I’m not sure I would say all ev­i­dence is an­thropic- I would pre­fer say­ing that all up­dat­ing in­volves a step of an­thropic rea­son­ing. I make that hedge just be­cause I don’t know that di­rect sen­sory in­for­ma­tion is an­thropic ev­i­dence, just that mak­ing good up­dates with that sen­sory in­for­ma­tion is go­ing to in­volve (im­plicit) an­thropic rea­son­ing.

• The notion of “I am a Boltzmann brain” goes away when you conclude that conscious experience is a Tegmark-4 thing, and that equivalent conscious experiences are mathematically equal; therefore there is no difference, and you are at the same time a human being and a Boltzmann brain, at least until they diverge.

Thus, anthropic reasoning is right out.

• Well, by the same token, “What I experience represents what I think it does /​ I am not a Boltzmann brain which may dwindle out of existence in an instant” would go right out, just the same. This kind of reasoning reduces to something similar to quantum suicide. The point at which your conscious experience is expected to diverge, even if you take that perspective, does kind of matter. The different paths and their probabilistic weights which govern the divergence alter your expected experience, after all. Or am I misunderstanding?

• I am not sure.

Let me try to clarify.

By virtue of existential quantification in a ZF-equivalent set theory, we can have anything.

In an arbitrary encoding format, I now by existential quantification select a set which is the momentary subjective experience of being me as I write this post, e.g. memory sensations, existential sensations, sensory input, etc.

It is a mathematical object. I can choose its representation format independent of any computational medium I might use to implement it.

It just so happens that there is a brain in the universe we are in, which is implementing this mathematical object.

Brains are com­put­ers that com­pute con­scious ex­pe­riences.

They have no more bearing on the mathematical objects they implement than a modern computer has on the definition of Conway’s Game of Life.

Does that clar­ify it?

• It just so happens that there is a brain in the universe we are in, which is implementing this mathematical object.

Which is why we’re still highly in­vested in the ques­tion whether (what­ever it is that gen­er­ates our con­scious ex­pe­rience) will “stay around” and con­tinue with our pat­tern in an ex­pected man­ner.

Let’s say we iden­tify with only the math­e­mat­i­cal ob­ject, not the rep­re­sen­ta­tion for­mat at all. That doesn’t ex­cuse us from an­thropic rea­son­ing, or from a per­sonal in­vest­ment in rea­son­ing about the im­ple­ment­ing “hard­ware”. We’d still be highly in­vested in the ques­tion, even as ‘math­e­mat­i­cal ob­jects’. We prob­a­bly still care about be­ing con­tinu­ally in­stan­ti­ated.

The shift in per­spec­tive you sug­gest doesn’t take away from that (and adds what could be con­strued as a fla­vor of du­al­ism).

• Hmmm.

I will have to mull on that, but let me leave with a mote of ex­pla­na­tion:

The reasoning strategy I used to arrive at this conclusion was similar to the one used in concluding that “every possible human exists in parallel universes, so we need not make more humans, but more humans feeling good.”

• Doesn’t ev­ery pos­si­ble hu­man-feel­ing-good also ex­ist in par­allel uni­verses?

(And if you ar­gue that al­though they ex­ist you can in­crease their mea­sure, that ap­plies to the ev­ery-pos­si­ble-hu­man ver­sion as well.)

• Sure, but I will quote Karkat Van­tas on time-travel shenani­gans from An­drew Hussie’s Homestuck

CCG: EVERYBODY, DID YOU HEAR THAT?? SUPERFUTURE VRISKA HAS AN IMPORTANT LIFE LESSON FOR US ALL.
CCG: WE DON’T HAVE TO WORRY ABOUT OUR PRESENT RESPONSIBILIES AND OBLIGATIONS!
CCG: BECAUSE AS IT TURNS OUT, IN THE FUTURE ALL THAT STUFF ALREADY HAPPENED. WE’RE OFF THE FUCKING HOOK!

• Time­less de­ci­sion agents re­ply as if con­trol­ling all similar de­ci­sion pro­cesses, in­clud­ing all copies of them­selves. Clas­si­cal causal de­ci­sion agents, to re­ply “Yes” as a group, will need to some­how work out that other copies of them­selves re­ply “Yes”, and then re­ply “Yes” them­selves. We can try to help out the causal de­ci­sion agents on their co­or­di­na­tion prob­lem by sup­ply­ing rules such as “If con­flict­ing an­swers are de­liv­ered, ev­ery­one loses \$50″. If causal de­ci­sion agents can win on the prob­lem “If ev­ery­one says ‘Yes’ you all get \$10, if ev­ery­one says ‘No’ you all lose \$5, if there are con­flict­ing an­swers you all lose \$50” then they can pre­sum­ably han­dle this. If not, then ul­ti­mately, I de­cline to be re­spon­si­ble for the stu­pidity of causal de­ci­sion agents.

The co­or­di­na­tion hack to work around some of the stu­pidity of causal de­ci­sion agents doesn’t ap­pear to be nec­es­sary here.

“Some­how work­ing out that the other copies of them­selves re­ply ‘yes’” should be triv­ial for an agent fo­cussed on causal­ity when the copies are iden­ti­cal, have no in­cen­tive to ran­domise and have iden­ti­cal in­puts. If the pay­off for oth­ers dis­agree­ing is iden­ti­cal to the pay­off for ‘no’ they can be ig­nored. The con­flict penalty makes the co­or­di­na­tion prob­lem more difficult for the causal agent in this con­text, not less.

• The rea­son we shouldn’t up­date on the “room color” ev­i­dence has noth­ing to do with the fact that it con­sti­tutes an­thropic ev­i­dence. The rea­son we shouldn’t up­date is that we’re told, albeit in­di­rectly, that we shouldn’t up­date (be­cause if we do then some of our copies will up­date differ­ently and we will be pe­nal­ized for our dis­agree­ment).

In the real world, there is no in­cen­tive for all the copies of our­selves in all uni­verses to agree, so it’s all right to up­date on an­thropic ev­i­dence.

• [com­ment deleted]

Oops… my usual mis­take of equiv­o­cat­ing differ­ent things and evolv­ing the prob­lem un­til it barely re­sem­bles the origi­nal. I will up­date my “solu­tion” later if it still works for the origi­nal.

… Sigh. Won’t work. My pre­vi­ous “solu­tion” re­cov­ered the cor­rect an­swer of −20 be­cause I bent the rules enough to have each of my green-room-de­ciders make a global rather than an­thropic calcu­la­tion.

• Think­ing about how all the green-room peo­ple come to the wrong con­clu­sion makes my brain hurt. But I sup­pose, fi­nally, it is true. They can­not base their de­ci­sion on their sub­jec­tive ex­pe­rience, and here I’ll out­line some thoughts I’ve had as to un­der what con­di­tions they should know they can­not do so.

Sup­pose there are 20 peo­ple (Amy, Benny, Car­rie, Donny, …) and this ex­per­i­ment is done as de­scribed. If we always ask Tony (the 20th per­son) whether or not to say “yes”, and he bases his de­ci­sion on whether or not he is in a green room, then the ex­pected value of his de­ci­sion re­ally is \$5.6. Tony here is a spe­cial, sin­gled out “de­cider”. One way of look­ing at this situ­a­tion is that the ‘yes’ de­pends on some in­for­ma­tion in the sys­tem (that is, whether or not Tony was in a green room.)

If in­stead we say that the de­cider can be any­one, and in fact we choose the de­cider af­ter the as­sort­ment into rooms as some­one in a green room, then we are not re­ally given any in­for­ma­tion about the sys­tem.

It is the differ­ence be­tween (a) pick­ing a per­son, and see­ing if they wake up in a green room, and (b) pick­ing a per­son that is in a green room. (I know you are well aware of this differ­ence, but it helps to spell it out.)

You can’t pick the deciders from a set with a prespecified outcome. It’s a pointer problem: you can learn about the system from the change of state from Tony to Tony* (Tony: no room --> Tony*: green room), but you can’t assign the star after the assignment (pick someone in a green room and ask them).

When a per­son wakes in a green room and is asked, they should say ‘yes’ if they are ran­domly cho­sen to be asked in­de­pen­dently of their room color. If they were cho­sen af­ter the as­sign­ment, be­cause they awoke in a green room, they should rec­og­nize this as the “un­fixed poin­ter prob­lem” (a spe­cial kind of se­lec­tion bias).

Avoid­ing the poin­ter prob­lem is straight-for­ward. The peo­ple who wake in red rooms have a pos­te­rior prob­a­bil­ity of heads as 10%. The peo­ple who wake in green rooms have a pos­te­rior prob­a­bil­ity of heads as 90%. Your pos­te­rior prob­a­bil­ity is mean­ingful only if your pos­te­rior prob­a­bil­ity could have been ei­ther way. Since Eliezer only asks peo­ple who woke in green rooms, and never asks peo­ple who woke in red rooms, the pos­te­rior prob­a­bil­ities are not mean­ingful.

• The peo­ple who wake in red rooms have a pos­te­rior prob­a­bil­ity of heads as 10%. The peo­ple who wake in green rooms have a pos­te­rior prob­a­bil­ity of heads as 90%. Your pos­te­rior prob­a­bil­ity is mean­ingful only if your pos­te­rior prob­a­bil­ity could have been ei­ther way. Since Eliezer only asks peo­ple who woke in green rooms, and never asks peo­ple who woke in red rooms, the pos­te­rior prob­a­bil­ities are not mean­ingful.

The rest of your re­ply makes sense to me, but can I ask you to am­plify on this? Maybe I’m be­ing naive, but to me, a 90% prob­a­bil­ity is a 90% prob­a­bil­ity and I use it in all my strate­gic choices. At least that’s what I started out think­ing.

Now you’ve just shown that a de­ci­sion pro­cess won’t want to strate­gi­cally con­di­tion on this “90% prob­a­bil­ity”, be­cause it always ends up as “90% prob­a­bil­ity” re­gard­less of the true state of af­fairs, and so is not strate­gi­cally in­for­ma­tive to green agents—even if the prob­a­bil­ity seems well-cal­ibrated in the sense that, look­ing over im­pos­si­ble pos­si­ble wor­lds, green agents who say “90%” are cor­rect 9 times out of 10. This seems like a con­flict be­tween an an­thropic sense of prob­a­bil­ity (rel­a­tive fre­quency in a pop­u­la­tion of ob­servers) and a strate­gic sense of prob­a­bil­ity (sum­ma­riz­ing in­for­ma­tion that is to be used to make de­ci­sions), or some­thing along those lines. Is this where you’re point­ing to­ward by say­ing that a pos­te­rior prob­a­bil­ity is mean­ingful at some times but not oth­ers?

• a de­ci­sion pro­cess won’t want to strate­gi­cally con­di­tion on this “90% prob­a­bil­ity”, be­cause it always ends up as “90% prob­a­bil­ity” re­gard­less of the true state of af­fairs, and so is not strate­gi­cally in­for­ma­tive to green agents

The 90% prob­a­bil­ity is gen­er­ally strate­gi­cally in­for­ma­tive to green agents. They may le­gi­t­i­mately point to them­selves for in­for­ma­tion about the world, but in this spe­cific case, there is con­fu­sion about who is do­ing the point­ing.

When you think about a prob­lem an­throp­i­cally, you your­self are the poin­ter (the thing you are ob­serv­ing be­fore and af­ter to make an ob­ser­va­tion) and you as­sign your­self as the poin­ter. This is go­ing to be strate­gi­cally sound in all cases in which you don’t change as the poin­ter be­fore and af­ter an ob­ser­va­tion. (A pretty nor­mal con­di­tion. Ex­cep­tions would be ex­per­i­ments in which you try to de­ter­mine the prob­a­bil­ity that a cer­tain ac­tivity is fatal to your­self—you will never be able to figure out the prob­a­bil­ity that you will die of your shrimp allergy by re­peated tri­als of con­sum­ing shrimp, as it will be­come in­creas­ingly skewed to­wards lower and lower val­ues.)

Like­wise, if I am in the ex­per­i­ment de­scribed in the post and I awaken in a green room I should an­swer “yes” to your ques­tion if I de­ter­mine that you asked me ran­domly. That is, that you would have asked me even if I woke in a red room. In which case my an­thropic ob­ser­va­tion that there is a 90% prob­a­bil­ity that heads was flipped is quite sound, as usual.

On the other hand, if you ask me only if I wake in a green room, then you wouldn’t have asked “me” if I awoke in a red room. (So I must re­al­ize this isn’t re­ally about me as­sign­ing my­self as a poin­ter, be­cause “me” doesn’t change de­pend­ing on what room I wake up in.) It’s strange and re­quires some men­tal gym­nas­tics for me to un­der­stand that you Eliezer are pick­ing the poin­ter in this case, even though you are ask­ing me about my an­thropic ob­ser­va­tion, for which I would usu­ally ex­pect to as­sign my­self as the poin­ter.

So for me this is a poin­ter/​bi­ased-ob­ser­va­tion prob­lem. But the an­thropic prob­lem is re­lated, be­cause we as hu­mans can­not ask about the prob­a­bil­ity of cur­rently ob­served events based on the fre­quency of ob­ser­va­tions which, had they been oth­er­wise, would not have per­mit­ted our­selves to ask the ques­tion.

• On the other hand, if you ask me only if I wake in a green room, then you wouldn’t have asked “me” if I awoke in a red room. (So I must re­al­ize this isn’t re­ally about me as­sign­ing my­self as a poin­ter, be­cause “me” doesn’t change de­pend­ing on what room I wake up in.)

Huh. Very in­ter­est­ing again. So in other words, the prob­a­bil­ity that I would use for my­self, is not the prob­a­bil­ity that I should be us­ing to an­swer ques­tions from this de­ci­sion pro­cess, be­cause the de­ci­sion pro­cess is us­ing a differ­ent kind of poin­ter than my me-ness?

How would one for­mal­ize this? Bostrom’s di­vi­sion-of-re­spon­si­bil­ity prin­ci­ple?

• I haven’t had time to read this, but it looks pos­si­bly rele­vant (it talks about the im­por­tance of whether an ob­ser­va­tion point is fixed in ad­vance or not) and also pos­si­bly in­ter­est­ing, as it com­pares Bayesian and fre­quen­tist views.

I will read it when I have time later… or any­one else is wel­come to if they have time/​in­ter­est.

• What I got out of the article above, since I skipped all the technical math, was that frequentists consider “the pointer problem” (i.e., just your usual selection bias) as something that needs correction, while Bayesians don’t correct in these cases. The author concludes (I trust, via some kind of argument) that Bayesians don’t need to correct if they choose the posteriors carefully enough.

I now see that I was being entirely consistent with my role as the resident frequentist when I identified this as a “pointer problem” problem (which it is), but that doesn’t mean the problem can’t be pushed through without correction* -- the Bayesian way -- by carefully considering the priors.

*”Re­quiring cor­rec­tion” then might be a eu­phemism for time-de­pen­dent, while a prefer­ence for an up­date­less de­ci­sion the­ory is a good Bayesian qual­ity. A qual­ity, by the way, a fre­quen­tist can ap­pre­ci­ate as well, so this might be a point of con­tact on which to win fre­quen­tists over.

• Before the experiment, you calculate the general utility of the conditional strategy “Reply ‘Yes’ to the question if you wake up in a green room” as (50% × ((18 × +\$1) + (2 × −\$3))) + (50% × ((18 × −\$3) + (2 × +\$1))) = −\$20

This as­sumes that the ques­tion is asked only once, but then, to which of the 20 copies will it be asked?

If all 20 copies get asked the same question (or equivalently if a single copy chosen at random is), then the utility is (50% × 18/20 × ((18 × +\$1) + (2 × −\$3))) + (50% × 2/20 × ((18 × −\$3) + (2 × +\$1))) = \$2.80 = 50% × \$5.60.
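In code, the per-copy weighting works out as:

```python
# Each copy is asked with weight 1/20; 18/20 of copies are in green rooms
# under heads, 2/20 under tails. Payoff: +$1 per green copy, -$3 per red copy.
ev = 0.5 * (18 / 20) * (18 * 1 + 2 * (-3)) + 0.5 * (2 / 20) * (18 * (-3) + 2 * 1)
# = 0.5 * 0.9 * 12 + 0.5 * 0.1 * (-52)  ≈  $2.80
```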

Con­sider the similar thought ex­per­i­ments:

• I flip a fair coin to de­ter­mine whether to switch to my headdy coin or my tailly coin, which have a 90% and 10% prob­a­bil­ity of heads re­spec­tively.

• Now I flip this bi­ased coin. If it comes up heads then I paint the room green, if it comes up tails I paint it red.

• You then find your­self in a green room.

• Then I flip the bi­ased coin again, and re­paint the room.

• Be­fore this sec­ond flip, I offer you the bet of +1\$ if the room stays green and −3\$ if it be­comes red.

The prior ex­pected util­ity be­fore the ex­per­i­ment is:

``````E(util|headdy) = 90% * 1\$ + 10% * −3\$ =  0.6\$
E(util|tailly) = 10% * 1\$ + 90% * −3\$ = −2.6\$
E(util) = 50% * E(util|headdy) + 50% * E(util|tailly) = −1\$
``````

Given that you find your­self in a green room af­ter the first flip, you can de­ter­mine the prob­a­bil­ity that the headdy coin is used:

``````P(green) = 0.5
P(headdy|green) = P(green|headdy) * P(headdy) / P(green) = 0.9 * 0.5 / 0.5 = 0.9
``````

Which gives a pos­te­rior util­ity:

``````E(util|green) = 0.9 * E(util|headdy) + 0.1 * E(util|tailly) = 0.28\$
``````
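The figures in this example can be reproduced directly:

```python
e_headdy = 0.9 * 1 + 0.1 * (-3)    # expected utility with the headdy coin: 0.6
e_tailly = 0.1 * 1 + 0.9 * (-3)    # with the tailly coin: -2.6

e_prior = 0.5 * e_headdy + 0.5 * e_tailly    # before any observation: -1.0

# Seeing green after the first flip gives P(headdy | green) = 0.9:
e_green = 0.9 * e_headdy + 0.1 * e_tailly    # posterior utility: 0.28
```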
• This as­sumes that the ques­tion is asked only once, but then, to which of the 20 copies will it be asked?

Every copy that is in a green room is asked the ques­tion (so ei­ther 2 or 18 copies to­tal are asked). If all an­swer Play, we play. If all an­swer Don’t Play, we don’t. In any other case we fine all 20 copies some huge amount; this is in­tended to make them agree be­fore­hand on what an­swer to give. (This is re­worded from the OP.)

For your other thought ex­per­i­ment—if there aren’t ac­tual N copies be­ing asked the ques­tion, then there’s no dilemma; you (the only copy) sim­ply up­date on the ev­i­dence available (that the room is green). So yes, the origi­nal prob­lem re­quires copies be­ing asked in par­allel to in­tro­duce the pos­si­bil­ity that you’re hurt­ing other copies of your­self by giv­ing a self-serv­ing an­swer. Whereas if you’re the only copy, you always give a self-serv­ing an­swer, i.e. play only if the room is green.

• I keep hav­ing trou­ble think­ing of prob­a­bil­ities when I’m to be copied and >=1 of “me” will see red and >=1 of “me” will see green. My thought is that it is 100% likely that “I” will see red and know there are oth­ers, once-mes, who see green, and 100% likely vice-versa. Wak­ing up to see red (green) is ex­actly the ex­pected re­sult.

I do not know what to make of this opinion of mine. It’s as if my defi­ni­tion of self—or choice of body—is in su­per­po­si­tion. Am I com­mit­ting an er­ror here? Sugges­tions for fur­ther read­ing would be ap­pre­ci­ated.

• I re­main con­vinced that the prob­a­bil­ity is 90%.

The con­fu­sion is over whether you want to max­i­mize the ex­pec­ta­tion of the num­ber of utilons there will be if you wake up in a green room or the ex­pec­ta­tion of the num­ber of utilons you will ob­serve if you wake up in a green room.

• Whoohoo! I just figured out the cor­rect way to han­dle this prob­lem, that ren­ders the global and ego­cen­tric/​in­ter­nal re­flec­tions con­sis­tent.

We will see if my solu­tion makes sense in the morn­ing, but the up­shot is that there was/​is noth­ing wrong with the green roomer’s pos­te­rior, as many peo­ple have been cor­rectly defend­ing. The green roomer who com­puted an EV of \$5.60 mod­eled the money pay-off scheme wrong.

In the in­cor­rect calcu­la­tion that yields \$5.6 EV, the green roomer mod­els him­self as win­ning (get­ting the fa­vor­able +\$12) when he is right and los­ing (pay­ing the -\$52) when he is wrong. But no, not ex­actly. The green roomer doesn’t win ev­ery time he’s right—even though cer­tainly he’s right ev­ery time he’s right.

The green roomer wins 1 out of ev­ery 18 times that he’s right, be­cause 17 copies of him­self that were also right do not get their own in­de­pen­dent win­nings, and he loses 1 out of ev­ery 2 times he’s wrong, be­cause there are 2 of him that are wrong in the room that pays \$52.

So it is Bostrom’s division-of-responsibility, now with a justification. It is probably more apt to name it division-of-reward.

Here’s is the cor­rect green roomer calcu­la­tion:

EV = P(heads)×(payoff given heads)×(rate of payoff given heads) + P(tails)×(payoff given tails)×(rate of payoff given tails)

= 0.9(\$12)(1/18) + 0.1(−\$52)(1/2) = \$0.60 − \$2.60 = −\$2
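As a sanity check, the corrected calculation can be reproduced in a few lines of Python (a sketch I am adding; the probabilities and per-vote reward shares are the ones derived above):

```python
# Corrected green-roomer EV using division-of-reward:
# a correct collective vote earns +$12 shared among 18 green roomers,
# a wrong collective vote costs -$52 shared among 2 green roomers.
p_heads, p_tails = 0.9, 0.1       # posterior after waking in a green room
payoff_heads = 12 * (1 / 18)      # each correct voter's share of the +$12
payoff_tails = -52 * (1 / 2)      # each wrong voter's share of the -$52
ev = p_heads * payoff_heads + p_tails * payoff_tails
print(ev)  # ≈ -2
```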

(By the way, this doesn’t modify what I said about pointers, but I must admit I don’t understand at the moment how the two perspectives are related. Not yet; some more thought is needed.)

• This is my at­tempt at a ped­a­gog­i­cal ex­po­si­tion of “the solu­tion”. It’s overly long, and I’ve lost per­spec­tive com­pletely about what is un­der­stood by the group here and what isn’t. But since I’ve writ­ten up this solu­tion for my­self, I’ll go ahead and share it.

The cases I’m describing below are altered from the OP so that they are completely non-metaphysical, in the sense that you could implement them in real life with real people. Thus there is an objective reality regarding whether money is collectively lost or won, so there is finally no ambiguity about what the correct calculation actually is.

Sup­pose that there are twenty differ­ent grad­u­ate stu­dents {Amy, Betty, Cindy, …, Tony} and two ho­tels con­nected by a breeze­way. Ho­tel Green has 18 green rooms and 2 red rooms. Ho­tel Red has 18 red rooms and 2 green rooms. Every night for many years, stu­dents will be as­signed a room in ei­ther Ho­tel Green or Ho­tel Red de­pend­ing on a coin flip (heads --> Ho­tel Green for the night, tails --> Ho­tel Red for the night). Stu­dents won’t know what ho­tel they are in but can see their own room color only. If a stu­dent sees a green room, that stu­dent cor­rectly de­duces they are in Ho­tel Green with 90% prob­a­bil­ity.

Case 1: Sup­pose that ev­ery morn­ing, Tony is al­lowed to bet that he is in a green room. If he bets ‘yes’ and is cor­rect, he pock­ets \$12. If he bets ‘yes’ and is wrong, he has to pay \$52. (In other words, his pay­off for a cor­rect vote is \$12, the pay­off for a wrong vote is -\$52.) What is the ex­pected value of his bet­ting if he always says ‘yes’ if he is in a green room?

For every 20 times that Tony says ‘yes’, he wins 18 times (wins \$12×18) and he loses twice (loses \$52×2), consistent with his posterior. On average he wins \$5.60 per bet, or \$2.80 per night. (He says “yes” to the bet 1 out of every 2 nights, because that is the frequency with which he finds himself in a green room.) This is a steady money pump in the student’s favor.

The cor­rect calcu­la­tion for Case 1 is:

average payoff per bet = (probability of being right)×(payoff if right) + (probability of being wrong)×(payoff if wrong) = 0.9×\$12 + 0.1×(−\$52) = \$5.60.

Case 2: Sup­pose that Tony doesn’t pocket the money, but in­stead the money is placed in a tip jar in the breeze­way. Tony’s bet­ting con­tributes \$2.80 per night on av­er­age to the tip jar.

Case 3: Sup­pose there is noth­ing spe­cial about Tony, and all the stu­dents get to make bets. They will all make bets when they wake in green rooms, and add \$2.80 per night to the tip jar on av­er­age. Col­lec­tively, the stu­dents add \$56 per night to the tip jar on av­er­age. (If you think about it a minute, you will see that they add \$216 to the tip jar on nights that they are as­signed to ho­tel Green and lose \$104 on nights that they are as­signed to ho­tel Red.) If the money is dis­tributed back to the stu­dents, they each are mak­ing \$2.80 per night, the same steady money pump in their fa­vor that Tony took ad­van­tage of in Case 1.

Case 4: Now con­sider the case de­scribed in the OP. We already un­der­stand that the stu­dents will vote “yes” if they wake in a green room and that they ex­pect to make money do­ing so. Now the rules are go­ing to change, how­ever, so that when all the green roomers unan­i­mously vote “yes”, \$12 are added to the tip jar if they are cor­rect and \$52 are sub­tracted if they are wrong. Since the stu­dents are as­signed to Ho­tel Green half the time and to Ho­tel Red half the time, on av­er­age the tip jar loses \$20 ev­ery night. Sud­denly, the stu­dents are los­ing \$1 a night!

Each time a stu­dent votes cor­rectly, it is be­cause they are all in Ho­tel Green, as per the ini­tial set up of the prob­lem in the OP. So all 18 green roomer votes are cor­rect and col­lec­tively earn \$12 for that night. The pay­off is \$12/​18 per cor­rect vote. Like­wise, the pay­off per wrong vote is -\$52/​2.

So the cor­rect calcu­la­tion for case 4 is as fol­lows:

average payoff per bet = (probability of being right)×(payoff if right) + (probability of being wrong)×(payoff if wrong) = 0.9×(\$12/18) + 0.1×(−\$52/2) = −\$2.

So in con­clu­sion, in the OP prob­lem, the green roomer must rec­og­nize that he is deal­ing with case #4 and not Case #1, in which the pay­off is differ­ent (but not the pos­te­rior).
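The Case 3 and Case 4 tip-jar averages above can be checked with a short simulation (my sketch, not part of the original comment):

```python
import random

random.seed(0)
nights = 200_000
jar3 = jar4 = 0
for _ in range(nights):
    heads = random.random() < 0.5   # heads -> everyone is in Hotel Green
    if heads:
        jar3 += 18 * 12             # Case 3: 18 green roomers each win $12
        jar4 += 12                  # Case 4: one collective correct bet
    else:
        jar3 += 2 * (-52)           # Case 3: 2 green roomers each lose $52
        jar4 += -52                 # Case 4: one collective wrong bet
print(jar3 / nights)  # ≈ +56 per night
print(jar4 / nights)  # ≈ -20 per night
```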

• I believe both of your computations are correct, and the fallacy lies in mixing up the payoff for the group with the payoff for the individual—which the frame of the problem as posed does suggest, with multiple identities that are actually the same person. More precisely, the probabilities for the individual are 90/10, but the probabilities for the groups are 50/50, and if you compute payoffs for the group (+\$12/−\$52), you need to use the group probabilities. (It would be different if the narrator (“I”) offered the guinea pig (“you”) the \$12/\$52 odds individually.)

byrnema looked at the re­sult from the group view­point; you get the same re­sult when you ap­proach it from the in­di­vi­d­ual view­point, if done cor­rectly, as fol­lows:

For a single person, the correct payoff is not \$12 vs. −\$52, but rather (\$1 minus \$6/18 to reimburse the reds, making \$0.67) at 90%, and (\$1 minus \$54/2 = −\$26) at 10%, so each of the copies of the guinea pig is going to be out of pocket by 0.9×(2/3) + 0.1×(−26) = 0.6 − 2.6 = −\$2, on average.
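A quick check of this individual-viewpoint bookkeeping (a sketch I am adding; the \$6 and \$54 figures are the reds' total losses given in the comment above):

```python
# Individual-viewpoint EV for one green roomer:
# heads: win $1 minus a 1/18 share of the reds' $6 total loss;
# tails: win $1 minus a 1/2 share of the reds' $54 total loss.
net_if_heads = 1 - 6 / 18    # ≈ $0.67
net_if_tails = 1 - 54 / 2    # -$26
ev = 0.9 * net_if_heads + 0.1 * net_if_tails
print(ev)  # ≈ -2
```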

The fallacy of Eliezer’s guinea pigs is that each of them thinks they get the \$18 each time, which means that the 18 goes into his computation twice (squared) for their winnings (18 × 18/20). This is not a problem with anthropic reasoning, but with statistics.

A distrustful individual would ask themselves, “what is the narrator getting out of it”, and realize that the narrator will see the −\$12 / +\$52 outcome, not the guinea pig—and that to the narrator, the 50/50 probability applies. Don’t mix them up!

• It was 3:30 in the morn­ing just a short while ago, and I woke up with a bunch of non-sen­si­cal ideas about the prop­er­ties of this prob­lem, and then while I was try­ing to get back to sleep I re­al­ized that one of the ideas made sense. Ev­i­dence that un­der­stand­ing this prob­lem for my­self re­quired a right-brain re­boot.

I’m not sur­prised about the re­boot: I’ve been think­ing about this prob­lem a lot, which sig­nals to my brain that it’s im­por­tant, and it liter­ally hurt my brain to think about why the green roomers were los­ing for the group when they thought they were win­ning, strongly sug­gest­ing I was hit­ting my apol­o­gist limit.

• In personal conversation, Nick Bostrom suggested that a division-of-responsibility principle might cancel out the anthropic update—i.e., the paperclip maximizer would have to reason, “If the logical coin came up heads then I am 1/18th responsible for adding +1 paperclip, if the logical coin came up tails then I am 1/2 responsible for destroying 3 paperclips.” I confess that my initial reaction to this suggestion was “Ewwww”, but I’m not exactly comfortable concluding I’m a Boltzmann brain, either.

I would perhaps prefer to use different language in the description, but this seems to be roughly the answer to the apparent inconsistency. When reasoning anthropically you must decide anthropically. Unfortunately it is hard to describe such decision making without sounding either unscientific or outright incomprehensible.

I’m rather looking forward to another Eliezer post on this topic once he has finished dissolving his confusion. I’ve gained plenty from absorbing the posts and discussions, and more from mentally reducing the concepts myself. But this stuff is rather complicated and, to be perfectly honest, I don’t trust myself not to have missed something.

• Let the dilemma be, “I will ask all peo­ple who wake up in green rooms if they are will­ing to take the bet ‘Create 1 pa­per­clip if the log­i­cal coin­flip came up heads, de­stroy 3 pa­per­clips if the log­i­cal coin­flip came up tails’. (Should they dis­agree on their an­swers, I will de­stroy 5 pa­per­clips.)” Then a pa­per­clip max­i­mizer, be­fore the ex­per­i­ment, wants the pa­per­clip max­i­miz­ers who wake up in green rooms to re­fuse the bet. But a con­scious pa­per­clip max­i­mizer who up­dates on an­thropic ev­i­dence, who wakes up in a green room, will want to take the bet, with ex­pected util­ity ((90% +1 pa­per­clip) + (10% −3 pa­per­clips)) = +0.6 pa­per­clips.

That last calculation doesn’t look right to me: the paperclip maximizer in the green room still knows that there are other paperclip maximizers in red rooms who will refuse the bet whether or not they rely on anthropic evidence. So the expected utility of taking the bet would be 100% × −5 paperclips.

Or did I mi­s­un­der­stand some­thing?

• Or did I mi­s­un­der­stand some­thing?

Red Clippy doesn’t get a vote.

• Can some­one come up with a situ­a­tion of the same gen­eral form as this one where an­thropic rea­son­ing re­sults in op­ti­mal ac­tions and nonan­thropic rea­son­ing re­sults in sub­op­ti­mal ac­tions?

How about if the wager is that anybody in any room can guess the outcome of the coinflip, and if they get it right they win \$1 and if they get it wrong they lose \$2?

If you still think it’s 50% after waking up in a green room, you won’t take the bet, and you’ll win \$0; if you think it’s 90% you’ll take the bet and come out \$14 ahead on balance, with two of you losing \$2 each and 18 of you getting \$1.

Doesn’t this show an­thropic rea­son­ing is right as much as the OP shows it’s wrong?
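The arithmetic behind this wager is easy to verify (a sketch I am adding; each copy bets on whichever outcome its room color makes 90% likely, so greens guess heads and reds guess tails):

```python
# Right guess: +$1. Wrong guess: -$2. All 20 copies bet.
if_heads = 18 * 1 + 2 * (-2)   # heads: 18 greens right, 2 reds wrong
if_tails = 18 * 1 + 2 * (-2)   # tails: 18 reds right, 2 greens wrong
print(if_heads, if_tails)  # 14 14 -- +$14 on balance either way
```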

• I think you’re miss­ing a term in your sec­ond calcu­la­tion. And why are an­thropism and copies of you nec­es­sary for this puz­zle. I sus­pect the an­swer will in­di­cate some­thing I’m com­pletely miss­ing about this se­ries.

Take this for straight-up prob­a­bil­ity:

I have two jars of marbles, one with 18 green and 2 red, the other with 18 red and 2 green. Pick one jar at random, then look at one marble from that jar at random.

If you pick green, what’s the chance that your jar is mostly green? I say 90%, by fairly straight­for­ward ap­pli­ca­tion of bayes’ rule.

I offer a wa­ger: you get \$1 per green and lose \$3 per red mar­ble in the jar you chose.

After seeing a green marble, I think your EV is \$5.60. After seeing a red marble, I think your EV is \$0 (you decline the bet). If you are forced to make the wager before seeing anything, conditional on drawing green, I think your EV is \$2.80. I calculate it thus: 50% to get the mostly-green jar, and 90% of that will you see green and take the bet, which is worth +\$1×18 − \$3×2 in this case. 50% to get mostly-red, 10% of which will you draw green, worth +\$1×2 − \$3×18. 0.5 × 0.9 × (\$1×18 − \$3×2) + 0.5 × 0.1 × (\$1×2 − \$3×18) = \$2.80, which is consistent: half the time you pick green, with EV of \$5.60.

I think you left out the prob­a­bil­ity that you’ll get green and take the bet in each of your 0.5 prob­a­bil­ities for the con­di­tional strat­egy. Mul­ti­ply a 0.9 to the first term and 0.1 into the sec­ond, and ev­ery­thing gets con­sis­tent.
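The jar numbers can be confirmed with a short Monte Carlo simulation (my sketch, not part of the original comment):

```python
import random

random.seed(0)
trials = 200_000
payoffs = []   # bet payoffs on trials where a green marble was drawn
for _ in range(trials):
    mostly_green = random.random() < 0.5        # pick one jar at random
    p_green = 0.9 if mostly_green else 0.1      # chance of drawing green
    if random.random() < p_green:
        # bet pays $1 per green and -$3 per red marble in the chosen jar
        payoffs.append(18 * 1 - 2 * 3 if mostly_green else 2 * 1 - 18 * 3)
print(sum(payoffs) / len(payoffs))  # ≈ 5.6  (EV given a green draw)
print(sum(payoffs) / trials)        # ≈ 2.8  (unconditional EV of the strategy)
```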

• The prob­lem is that we aren’t ask­ing one ran­domly se­lected per­son, we’re ask­ing all of the green ones (they have to agree unan­i­mously for the Yes vote to go through).

• Ah, I see. You’re ask­ing all the green ones, but only pay­ing each pod once. This feels like re­verse-weight­ing the pay­out, so it should still be -EV even af­ter wak­ing up, but I haven’t quite worked out a way to in­clude that in the num­bers...

• The sec­ond sum still seems wrong. Here it is:

“How­ever, be­fore the ex­per­i­ment, you calcu­late the gen­eral util­ity of the con­di­tional strat­egy “Re­ply ‘Yes’ to the ques­tion if you wake up in a green room” as (50% ((18 +\$1) + (2 -\$3))) + (50% ((18 -\$3) + (2 +\$1))) = -\$20. You want your fu­ture selves to re­ply ‘No’ un­der these con­di­tions.”

The sum given is the one you would perform if you did not know which room you woke up in. Surely a differ­ent sum is ap­pro­pri­ate with the ad­di­tional ev­i­dence that you awoke in a green room.

Incidentally, this problem seems far too complicated! I feel like a programmer faced with a bug report, who has failed to find some simple code that nonetheless manages to reproduce the problem. Simplify, simplify, simplify!

• In this com­ment:

http://lesswrong.com/lw/17d/forcing_anthropics_boltzmann_brains/138u

I put for­ward my view that the best solu­tion is to just max­i­mize to­tal util­ity, which cor­rectly han­dles the forc­ing an­throp­ics case, and ex­pressed cu­ri­os­ity as to whether it would han­dle the out­law­ing an­throp­ics case.

It now seems my solu­tion does cor­rectly han­dle the out­law­ing an­throp­ics case, which would seem to be a data point in its fa­vor.

• Max­i­miz­ing to­tal he­do­nic util­ity fails the out­law­ing an­throp­ics case: sub­sti­tute he­dons for pa­per­clips.

• I don’t think I un­der­stand your claim here. We agree that my solu­tion works if you mea­sure util­ity in pa­per­clips? Why do you think it fails if you mea­sure util­ity in he­dons?

• Assume that each agent has his own game (that is, one game for each agent). That is, there are overall 18 (or 2) games, depending on the result of the coin flip.

Then the first calculation would be correct in every respect, and it makes sense to say yes from a global point of view. (And also, with any other reward matrix, the dynamic update would be consistent with the a priori decision all the time.)

This shows that the error made by the agent was to implicitly assume that he has his own game.

• How about give all of your po­ten­tial clones a vote, even though you can’t com­mu­ni­cate?

So, in one case, 18 of you would say “Yes, take the bet!” and 2 would say “No, let me keep my money.” In the other case, 18 would say no and two would say yes. In either case, of course, you’re one of the ones who would vote yes. OK, that leaves us tied. So why not let everyone’s vote be proportional to what they stand to gain/lose? That leaves us with 20 × −\$3 vs. 20 × \$1. Don’t take the bet.

(Yes, I re­al­ize half the peo­ple that just voted above don’t ex­ist. We just don’t know which half...)

• As it’s been pointed out, this is not an anthropic problem; however, there still is a paradox. I may be stating the obvious, but the root of the problem is that you’re doing something fishy when you say that the other people will think the same way and that your decision will determine theirs.

The proper way to make a de­ci­sion is to have a prob­a­bil­ity dis­tri­bu­tion on the code of the other agents (which will in­clude their prior on your code). From this I be­lieve (but can’t prove) that you will take the cor­rect course of ac­tion.

New­comb like prob­lem fall in the same cat­e­gory, the trick is that there is always a be­lief about some­one’s de­ci­sion mak­ing hid­den in the prob­lem.

• EDIT: at first I thought this was equiv­a­lent, but then I tried the num­bers and re­al­ized it’s not.

1. I’ll flip a coin to choose which roulette wheel to spin. If it comes up heads, I’ll spin a wheel that’s 90% green and 10% red. If it comes up tails, a wheel that’s 10% green and 90% red.

2. I won’t show you the wheel or the coin (at this point) but I’ll tell you which color came up.

3. If it’s green, you can bet on the coin­flip: win \$3 for heads and lose \$13 for tails.

If the color is green, do you take the bet?

EDIT: After play­ing with the num­bers, I think rea­son it’s not equiv­a­lent is that in the rooms, there are always some of you who see green. I still think it’s pos­si­ble to cre­ate an equiv­a­lent situ­a­tion in real life, with­out copy­ing peo­ple. Maybe if you had a group of peo­ple draw lots and all the peo­ple who get green vote on whether to bet on which lot they were draw­ing from.

• [EDIT:] Warning: This post was based on a misunderstanding of the OP. Thanks orthonormal for pointing out the mistake! I leave this post here so that the replies stay in context.

I think that de­ci­sion ma­trix of the agent wak­ing up in green room is not com­plete: it should con­tain the out­come of los­ing \$50 if the an­swers are not con­sis­tent.

Therefore, it would compute that even if the probability that the coin was flipped to 1 is 90%, it still does not make sense to answer “yes”, since two other copies would answer “no” and therefore the penalty for not giving a uniform answer will outweigh the potential win of \$5.60. (Even without the penalty, the agent could infer that there were two dissenting copies of itself in that case and he has no way to generate all the necessary votes to get the money.)

The er­ror of the agent is not the P=90% es­ti­mate, but the im­plicit as­sump­tion that he is the only one to in­fluence the out­come.

• The copies in red rooms don’t get to vote in this setup.

• Thanks for point­ing that out. Now I un­der­stand the prob­lem.

How­ever, I still think that the mis­take made by the agent is the im­plicit as­sump­tion the he is the only one in­fluenc­ing the out­come.

Since all of the copies as­sume that they solely de­cide the out­come, they over­es­ti­mate the re­ward af­ter the an­thropic up­date (each of the copies claim the whole re­ward for his de­ci­sion, al­though the de­ci­sion is col­lec­tive and each vote is nec­es­sary).

• By the way, please don’t delete a com­ment if you change your mind or re­al­ize an er­ror; it makes the con­ver­sa­tion difficult for oth­ers to read. You can always put in an edit (and mark it as such) if you want.

I’d only delete one of my com­ments if I felt that its pres­ence ac­tu­ally harmed read­ers, and that there was no dis­claimer I could add that would pre­vent that harm.

• OK, sorry. (In this spe­cial case, I re­mem­ber think­ing that your re­mark was perfectly un­der­stand­able even with­out the con­text.)

• Is there any version of this post that doesn’t involve technologies we don’t have? If not, then might the resolution to this paradox be that the copying technology assumed to exist can’t exist, because if it did it would give rise to a logical inconsistency?

• Cute.

You may be able to trans­late into the lan­guage of “wake, query, in­duce am­ne­sia”—many copies would cor­re­spond to many wak­ings.

• No, the dilemma de­pends on hav­ing many copies. You’re try­ing to op­ti­mize the out­come av­er­aged over all copies (be­fore the copies are made), be­cause you don’t know which copy “you” will “be”.

In the no-copies /​ am­ne­sia ver­sion, the up­date­less ap­proach is clearly cor­rect. You have no data to up­date on—awak­en­ing in a green room tells you noth­ing about the coin tosses be­cause ei­ther way you’d wake up in a green room at least once (and you for­get about it, so you don’t know how many times it hap­pened). There­fore you will always re­fuse to play.

• But we don’t have the type of amnesia drugs required to manifest the Sleeping Beauty problem, and perhaps there is something about consciousness that would prevent them from ever being created. (Isn’t there some law of physics that precludes the total destruction of information?)

• I don’t un­der­stand—what type of am­ne­sia drug is re­quired? For ex­am­ple, this lab:

http://memory.psy.cmu.edu/

apparently routinely does experiments inducing temporary amnesia using a drug called midazolam. In general, I was under the impression that a wide variety of drugs have side effects of various degrees and kinds of amnesia, including both anterograde and retrograde.

Your pro­posal that con­scious­ness might be con­served, and more­over that this might be proved by arm­chair rea­son­ing seems a bit far­fetched. Are you:

1. just spec­u­lat­ing idly?

2. se­ri­ously pur­su­ing this hy­poth­e­sis as the best av­enue to­wards re­solv­ing EY’s puz­zle?

3. pur­su­ing some crypto-re­li­gious (i.e. “con­scious­ness con­served”=>”eter­nal life”) agenda?

• My first com­ment was (2) the sec­ond (1).

If DanArmak’s comment is correct, then it isn’t important for my original comment whether there exist amnesia drugs.

If your post is cor­rect then my sec­ond com­ment is in­cor­rect.

• Micro­scopic re­versibil­ity pro­hibits any de­struc­tion of the in­for­ma­tion nec­es­sary to run things back­wards—and that’s all the in­for­ma­tion in the uni­verse as far as we know.

• Per­haps we should look at Dresher’s Carte­sian Cam­corder as a way of re­duc­ing con­scious­ness, and thereby elimi­nate this para­dox.

Or, to turn it around, this para­dox is a lit­mus test for the­o­ries of con­scious­ness.

• Edit: pre­sum­ably there’s an an­swer already dis­cussed that I’m not aware of, prob­a­bly com­mon to all games where Omega cre­ates N copies of you. (Since so many of them have been dis­cussed here.) Can some­one please point me to it?

I’m hav­ing difficul­ties ig­nor­ing the in­her­ent value of hav­ing N copies of you cre­ated. The sce­nario as­sumes that the copies go on ex­ist­ing af­ter the game, and that they each have the same amount of utilons as the origi­nal (in­stead of a di­vi­sion of some kind).

For sup­pose the copies are short lived: Omega de­stroys them af­ter the game. (Hu­man-like agents will be de­terred by the nega­tive util­ity of hav­ing N-1 copies cre­ated just to ex­pe­rience death.) Then ev­ery copy effec­tively de­cides for it­self, be­cause its siblings won’t get to keep their utilons for long, and the strat­egy “play game iff room is green” is valid.

Now sup­pose the copies are long lived. It’s very likely that cre­at­ing them has sig­nifi­cant util­ity value.

For goals on which the N copies can co­op­er­ate (e.g. build­ing pa­per­clips or ac­quiring knowl­edge), the to­tal re­sources available (and so util­ity) will have in­creased, of­ten lin­early (N times), some­times a lot more. An AI might de­cide to pool all re­sources /​ com­put­ing power and de­stroy N-1 copies im­me­di­ately af­ter the game is played.

For goals on which the copies com­pete (e.g. prop­erty and iden­tity), util­ity will be much re­duced by in­creased com­pe­ti­tion.

In the ab­sence of any com­mon or con­tested goals, all copies will prob­a­bly profit from trade and spe­cial­iza­tion.

The util­ity out­come of hav­ing N copies cre­ated prob­a­bly far out­weighs the game stakes, and cer­tainly can’t be ig­nored.

• Um, you get copied N times re­gard­less of your choice, so the util­ity of be­ing copied shouldn’t fac­tor into your choice. I’m afraid I don’t un­der­stand your ob­jec­tion.

• The more I think about this, the more I sus­pect that the prob­lem lies in the dis­tinc­tion be­tween quan­tum and log­i­cal coin-flips.

Sup­pose this ex­per­i­ment is car­ried out with a quan­tum coin-flip. Then, un­der many-wor­lds, both out­comes are re­al­ized in differ­ent branches. There are 40 fu­ture selves--2 red and 18 green in one world, 18 red and 2 green in the other world—and your duty is clear:

(50% ((18 +\$1) + (2 -\$3))) + (50% ((18 -\$3) + (2 +\$1))) = -\$20.

Don’t take the bet.

So why Eliezer’s in­sis­tence on us­ing a log­i­cal coin-flip? Be­cause, I sus­pect, it pre­vents many-wor­lds from be­ing rele­vant. Log­i­cal coin-flips don’t cre­ate pos­si­ble wor­lds the way quan­tum coin-flips do.

But what is a log­i­cal coin-flip, any­way?

Us­ing the ex­am­ple given at the top of this post, an agent that was not only ra­tio­nal but clever would sit down and calcu­late the 256th bi­nary digit of pi be­fore an­swer­ing. Pick­ing a more difficult log­i­cal coin-flip just makes the calcu­la­tion more difficult; a more in­tel­li­gent agent could solve it, even if you can’t.

So there are two differ­ent kinds of log­i­cal coin-flips: the sort that are in­dis­t­in­guish­able from quan­tum coin-flips even in prin­ci­ple, in which case they ought to cause the same sort of branch­ing events un­der many-wor­lds—and the sort that are solv­able, but only by some­one smarter than you.

If you’re not smart enough to solve the log­i­cal coin-flip, you may as well treat it as a quan­tum coin-flip, be­cause it’s already been es­tab­lished that you can’t pos­si­bly do bet­ter. That doesn’t mean your de­ci­sion al­gorithm is flawed; just that if you were more pow­er­ful, it would be more pow­er­ful too.