If a tree falls on Sleeping Beauty...

Several months ago, we had an interesting discussion about the Sleeping Beauty problem, which runs as follows:

Sleeping Beauty volunteers to undergo the following experiment. On Sunday she is given a drug that sends her to sleep. A fair coin is then tossed just once in the course of the experiment to determine which experimental procedure is undertaken. If the coin comes up heads, Beauty is awakened and interviewed on Monday, and then the experiment ends. If the coin comes up tails, she is awakened and interviewed on Monday, given a second dose of the sleeping drug, and awakened and interviewed again on Tuesday. The experiment then ends on Tuesday, without flipping the coin again. The sleeping drug induces a mild amnesia, so that she cannot remember any previous awakenings during the course of the experiment (if any). During the experiment, she has no access to anything that would give a clue as to the day of the week. However, she knows all the details of the experiment.

Each interview consists of one question, “What is your credence now for the proposition that our coin landed heads?”

In the end, the fact that there were so many reasonable-sounding arguments for both sides, and so much disagreement about a simple-sounding problem among above-average rationalists, should have set off major alarm bells. Yet only a few people pointed this out; most commenters, including me, followed the silly strategy of trying to answer the question, and I did so even after I noticed that my intuition could see both answers as being right depending on which way I looked at it, which in retrospect would have been a perfect time to say “I notice that I am confused” and backtrack a bit…

And on reflection, considering my confusion rather than trying to consider the question on its own terms, it seems to me that the problem (as it’s normally stated) is completely a tree-falling-in-the-forest problem: a debate about the normatively “correct” degree of credence, which only seemed like an issue because any conclusions about what Sleeping Beauty “should” believe weren’t paying their rent; they were disconnected from any expectation of feedback from reality about how right they were.

It may seem either implausible or alarming that as fundamental a concept as probability can be the subject of such debates, but remember that the “If a tree falls in the forest…” argument only comes up because the understandings of “sound” as “vibrations in the air” and as “auditory processing in a brain” coincide often enough that most people other than philosophers have better things to do than argue about which is more correct. Likewise, in situations that we actually encounter in real life, where we must reason or act on incomplete information, long-run frequency is generally about the same as optimal decision-theoretic weighting. If you’re given the question “If you have a bag containing a white marble and two black marbles, and another bag containing two white marbles and a black marble, and you pick a bag at random and pick a marble out of it at random and it’s white, what’s the probability that you chose the second bag?” then you can just answer it as given, without worrying about specifying a payoff structure, because no matter how you reformulate it in terms of bets and payoffs, if your decision-theoretic reasoning talks about probabilities at all then there’s only going to be one sane probability you can put into it. You can assume that answers to non-esoteric probability problems will be able to pay their rent if they are called upon to do so, and so you can do plenty within pure probability theory long before you need your reasoning to generate any decisions.
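(For the record, the marble question really does have just one sane answer, and a quick Bayes’-rule calculation gives it; here’s a minimal sketch using exact fractions, with nothing assumed beyond the problem statement:)

```python
from fractions import Fraction

# Bag 1 holds 1 white and 2 black marbles; bag 2 holds 2 white and 1 black.
# A bag is picked at random (probability 1/2 each), then a marble at random.
prior = Fraction(1, 2)
p_white_given_bag1 = Fraction(1, 3)
p_white_given_bag2 = Fraction(2, 3)

# P(white) by the law of total probability, then P(bag 2 | white) by Bayes.
p_white = prior * p_white_given_bag1 + prior * p_white_given_bag2
p_bag2_given_white = prior * p_white_given_bag2 / p_white

print(p_bag2_given_white)  # 2/3
```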

But when you start getting into problems where there may be multiple copies of you and you don’t know how their responses will be aggregated — or, more generally, where you may or may not be scored on your probability estimate multiple times or may not be scored at all, or when you don’t know how it’s being scored, or when there may be other agents following reasoning correlated with but not necessarily identical to yours — then I think talking too much about “probability” directly will cause different people to be solving different problems, given the different ways they will implicitly imagine being scored on their answers so that the question “What subjective probability should be assigned to x?” has any normatively correct answer. Here are a few ways that the Sleeping Beauty problem can be explicitly framed as a decision problem:

Each interview consists of Sleeping Beauty guessing whether the coin came up heads or tails, and being given a dollar if she was correct. After the experiment, she will keep all of her aggregate winnings.

In this case, intending to guess heads has an expected value of $.50 (because if the coin came up heads, she’ll get $1, and if it came up tails, she’ll get nothing), and intending to guess tails has an expected value of $1 (because if the coin came up heads, she’ll get nothing, and if it came up tails, she’ll get $2). So she should intend to guess tails.
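Those two expected values are easy to verify by enumerating the two equally likely coin outcomes; a minimal sketch (the exact-fraction bookkeeping is just for tidiness, not part of the problem):

```python
from fractions import Fraction

# A heads coin yields one interview (Monday); a tails coin yields two
# (Monday and Tuesday). Each correct guess at an interview pays $1.
def expected_winnings(guess):
    ev = Fraction(0)
    for coin, interviews in [("heads", 1), ("tails", 2)]:
        payoff = interviews if guess == coin else 0
        ev += Fraction(1, 2) * payoff  # fair coin: each branch has weight 1/2
    return ev

print(expected_winnings("heads"))  # 1/2
print(expected_winnings("tails"))  # 1
```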

Each interview consists of Sleeping Beauty guessing whether the coin came up heads or tails. After the experiment, she will be given a dollar if she was correct on Monday.

In this case, she should clearly be indifferent (which you can call “.5 credence” if you’d like, but it seems a bit unnecessary).

Each interview consists of Sleeping Beauty being told whether the coin landed on heads or tails, followed by one question, “How surprised are you to hear that?” Should Sleeping Beauty be more surprised to learn that the coin landed on heads than that it landed on tails?

I would say no; this seems like a case where the simple probability-theoretic reasoning applies. Before the experiment, Sleeping Beauty knows that a coin is going to be flipped, and she knows it’s a fair coin, and going to sleep and waking up isn’t going to change anything she knows about it, so she should not be even slightly surprised one way or the other. (I’m pretty sure that surprisingness has something to do with likelihood. I may write a separate post on that, but for now: after finding out whether the coin did come up heads or tails, the relevant question is not “What is the probability that the coin came up {heads,tails} given that I remember going to sleep on Sunday and waking up today?”, but “What is the probability that I’d remember going to sleep on Sunday and waking up today given that the coin came up {heads,tails}?”, and on the latter question the two outcomes are equally likely, so neither should be surprising at all.)

Each interview consists of one question, “What is the limit of the frequency of heads as the number of repetitions of this experiment goes to infinity?”

Here, of course, the right answer is “.5, and I hope that’s just a hypothetical…”

Each interview consists of one question, “What is your credence now for the proposition that our coin landed heads?”, and the answer given will be scored according to a logarithmic scoring rule, with the aggregate result corresponding to the number of utilons (converted to dollars, let’s say) she will be penalized after the experiment.

In this case it is optimal to bet 1/3 that the coin came up heads, 2/3 that it came up tails:

Bet on heads:    1/2                      1/3
Actual flip:     Heads      Tails         Heads        Tails
Monday:          −1 bit     −1 bit        −1.585 bits  −0.585 bits
Tuesday:         n/a        −1 bit        n/a          −0.585 bits
Total:           −1 bit     −2 bits      −1.585 bits  −1.17 bits
Expected:        −1.5 bits               −1.3775 bits

(If you’re not used to the logarithmic scoring rule enough to trust that 1/3 is better than every other option too, you can check this by graphing y = (log2 x + 2 log2(1 − x))/2, where x is the probability you assign to heads and y is expected utility.)
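Instead of graphing it, you can also just maximize that expected score numerically; a quick sketch (the grid search is one illustrative way to check, not the only one):

```python
from math import log2

# Expected score, in bits, for announcing credence x in heads at every
# interview: heads (prob 1/2) is scored once, tails (prob 1/2) twice.
def expected_score(x):
    return (log2(x) + 2 * log2(1 - x)) / 2

# Grid search over credences strictly between 0 and 1.
best = max((i / 1000 for i in range(1, 1000)), key=expected_score)
print(best)                   # the grid point nearest 1/3
print(expected_score(1 / 3))  # about -1.377 bits, matching the table
```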

So I hope it is self-evident that reframing seemingly-paradoxical probability problems as decision problems generally makes them trivial, or at least agreeably solvable and non-paradoxical. What may be more controversial is my claim that this is satisfactory not as a circumvention but as a dissolution of the question “What probability should be assigned to x?”, when you have a clear enough idea of why you’re wondering about the “probability.” Can we really taboo concepts like “probability” and “plausibility” and “credence”? I should certainly hope so; judgments of probability had better be about something, and not just rituals of cognition that we use because it seems like we’re supposed to rather than because it wins.

But when I try to replace “probability” with what I mean by it, and when I mean it in any normative sense — not, like, out there in the territory, but just “normative” by whatever standard says that assigning a fair coin flip a probability of .5 heads tends to be a better idea than assigning it a probability of .353289791 heads — then I always find myself talking about optimal bets or average experimental outcomes. Can that really be all there is to probability as degree of belief? Can’t we enjoy, for its own sake, the experience of having maximally accurate beliefs given whatever information we already have, even in circumstances where we don’t get to test it any further? Well, yes and no; if your belief is really about anything, then you’ll be able to specify, at the very least, a ridiculous hypothetical experiment that would give you information about how correct you are, or a ridiculous hypothetical bet that would give you an incentive to optimally solve a more well-defined version of the problem. And if you’re working with a problem where it’s at all unclear how to do this, it is probably best to backtrack and ask what problem you’re trying to solve, why you’re asking the question in the first place. So when in doubt, ask for decisions rather than probabilities. In the end, the point (aside from signaling) of believing things is (1) to allow you to effectively optimize reality for the things you care about, and (2) to allow you to be surprised by some possible experiences and not others so you get feedback on how well you’re doing. If a belief does not do either of those things, I’d hesitate to call it a belief at all; yet that is what the original version of the Sleeping Beauty problem asks you to do.

Now, it does seem to me that following the usual rules of probability theory (the ones that tend to generate optimal bets in that strange land where intergalactic superintelligences aren’t regularly making copies of you and scientists aren’t knocking you out and erasing your memory) tells Sleeping Beauty to assign .5 credence to the proposition that the coin landed on heads. Before the experiment has started, Sleeping Beauty already knows what she’s going to experience — waking up and pondering probability — so if she doesn’t already believe with 2/3 probability that the coin will land on tails (which would be a strange thing to believe about a fair coin), then she can’t update to that after experiencing what she already knew she was going to experience. But in the original problem, when she is asked “What is your credence now for the proposition that our coin landed heads?”, a much better answer than “.5” is “Why do you want to know?” If she knows how she’s being graded, then there’s an easy correct answer, which isn’t always .5; if not, she will have to do her best to guess what type of answer the experimenters are looking for; and if she’s not being graded at all, then she can say whatever the hell she wants (acceptable answers would include “0.0001,” “3/2,” and “purple”).

I’m not sure if there is more to it than that. Presumably the “should” in “What subjective probability should I assign x?” isn’t a moral “should,” but more of an “if-should” (as in “If you want x to happen, you should do y”), and if the question itself seems confusing, that probably means that under the circumstances, the implied “if” part is ambiguous and needs to be made explicit. Is there some underlying true essence of probability that I’m neglecting? I don’t know, but I am pretty sure that even if there were one, it wouldn’t necessarily be the thing we’d care about knowing in these types of problems anyway. You want to make optimal use of the information available to you, but it has to be optimal for something.

I think this principle should help to clarify other anthropic problems. For example, suppose Omega tells you that she just made an exact copy of you and everything around you, enough that the copy of you wouldn’t be able to tell the difference, at least for a while. Before you have a chance to gather more information, what probability should you assign to the proposition that you yourself are the copy? The answer is non-obvious, given that there already is a huge and potentially infinite number of copies of you, and it’s not clear how adding one more copy to the mix should affect your belief about how spread out you are over what worlds. On the other hand, if you’re Dr. Evil and you’re in your moon base preparing to fire your giant laser at Washington, DC when you get a phone call from Austin “Omega” Powers, and he tells you that he has made an exact replica of the moon base on exactly the spot at which the moon laser is aimed, complete with an identical copy of you (and an identical copy of your identical miniature clone) receiving the same phone call, and that its laser is trained on your original base on the moon, then the decision is a lot easier: hold off on firing your laser and gather more information or make other plans. Without talking about the “probability” that you are the original Dr. Evil or the copy or one of the potentially infinite Tegmark duplicates in other universes, we can simply look at the situation from the outside and see that if you do fire your laser then you’ll blow both of yourselves up, and that if you don’t fire your laser then you have some new competitors at worst and some new allies at best.

So: in problems where you are making one judgment that may be evaluated more or fewer than one time, and where you won’t have a chance to update between those evaluations (e.g. because there are multiple copies of you, or because your memory will be erased, so that your one judgment is evaluated multiple times), just ask for decisions and leave probabilities out of it to whatever extent possible.

In a followup post, I will generalize this point somewhat and demonstrate that it helps solve some problems that remain confusing even when they specify a payoff structure.