# Paradoxes in all anthropic probabilities

In a previous post, I re-discovered full non-indexical conditioning (FNC), an anthropic theory I'm ashamed to say I had once known and then forgot. Thanks to Wei Dai for reminding me of that.

There is a problem with FNC, though. In fact, there are problems with all anthropic probability theories. Both FNC and SIA violate conservation of expected evidence: you can be in a situation where you know with certainty that your future probability will differ from your current one in a particular direction. SSA has a different problem: it allows you to make decisions that change the probability of past events.
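For contrast, ordinary Bayesian updating does conserve expected evidence: whatever you might observe, your current credence equals the probability-weighted average of your possible future credences. A minimal check (the numbers are arbitrary illustrations):

```python
from fractions import Fraction

# Ordinary Bayesian updating conserves expected evidence: the expectation
# of your posterior, taken over possible observations, equals your prior.
prior = Fraction(1, 3)            # P(H); arbitrary illustrative numbers
p_e_h = Fraction(9, 10)           # P(E | H)
p_e_not_h = Fraction(2, 10)       # P(E | not-H)

p_e = prior * p_e_h + (1 - prior) * p_e_not_h       # P(E)
post_if_e = prior * p_e_h / p_e                      # P(H | E)
post_if_not_e = prior * (1 - p_e_h) / (1 - p_e)      # P(H | not-E)

expected_posterior = p_e * post_if_e + (1 - p_e) * post_if_not_e
print(expected_posterior == prior)   # True: no predictable drift
```

The posterior moves up on E and down on not-E, but never predictably in one direction; that is exactly the property the anthropic theories below fail to preserve.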

These paradoxes are presented to illustrate the fact that anthropic probability is not a coherent concept, and that dealing with multiple copies of a single agent is in the realm of decision theory.

## FNC and evidence non-conservation

Let's presume that the bandwidth of the human brain is B bits per minute, so there are 2^B possible observation sequences per minute. Then we flip a coin. Upon it coming up heads, we create n_h identical copies of you. Upon it coming up tails, we create n_t copies of you, with the numbers chosen so that 2^B ≪ n_h ≪ 2^(2B) ≪ n_t ≪ 2^(3B).

Then if we assume that the experiences of your different copies are random, for the first minute, you will give equal probability to heads and tails. That's because n_h and n_t both dwarf the 2^B possible observation sequences, so there is almost certainly a being with exactly the same observations as you, in both universes.

After two minutes, there are 2^(2B) possible observation sequences, so you will shift to odds of roughly 2^(2B)/n_h to 1 in favour of tails: you're certain there's a being with your observations in the tails universe, and, with probability roughly n_h/2^(2B), there's one in the heads universe.

After a full three minutes, a being with your exact observations is unlikely in either universe (probability roughly n_t/2^(3B) for tails and n_h/2^(3B) for heads), so you will finally stabilise on odds of n_t to n_h in favour of tails, and stay there.

Thus, during the first minute, you know that FNC will be giving you different odds in the coming minutes, and you can predict the direction those odds will take.

If the observations are non-random, then the divergence will be slower, and the FNC odds will be changing for a longer period.
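To make the drift concrete, here is a toy FNC computation. The figures (a bandwidth of 4 bits per minute, 2^6 copies on heads, 2^10 on tails) are invented for illustration; FNC weights each world by the chance that at least one copy there has your exact observation string, assuming random observations:

```python
# Toy FNC computation with invented figures: bandwidth B = 4 bits/minute,
# 2**6 copies on heads, 2**10 copies on tails.
B = 4
n_heads, n_tails = 2**6, 2**10

def p_someone_matches(n_copies, minutes):
    """P(at least one of n_copies has your exact observation string after
    `minutes` minutes), with each copy's observations uniformly random."""
    space = 2 ** (B * minutes)              # possible observation strings
    return 1 - (1 - 1 / space) ** n_copies

for t in range(1, 5):
    odds = p_someone_matches(n_tails, t) / p_someone_matches(n_heads, t)
    print(f"minute {t}: FNC odds of tails = {odds:.2f} : 1")
```

The printed odds start near 1:1, drift upward over the middle minutes, and settle near the copy ratio 1024/64 = 16:1, which is exactly the predictable movement described above.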

## SIA and evidence non-conservation

If we use SIA instead of FNC, then, in the above situation, the odds of tails will be n_t to n_h from the start and will stay there, so that setting is not an issue for SIA.

To show a problem with SIA, assume there is one copy of you, that we flip a coin, and, if it comes out tails, we will immediately duplicate you (putting the duplicate in a separate room). If it comes out heads, we will wait a minute before duplicating you.

Then SIA implies odds of 2:1 in favour of tails during that minute (two copies of you then exist in the tails world, against one in the heads world), but equal odds afterwards.
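A sketch of that calculation: SIA weights each world by the number of copies there whose situation is subjectively indistinguishable from yours, and the fair-coin prior cancels out of the ratio:

```python
# SIA: P(world) is proportional to (prior) x (number of subjectively
# indistinguishable copies of you in that world); fair coin, so priors cancel.
def sia_p_tails(copies_in_heads, copies_in_tails):
    return copies_in_tails / (copies_in_heads + copies_in_tails)

during_first_minute = sia_p_tails(1, 2)   # heads-world copy not yet duplicated
after_duplication = sia_p_tails(2, 2)

print(during_first_minute)   # 2/3, i.e. odds of 2:1 in favour of tails
print(after_duplication)     # 1/2, i.e. equal odds
```

During the minute you already know your credence will predictably fall back to 1/2, which is the evidence-conservation violation at issue.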

You can't get around this with tweaked reference classes: one of the good properties of SIA is that it works the same whatever the reference class, as long as it includes agents currently subjectively indistinguishable from you.

## SSA and changing the past

SSA has a lot of issues. It has the whole problem with reference classes; these are hard to define coherently, and agents in different reference classes with the same priors can agree to disagree (for instance, if we expect that there will be a single gender in the future, then if I'm in the reference class of males, I expect that single gender will be female—and the opposite will be expected for someone in the reference class of females). It violates causality: it assigns different probabilities to an event, purely depending on the future consequence of that event.

But I think I'll focus on another way it violates causality: your current actions can change the probability of past events.

Suppose that the proverbial coin is flipped, and that if it comes up heads, one version of you is created, and, if it comes up tails, N copies of you are created. You are the last of these copies: either the only one in the heads world, or the last one in the tails world, you don't know which. Under SSA, being the last copy has probability 1 in the heads world but only 1/N in the tails world, so you assign odds of N:1 in favour of heads.

You have a convenient lever, however. If you pull it, then M future copies of you will be created, in the heads world only (nothing will happen in the tails world). Pulling it makes being the last of the original copies a 1/(M+1) event in the heads world, so if you pull it, the odds of the coin being tails—an event long past, and known to be past—will shift from 1:N to (M+1):N in its favour.
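A sketch of the lever arithmetic, with N and M standing in for whatever large numbers one picks: under SSA, the likelihood of "I am the last originally created copy" in a world is one over the number of reference-class members in that world:

```python
from fractions import Fraction

# SSA likelihood of "I am the last originally created copy" is
# 1 / (number of reference-class members in that world).
# N and M are illustrative stand-ins, not figures from the post.
def ssa_p_tails(N, M):
    like_heads = Fraction(1, 1 + M)   # original you plus M lever-created copies
    like_tails = Fraction(1, N)       # last of the N tails-world copies
    return like_tails / (like_heads + like_tails)   # fair-coin prior cancels

N = 1000
print(ssa_p_tails(N, M=0))     # lever untouched: 1/1001
print(ssa_p_tails(N, M=999))   # lever pulled: 1/2
```

Pulling the lever hard enough (M = N - 1) drags the long-settled coin flip all the way back to even odds, which is the causality violation being pointed at.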

• I'd also love some clarifications:

> For instance, if we expect that there will be a single gender in the future, then if I'm in the reference class of males, I expect that single gender will be female—and the opposite will be expected for someone in the reference class of females

Further:

> It violates causality: it assigns different probabilities to an event, purely depending on the future consequence of that event.

• Why shouldn't the probability predictably update for the Self-Indication Assumption? This version of probability isn't supposed to refer to a property of the world, but some combination of the state of the world and the number of agents in various scenarios.

Regarding the self-sampling assumption, if you knew ahead of time that the lever would be pulled, then you could have updated before it was pulled. If you didn't know that the lever would be pulled, then you gained information: specifically, about the number of copies in the heads case. It's not that you knew there was only one copy in the heads world, it's just that you thought there was only one copy, because you didn't think a mechanism like the lever would exist and be pulled. In fact, if you knew that the probability of the lever existing and then being pulled was 0, then the scenario would contradict itself.

As Charlie Steiner notes, it looks like you've lost information, but that's only because you've gained information that your information is inaccurate. Here's an analogy: suppose I give you a scientific paper that contains what appears to be lots of valuable information. Next I tell you that the paper is a fraud. It seems like you've lost information, as you are less certain about the world, but you've actually gained it. I've written about this before for the Dr Evil problem.

• You are deciding whether or not to pull the lever. The probability of a past event, known to be in the past, depends on your actions now.

To use your analogy, it's you deciding whether to label a scientific paper inaccurate or not—your choice of label, not anything else, makes it inaccurate or not.

• Oh, actually I think what I wrote above was wrong. The self-sampling assumption is supposed to preserve probabilities like this. It's the self-indication assumption that is relative to agents.

That said, I have a different objection: I'm confused about why pulling the lever would change the odds. Your reference class is all copies that were the <last copy that was originally created>. So any further clones you create fall outside the reference class.

If you want to set your reference class to <the last copy that was created at any point>, then:

• Heads case—first round: if you pull the lever then you fall outside the reference class

• Heads case—second round: the lever doesn't do anything anymore, as it has already been used

• Tails case: pulling the lever does nothing

So you don't really have the option to pull the lever to create clones. If you were using a different reference class, what was it?

• Does your post also show that selfish preferences are incoherent, because any selfish preference must rely on a weighting of your copies and every such weighting has weird properties?

• I agree with that, but I don't think the post shows it directly. My video https://www.youtube.com/watch?v=aiGOGkBiWEo does look at two possible versions of selfishness; my own position is that selfishness is incoherent, unless it's extreme observer-moment selfishness, which is useless.

• Interesting! Can you explain that in text?

• For most versions of selfishness, if you're duplicated, then the two copies will have divergent preferences. However, if one of the copies is destroyed during duplication, this just counts as transportation. So the previous self values either future copy, if only one exists. Therefore it seems incoherent for the previous self not to value both future copies if both exist, and hence for the two future copies not to value each other.

(Btw, the logical conclusion is that the two copies have the same preferences, not that the two agents must value each other—it's possible that copy A only cares about themselves, and copy B only cares about copy A.)

• The probability is only different when you think the world is in a different state. This no more violates conservation of expected evidence than putting the kettle on violates conservation of expected evidence by predictably changing the probability of hot water. The weird part is which part of the world is different.

• All the odds are about the outcome of a past coin flip, known to be in the past. This should not change in the ways described here.

• Hm, you're right, I guess there is something weird here (I'm not talking about FNC—I think that part is weird too—I mean "ordinary" anthropic probabilities).

If I had to try to put my finger on what's different, there is an apparent deviation from normal Bayesian updating. Normally, when you add some sense-data to your big history-o'-sense-data, you update by setting all incompatible hypotheses to zero and renormalizing what's left. But this "anthropic update" seems to add hypotheses rather than only removing them—when you're duplicated, there are now more possible explanations for your sense-data, rather than the normal case of fewer and fewer possible explanations.

I think a Cartesian agent wouldn't do anthropic reasoning, though it might learn to simulate it if put through a series of Sleeping Beauty type games.

• > To show a problem with SIA, assume there is one copy of you, that we flip a coin, and, if it comes out tails, we will immediately duplicate you (putting the duplicate in a separate room). If it comes out heads, we will wait a minute before duplicating you.

We could make it simpler: flip a coin, and duplicate you in a minute if it comes up heads. A.k.a. Sleeping Beauty. We already know that SIA forces you to update when you go through an anthropic "split" (like waking up in Sleeping Beauty), not just when you learn something.

• With regards to your SIA objection, I think it is important to clarify exactly what we mean by evidence conservation here. The usual formulation is something like "If I expect to assign credence X to proposition P at future time T, then I should assign credence X to proposition P right now, unless by time T I expect to have lost information in a predictable way". Now if you are going to be duplicated, then it is not exactly clear what you mean by "I expect to assign … at future time T", since there will be multiple copies of you that exist at time T. So, maybe you want to get around this by saying that you are referring to the "original" version of you that exists at time T, rather than any duplicates. But then the problem seems to be that by waiting, you will actually lose information in a predictable way! Namely, right now you know that you are not a duplicate, but the future version of you will not know that it is not a duplicate. Since you are losing information, it is not surprising that your probability will predictably change. So, I don't think SIA violates evidence conservation.

Incidentally, here is an intuition pump that I think supports SIA: suppose I flip a coin and if it is heads then I kill you, tails I keep you alive. Then if you are alive at the end of the experiment, surely you should assign 100% probability to tails (discounting model uncertainty of course). But you could easily reason that this violates evidence conservation: you predictably know that all future agents descended from you will assign 100% probability to tails, while you currently only assign 50% to tails. This points to the importance of precisely defining and analyzing evidence conservation as I have done in the previous paragraph. Additionally, if we generalize to the setting where I make/keep X copies of you if the coin lands heads and Y copies if tails, then SIA gives the elegant formula X/(X+Y) as the probability for heads after the experiment, and it is nice that our straightforward intuitions about the cases X=0 and Y=0 provide a double-check for this formula.
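The X/(X+Y) formula can be double-checked with a frequency-of-copies simulation (my construction, not the commenter's): flip many coins, create X copies on heads and Y on tails, and ask what fraction of all resulting copies live in a heads world:

```python
import random

# Frequency-of-copies check of the SIA formula: make X copies on heads,
# Y copies on tails, then ask what fraction of all copies saw heads.
def heads_fraction(X, Y, trials=100_000, seed=0):
    rng = random.Random(seed)
    heads_copies = tails_copies = 0
    for _ in range(trials):
        if rng.random() < 0.5:
            heads_copies += X
        else:
            tails_copies += Y
    return heads_copies / (heads_copies + tails_copies)

print(heads_fraction(3, 1))   # close to 3 / (3 + 1) = 0.75
print(heads_fraction(0, 5))   # exactly 0: killed on heads, as intuition demands
```

The edge cases behave as the comment says: X=0 gives probability 0 for heads, and Y=0 gives probability 1.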

• Remember that this is about a coin flip that is in the past and known to be in the past. And that the future duplicates can remember everything their past potential-non-duplicate knew. So they might believe "now I'm not sure I'm not a duplicate, but it used to be the case that I thought that being a non-duplicate was more likely". So if that information was relevant, they can just put themselves in the shoes of their past selves.

• They can't put themselves in the shoes of their past selves, because in some sense they are not really sure whether they have past selves at all, rather than merely being duplicates of someone. Just because your brain is copied from someone else doesn't mean that you are in the same epistemological state as them. And the true descendants are also not in the same epistemological state, because they do not know whether they are copies or not.

• As reductios of anthropic views go, these are all pretty mild. Abandoning conservation of expected evidence isn't exactly an un-biteable bullet. And "violating causality" is particularly mild, especially for those of us who like non-causal decision theories. As a one-boxer I've been accused of believing in retrocausality dozens of times… sticks and stones, you know. This sort of "causality violation" seems similarly frivolous.

Oh, and the SSA reference class arbitrariness thing can be avoided by steelmanning SSA to make it more elegant—just get rid of the reference class idea and do it with centered worlds. SSA is what you get if you just do ordinary Bayesian conditionalization on centered worlds instead of on possible worlds. (Which is actually the more elegant and natural way of doing it, since possible worlds are a weird restriction on the sorts of sentences we use. Centered worlds, by contrast, are simply maximally consistent sets of sentences, full stop.)

As for changing the probability of past events… this isn't mysterious in principle. We change the probability of past events all the time. Probabilities are just our credences in things! More seriously though, let A be the hypothetical state of the past light-cone that would result in your choosing to stretch your arm ten minutes from now, and B be the hypothetical state of the past light-cone that would result in your choosing not to stretch your arm. A and B are past events, but you should be uncertain about which one obtained until about ten minutes from now, at which point (depending on what you choose!) the probability of A will increase or decrease.

There are strong reductios in the vicinity, though, if I recall correctly. (I did my MA on this stuff, but it was a while ago, so I'm a little rusty.)

FNC-type views have the result that (a) we almost instantly become convinced, no matter what we experience, that the universe is an infinite soup of random noise occasionally coalescing to form Boltzmann brains, because this is the simplest hypothesis that assigns probability 1 to the data; (b) we stay in this state forever and act accordingly—which means thinking happy thoughts, or something like that, whether we are average utilitarians or total utilitarians or egoists.

SIA-type views are, as far as I can tell, incoherent, in the following sense: the population size of universes grows much faster than their probability can shrink. So if you want to say that their probability is proportional to their population size… how? (Flag: I notice I am confused about this part.) A more down-to-earth way of putting this problem is that the hypothesis in which there is one universe is dominated by the hypothesis in which there are 3^^^^3 copies of that universe in parallel dimensions, which in turn is dominated by the hypothesis in which there are 4^^^^^4…

SSA-type views are the only game in town, as far as I'm concerned—except for the "let's abandon probability entirely and just do decision theory" idea you favor. I'm not sure what to make of it yet. Anyhow, the big problem I see for SSA-type views is the one you mention about using the ability to create tons of copies of yourself to influence the world. That seems weird all right. I'd like to avoid that consequence if possible. But it doesn't seem worse than weird to me yet. It doesn't seem… un-biteable.

EDIT: I should add that I think your conclusion is probably right—I think your move away from probability and towards decision theory seems very promising. As we went updateless in decision theory, so too should we go updateless in probability. Something like that (I have to think & read about it more). I'm just objecting to the strong wording in your arguments to get there. :)

• Is there a strong argument as to why we should care about the conservation of expected evidence? My belief at the moment is that something very close to SIA is true, and that conservation of expected evidence is a principle which simply doesn't hold in scenarios with multiple observers.

• If probability makes sense at all, then "I believe that the odds are 2:1, but I *know* that in a minute I'll believe that it's 1:1" destroys it as a coherent formalisation of beliefs. Should the 2:1 you force their future copy to stick with 2:1 rather than 1:1? If not, why do they think their own beliefs are right?

• Which interpretation of probability do you use? I go with standard subjective Bayesianism: probabilities are your credences are your degrees of belief.

So, there's nothing contradictory or incoherent about believing that you will believe something else in the future. Trivial case: someone will brainwash you in the future and you know this. Why do you think your own beliefs are right? First of all, why do I need to answer that question in order to coherently have those beliefs? Not every belief can be justified in that way. Secondly, if I follow SSA, here's my justification: "Well, here are my priors. Here is my evidence. I then conditionalized on the evidence, and this is what I got. That future version of me has the same priors but different evidence, so they got a different result." Why is that not justification enough?

Yes, it's weird when you are motivated to force your future copy to do things. Perhaps we should do for probability what we did for decision theory, and talk about agents that have the ability to irrevocably bind their future selves. (Isn't this basically what you think we should do?)

But it's not incoherent or senseless to think that yes, I have credence X now and in the future I will have credence Y. Just as it isn't incoherent or senseless to wish that your future self would refuse the blackmail even though your future self would actually decide to give in.

• > Yes, it's weird when you are motivated to force your future copy to do things

If you couple these probability theories with the right decision theories, this should never come up. FNC yields the correct answer if you use a decision theory that lets you decide for all your identical copies (but not the ones who have had different experiences), and SIA yields the correct answer if you assume that you can't affect the choices of the rest of your copies.

• Does MWI imply SIA?

Here's a model of Sleeping Beauty under MWI. The universe has two apartments with multiple rooms. Each apartment has a room containing a copy of you. You have 50:50 beliefs about which apartment you're in. One minute from now, a new copy of you will be created in the first apartment (in another identical room). At that moment, despite getting no new information, should you change your belief about which apartment you're in? Obviously yes.

So it seems like your "conservation of expected evidence" argument has a mistake somewhere, and SIA is actually fine.

• I'm just getting started with SIA, SSA, FNC and the like, so probably I'm missing some core understanding, but: a minute from now you do gain new information: one minute has passed.

• Was that unexpected?

• > should you change your belief about which apartment you're in? Obviously yes.

Do you want the majority of your copies to be correct as to what branch of the multiverse they are in? Go SIA. Do you want your copies to be correct in the majority of branches of the multiverse? SSA, then.

• This experiment doesn't have branches, though; it has apartments and rooms. You could care about being right in the majority of apartments, or you could care about being right in the majority of rooms, but these are arbitrary divisions of space and give conflicting answers. Or you could care about the majority of copies being right, which is objective and doesn't have a conflicting counterpart. You can reproduce it by creating three identical copies and then distributing them into rooms. So SIA has an objective basis and SSA doesn't.

The next question is whether apartments are a good analogy for MWI, but according to what we know so far, that seems likely. Especially if it turns out that quantum and cosmological multiverses are the same.

• My preferred way of doing anthropics while keeping probabilities around is to update your probabilities according to the chance that at least one of the decision-making agents that your decision is logically linked to exists, and then prioritise the worlds where there are more of those agents by acknowledging that you're making the decision for all of them. This yields the same (correct) conclusions as SIA when you're only making decisions for yourself, and FNC when you're making decisions for all of your identical copies, but it avoids the paradoxes brought up in this article, and it allows you to take into account that you're making decisions for all of your similar copies, which you want for Newcomb-like situations.

However, I think it's possible to construct even more contorted scenarios where conservation of expected evidence is violated for this as well. If there are 2 copies of you, a coin is flipped, and:

• If it's heads, the copies are presented with two different choices.

• If it's tails, the copies are presented with the same choice.

then you know that you will update towards heads when you're presented with a choice after a minute, since heads makes it twice as likely that anyone would be presented with that specific choice. I don't know if there's any way around this. Maybe if you update your probabilities according to the chance that someone following your decision theory is around, rather than someone making your exact choice, or something like that?
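One way to see the factor of two is a small simulation. The choice-generation model here (K equally likely possible choices, sampled without replacement on heads) is my own assumption; the comment doesn't specify one:

```python
import random

K = 10                    # assumed number of possible distinct choices
rng = random.Random(1)

def presented_choices():
    """Flip the coin and return (coin, set of choices shown to the two copies)."""
    if rng.random() < 0.5:
        return "heads", set(rng.sample(range(K), 2))  # two different choices
    return "tails", {rng.randrange(K)}                # same choice for both

target = 0                # the specific choice you happen to be presented with
counts = {"heads": 0, "tails": 0}
for _ in range(200_000):
    coin, choices = presented_choices()
    if target in choices:   # condition on "some copy gets the target choice"
        counts[coin] += 1

print(counts["heads"] / counts["tails"])   # close to 2: update towards heads
```

Conditioned on the target choice being presented to anyone, heads worlds show up about twice as often as tails worlds, which is the predictable update the comment worries about.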