L-zombies! (L-zombies?)

Reply to: Benja's 2010 Self-modification is the correct justification for updateless decision theory; Wei Dai's Late great filter is not bad news

“P-zombie” is short for “philosophical zombie”, but here I’m going to re-interpret it as standing for “physical philosophical zombie”, and contrast it to what I call an “l-zombie”, for “logical philosophical zombie”.

A p-zombie is an ordinary human body with an ordinary human brain that does all the usual things that human brains do, such as the things that cause us to move our mouths and say “I think, therefore I am”, but that isn’t conscious. (The usual consensus on LW is that p-zombies can’t exist, but some philosophers disagree.) The notion of p-zombie accepts that human behavior is produced by physical, computable processes, but imagines that these physical processes don’t produce conscious experience without some additional epiphenomenal factor.

An l-zombie is a human being that could have existed, but doesn’t: a Turing machine which, if anybody ever ran it, would compute that human’s thought processes (and its interactions with a simulated environment); that would, if anybody ever ran it, compute the human saying “I think, therefore I am”; but that never gets run, and therefore isn’t conscious. (If it’s conscious anyway, it’s not an l-zombie by this definition.) The notion of l-zombie accepts that human behavior is produced by computable processes, but supposes that these computational processes don’t produce conscious experience without being physically instantiated.

Actually, there probably aren’t any l-zombies: The way the evidence is pointing, it seems like we probably live in a spatially infinite universe where every physically possible human brain is instantiated somewhere, although some are instantiated less frequently than others; and even if that’s not true, there are the “bubble universes” arising from cosmological inflation, the branches of many-worlds quantum mechanics, and Tegmark’s “level IV” multiverse of all mathematical structures, all suggesting again that all possible human brains are in fact instantiated. But (a) I don’t think that even with all that evidence, we can be overwhelmingly certain that all brains are instantiated; and, actually more importantly, (b) I think that thinking about l-zombies can yield some useful insights into how to think about worlds where all humans exist, but some of them have more measure (“magical reality fluid”) than others.

So I ask: Suppose that we do indeed live in a world with l-zombies, where only some of all mathematically possible humans exist physically, and only those that do have conscious experiences. How should someone living in such a world reason about their experiences, and how should they make decisions, keeping in mind that if they were an l-zombie, they would still say “I have conscious experiences, so clearly I can’t be an l-zombie”?

If we can’t update on our experiences to conclude that someone having these experiences must exist in the physical world, then we must of course conclude that we are almost certainly l-zombies: after all, if the physical universe isn’t combinatorially large, the vast majority of mathematically possible conscious human experiences are not instantiated. You might argue that the universe you live in seems to run on relatively simple physical rules, so it should have high prior probability; but we haven’t really figured out the exact rules of our universe, and although what we understand seems compatible with the hypothesis that there are simple underlying rules, that’s not really proof that there are such underlying rules, if “the real universe has simple rules, but we are l-zombies living in some random simulation with a hodgepodge of rules (that isn’t actually run)” has the same prior probability. Worse, if you don’t have all that we do know about these rules loaded into your brain right now, you can’t really verify that they make sense, since there is some mathematically possible simulation whose initial state has you remember seeing evidence that such simple rules exist, even if they don’t. And much worse still, even if there are such simple rules, what evidence do you have that if these rules were actually executed, they would produce you? Only the fact that you, like, exist; but we’re asking what happens if we don’t let you update on that.
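To get a sense of how lopsided that combinatorial claim is, here is a back-of-envelope sketch with made-up numbers (the brain-state bit count is purely an assumed illustrative figure, not something argued for in this post):

```python
import math

# Back-of-envelope illustration with assumed numbers: the count of
# mathematically possible brain states dwarfs the number of physical
# systems a non-combinatorially-large universe could instantiate.

N_BITS = 10**15  # assumed (very rough) information content of one human brain state, in bits
log10_possible_brain_states = N_BITS * math.log10(2)  # log10 of 2**N_BITS
log10_atoms_in_observable_universe = 80  # standard rough estimate: ~10^80 atoms

print(f"possible brain states: about 10^{log10_possible_brain_states:.3g}")
print(f"atoms in the observable universe: about 10^{log10_atoms_in_observable_universe}")
```

Even if every one of those atoms somehow instantiated a brain, all but a vanishing fraction of the mathematically possible experiences would remain uninstantiated.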

I find myself quite unwilling to accept this conclusion that I shouldn’t update, in the world we’re talking about. I mean, I actually have conscious experiences. I, like, feel them and stuff! Yes, true, my slightly altered alter ego would reason the same way, and it would be wrong; but I’m right...

...and that actually seems to offer a way out of the conundrum: Suppose that I decide to update on my experience. Then so will my alter ego, the l-zombie. This leads to a lot of l-zombies concluding “I think, therefore I am”, and being wrong, and a lot of actual people concluding “I think, therefore I am”, and being right. All the thoughts that are actually consciously experienced are, in fact, correct. This doesn’t seem like such a terrible outcome. Therefore, I’m willing to provisionally endorse the reasoning “I think, therefore I am”, and to endorse updating on the fact that I have conscious experiences to draw inferences about physical reality (taking into account the simulation argument, of course, and conditioning on living in a small universe, which is all I’m discussing in this post).

NB. There’s still something quite uncomfortable about the idea that all of my behavior, including the fact that I say “I think, therefore I am”, is explained by the mathematical process, but actually being conscious requires some extra magical reality fluid. So I still feel confused, and using the word l-zombie in analogy to p-zombie is a way of highlighting that. But this line of reasoning still feels like progress. FWIW.

But if that’s how we justify believing that we physically exist, that has some implications for how we should decide what to do. The argument is that nothing very bad happens if the l-zombies wrongly conclude that they actually exist. Mostly, that also seems to be true if they act on that belief: mostly, what l-zombies do doesn’t seem to influence what happens in the real world, so if only things that actually happen are morally important, it doesn’t seem to matter what the l-zombies decide to do. But there are exceptions.

Consider the counterfactual mugging: Accurate and trustworthy Omega appears to you and explains that it has just thrown a very biased coin that had only a 1/1000 chance of landing heads. As it turns out, this coin has in fact landed heads, and now Omega is offering you a choice: It can either (A) create a Friendly AI or (B) destroy humanity. Which would you like? There is a catch, though: Before it threw the coin, Omega made a prediction about what you would do if the coin fell heads (and it was able to make a confident prediction about what you would choose). If the coin had fallen tails, it would have created an FAI if it had predicted that you’d choose (B), and it would have destroyed humanity if it had predicted that you would choose (A). (If it hadn’t been able to make a confident prediction about what you would choose, it would just have destroyed humanity outright.)

There is a clear argument that, if you expect to find yourself in a situation like this in the future, you would want to self-modify into somebody who would choose (B), since this gives humanity a much larger chance of survival. Thus, a decision theory stable under self-modification would answer (B). But if you update on the fact that you consciously experience Omega telling you that the coin landed heads, (A) would seem to be the better choice!
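To make the tension concrete, here is a minimal sketch of the two expected-utility calculations, assuming illustrative utilities that are not part of the original scenario (survival with a Friendly AI = 1, destruction of humanity = 0):

```python
# Counterfactual mugging from above: P(heads) = 1/1000; on heads you are
# asked to choose, on tails Omega acts on its prediction of your choice.
# Utilities below are assumed purely for illustration.

P_HEADS = 1 / 1000
U_FAI, U_DOOM = 1.0, 0.0

def updateless_value(policy):
    """Expected utility of committing to `policy` before the coin is thrown,
    i.e. without updating on being told that the coin landed heads."""
    if policy == "A":
        # Heads: you are asked and choose the FAI.
        # Tails: Omega predicted (A) and destroys humanity.
        return P_HEADS * U_FAI + (1 - P_HEADS) * U_DOOM
    else:  # policy == "B"
        # Heads: you are asked and choose destruction.
        # Tails: Omega predicted (B) and creates the FAI.
        return P_HEADS * U_DOOM + (1 - P_HEADS) * U_FAI

def updated_value(policy):
    """Expected utility after conditioning on the coin having landed heads."""
    return U_FAI if policy == "A" else U_DOOM

for policy in ("A", "B"):
    print(policy, updateless_value(policy), updated_value(policy))
# Updateless reasoning favours (B): 0.999 vs. 0.001.
# Reasoning updated on "the coin landed heads" favours (A): 1.0 vs. 0.0.
```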

One way of looking at this is that if the coin falls tails, the l-zombie that is told the coin landed heads still exists mathematically, and this l-zombie now has the power to influence what happens in the real world. If the argument for updating was that nothing bad happens even though the l-zombies get it wrong, well, that argument breaks here. The mathematical process that is your mind doesn’t have any evidence about whether the coin landed heads or tails, because as a mathematical object it exists in both possible worlds, and it has to make a decision in both worlds, and that decision affects humanity’s future in both worlds.

Back in 2010, I wrote a post arguing that yes, you would want to self-modify into something that would choose (B), but that that was the only reason why you’d want to choose (B). Here’s a variation on the above scenario that illustrates the point I was trying to make back then: Suppose that Omega tells you that it actually threw its coin a million years ago, and if it had fallen tails, it would have turned Alpha Centauri purple. Now throughout your history, the argument goes, you would never have had any motive to self-modify into something that chooses (B) in this particular scenario, because you’ve always known that Alpha Centauri isn’t, in fact, purple.

But this argument assumes that you know you’re not an l-zombie; if the coin had in fact fallen tails, you wouldn’t exist as a conscious being, but you’d still exist as a mathematical decision-making process, and that process would be able to influence the real world, so you-the-decision-process can’t reason that “I think, therefore I am, therefore the coin must have fallen heads, therefore I should choose (A).” Partly because of this, I now accept choosing (B) as the (most likely to be) correct choice even in that case. (The rest of my change in opinion has to do with all ways of making my earlier intuition formal getting into trouble in decision problems where you can influence whether you’re brought into existence, but that’s a topic for another post.)

However, should you feel cheerful while you’re announcing your choice of (B), since with high (prior) probability, you’ve just saved humanity? That would lead to an actual conscious being feeling cheerful if the coin has landed heads and humanity is going to be destroyed, and an l-zombie computing, but not actually experiencing, cheerfulness if the coin has landed tails and humanity is going to be saved. Nothing good comes out of feeling cheerful, not even alignment of a conscious being’s map with the physical territory. So I think the correct thing is to choose (B), and to be deeply sad about it.

You may be asking why I should care what the right probabilities to assign or the right feelings to have are, since these don’t seem to play any role in making decisions; sometimes you make your decisions as if updating on your conscious experience, but sometimes you don’t, and you always get the right answer if you don’t update in the first place. Indeed, I expect that the “correct” design for an AI is to fundamentally use (more precisely: approximate) updateless decision theory (though I also expect that probabilities updated on the AI’s sensory input will be useful for many intermediate computations), and “I compute, therefore I am”-style reasoning will play no fundamental role in the AI. And I think the same is true for humans’ decisions: the correct way to act is given by updateless reasoning. But as a human, I find myself unsatisfied by not being able to have a picture of what the physical world probably looks like. I may not need one to figure out how I should act; I still want one, not for instrumental reasons, but because I want one. In a small universe where most mathematically possible humans are l-zombies, the argument in this post seems to give me a justification to say “I think, therefore I am, therefore I probably either live in a simulation or what I’ve learned about the laws of physics describes how the real world works (even though there are many l-zombies who are thinking similar thoughts but are wrong about them).”

And because of this, even though I disagree with my 2010 post, I also still disagree with Wei Dai’s 2010 post arguing that a late Great Filter is good news, which my own 2010 post was trying to argue against. Wei argued that if Omega gave you a choice between (A) destroying the world now and (B) having Omega destroy the world a million years ago (so that you are never instantiated as a conscious being, though your choice as an l-zombie still influences the real world), then you would choose (A), to give humanity at least the time it’s had so far. Wei concluded that this means that if you learned that the Great Filter is in our future, rather than our past, that must be good news, since if you could choose where to place the filter, you should place it in the future. I now agree with Wei that (A) is the right choice, but I don’t think that you should be happy about it. And similarly, I don’t think you should be happy about news that tells you that the Great Filter is later than you might have expected.