The Hidden Complexity of Wishes

“I wish to live in the lo­ca­tions of my choice, in a phys­i­cally healthy, un­in­jured, and ap­par­ently nor­mal ver­sion of my cur­rent body con­tain­ing my cur­rent men­tal state, a body which will heal from all in­juries at a rate three sig­mas faster than the av­er­age given the med­i­cal tech­nol­ogy available to me, and which will be pro­tected from any dis­eases, in­juries or ill­nesses caus­ing dis­abil­ity, pain, or de­graded func­tion­al­ity or any sense, or­gan, or bod­ily func­tion for more than ten days con­sec­u­tively or fif­teen days in any year...”
-- The Open-Source Wish Pro­ject, Wish For Im­mor­tal­ity 1.1

There are three kinds of ge­nies: Ge­nies to whom you can safely say “I wish for you to do what I should wish for”; ge­nies for which no wish is safe; and ge­nies that aren’t very pow­er­ful or in­tel­li­gent.

Sup­pose your aged mother is trapped in a burn­ing build­ing, and it so hap­pens that you’re in a wheelchair; you can’t rush in your­self. You could cry, “Get my mother out of that build­ing!” but there would be no one to hear.

Luck­ily you have, in your pocket, an Out­come Pump. This handy de­vice squeezes the flow of time, pour­ing prob­a­bil­ity into some out­comes, drain­ing it from oth­ers.

The Out­come Pump is not sen­tient. It con­tains a tiny time ma­chine, which re­sets time un­less a speci­fied out­come oc­curs. For ex­am­ple, if you hooked up the Out­come Pump’s sen­sors to a coin, and speci­fied that the time ma­chine should keep re­set­ting un­til it sees the coin come up heads, and then you ac­tu­ally flipped the coin, you would see the coin come up heads. (The physi­cists say that any fu­ture in which a “re­set” oc­curs is in­con­sis­tent, and there­fore never hap­pens in the first place—so you aren’t ac­tu­ally kil­ling any ver­sions of your­self.)

What­ever propo­si­tion you can man­age to in­put into the Out­come Pump, some­how hap­pens, though not in a way that vi­o­lates the laws of physics. If you try to in­put a propo­si­tion that’s too un­likely, the time ma­chine will suffer a spon­ta­neous me­chan­i­cal failure be­fore that out­come ever oc­curs.

You can also redi­rect prob­a­bil­ity flow in more quan­ti­ta­tive ways us­ing the “fu­ture func­tion” to scale the tem­po­ral re­set prob­a­bil­ity for differ­ent out­comes. If the tem­po­ral re­set prob­a­bil­ity is 99% when the coin comes up heads, and 1% when the coin comes up tails, the odds will go from 1:1 to 99:1 in fa­vor of tails. If you had a mys­te­ri­ous ma­chine that spit out money, and you wanted to max­i­mize the amount of money spit out, you would use re­set prob­a­bil­ities that diminished as the amount of money in­creased. For ex­am­ple, spit­ting out $10 might have a 99.999999% re­set prob­a­bil­ity, and spit­ting out $100 might have a 99.99999% re­set prob­a­bil­ity. This way you can get an out­come that tends to be as high as pos­si­ble in the fu­ture func­tion, even when you don’t know the best at­tain­able max­i­mum.
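The arithmetic in the paragraph above can be checked with a small calculation. This is only a sketch of the thought experiment: it assumes the pump's effect is equivalent to conditioning on "no reset happened", and the function name `pumped_distribution` and the equal prior on the two money amounts are invented for illustration.

```python
from fractions import Fraction

def pumped_distribution(prior, reset_prob):
    """Return outcome probabilities conditioned on no reset occurring.

    prior:      dict mapping outcome -> probability before the pump acts
    reset_prob: dict mapping outcome -> probability the pump resets time
    """
    # Each outcome keeps its prior weight times its chance of NOT being reset.
    weights = {o: Fraction(p) * (1 - Fraction(reset_prob[o])) for o, p in prior.items()}
    total = sum(weights.values())
    if total == 0:
        raise ValueError("every outcome is always reset; no consistent future remains")
    return {o: w / total for o, w in weights.items()}

# Coin: 99% reset on heads, 1% reset on tails.
coin = pumped_distribution(
    prior={"heads": Fraction(1, 2), "tails": Fraction(1, 2)},
    reset_prob={"heads": Fraction(99, 100), "tails": Fraction(1, 100)},
)
print(coin["tails"] / coin["heads"])  # 99, i.e. odds of 99:1 in favor of tails

# Money machine: equal priors are an assumption made only for this example.
money = pumped_distribution(
    prior={10: Fraction(1, 2), 100: Fraction(1, 2)},
    reset_prob={10: 1 - Fraction(1, 10**8), 100: 1 - Fraction(1, 10**7)},
)
print(money[100] / money[10])  # 10, the larger payout is ten times as likely
```

Exact fractions are used so that reset probabilities this close to 1 do not get lost in floating-point rounding.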

So you desperately yank the Outcome Pump from your pocket (your mother is still trapped in the burning building, remember?) and try to describe your goal: get your mother out of the building!

The user interface doesn’t take English inputs. The Outcome Pump isn’t sentient, remember? But it does have 3D scanners for the near vicinity, and built-in utilities for pattern matching. So you hold up a photo of your mother’s head and shoulders; match on the photo; use object contiguity to select your mother’s whole body (not just her head and shoulders); and define the future function using your mother’s distance from the building’s center. The further she gets from the building’s center, the lower the time machine’s reset probability.

You cry “Get my mother out of the build­ing!”, for luck, and press En­ter.

For a moment it seems like nothing happens. You look around, waiting for the fire truck to pull up, and rescuers to arrive, or even just a strong, fast runner to haul your mother out of the building...

BOOM! With a thun­der­ing roar, the gas main un­der the build­ing ex­plodes. As the struc­ture comes apart, in what seems like slow mo­tion, you glimpse your mother’s shat­tered body be­ing hurled high into the air, trav­el­ing fast, rapidly in­creas­ing its dis­tance from the former cen­ter of the build­ing.

On the side of the Outcome Pump is an Emergency Regret Button. All future functions are automatically defined with a huge negative value for the Regret Button being pressed (a temporal reset probability of nearly 1), so that the Outcome Pump is extremely unlikely to do anything which upsets the user enough to make them press the Regret Button. You can’t ever remember pressing it. But you’ve barely started to reach for the Regret Button (and what good will it do now?) when a flaming wooden beam drops out of the sky and smashes you flat.

Which wasn’t re­ally what you wanted, but scores very high in the defined fu­ture func­tion...
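A toy version of the wish makes the mismatch concrete. This is a sketch, not a model of the Outcome Pump itself: the `Outcome` class, the three candidate outcomes, and every distance are invented for illustration. The one thing it is meant to show is that the future function only ever sees the distance.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Outcome:
    description: str
    distance_m: float    # mother's distance from the building's center (made-up numbers)
    mother_alive: bool   # something you care about that the wish never mentions

def future_function(outcome: Outcome) -> float:
    """The wish as actually entered: farther from the center scores higher."""
    return outcome.distance_m

outcomes = [
    Outcome("she stays inside", 0.0, False),
    Outcome("firefighter escorts her out", 30.0, True),
    Outcome("gas main explodes", 120.0, False),
]

# The pump favors whatever scores highest; mother_alive plays no part in the score.
best = max(outcomes, key=future_function)
print(best.description)  # gas main explodes
```

Nothing in `future_function` reads `mother_alive`, so no amount of optimization pressure on the score can be expected to protect it.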

The Out­come Pump is a ge­nie of the sec­ond class. No wish is safe.

If some­one asked you to get their poor aged mother out of a burn­ing build­ing, you might help, or you might pre­tend not to hear. But it wouldn’t even oc­cur to you to ex­plode the build­ing. “Get my mother out of the build­ing” sounds like a much safer wish than it re­ally is, be­cause you don’t even con­sider the plans that you as­sign ex­treme nega­tive val­ues.

Consider again the Tragedy of Group Selectionism: Some early biologists asserted that group selection for low subpopulation sizes would produce individual restraint in breeding; and yet actually enforcing group selection in the laboratory produced cannibalism, especially of immature females. It’s obvious in hindsight that, given strong selection for small subpopulation sizes, cannibals will outreproduce individuals who voluntarily forgo reproductive opportunities. But eating little girls is such an un-aesthetic solution that Wynne-Edwards, Allee, Brereton, and the other group-selectionists simply didn’t think of it. They only saw the solutions they would have used themselves.

Suppose you try to patch the future function by specifying that the Outcome Pump should not explode the building: outcomes in which the building materials are distributed over too much volume will have temporal reset probabilities of ~1.

So your mother falls out of a sec­ond-story win­dow and breaks her neck. The Out­come Pump took a differ­ent path through time that still ended up with your mother out­side the build­ing, and it still wasn’t what you wanted, and it still wasn’t a solu­tion that would oc­cur to a hu­man res­cuer.
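The patched wish can be sketched the same way. Everything here is hypothetical: the debris-volume threshold, the candidate outcomes, and the distances are invented, and the patch is modeled as simply giving exploded-building outcomes the lowest possible score.

```python
def patched_future_function(outcome: dict) -> float:
    """Distance-based score with one patch: a scattered building scores as low as possible."""
    if outcome["debris_volume_m3"] > 10_000:   # hypothetical threshold for "exploded"
        return float("-inf")
    return outcome["distance_m"]

candidates = [
    {"name": "gas main explodes", "distance_m": 120.0, "debris_volume_m3": 50_000, "alive": False},
    {"name": "falls from second-story window", "distance_m": 35.0, "debris_volume_m3": 2_000, "alive": False},
    {"name": "firefighter escorts her out", "distance_m": 30.0, "debris_volume_m3": 2_000, "alive": True},
]

chosen = max(candidates, key=patched_future_function)
print(chosen["name"])  # falls from second-story window
```

The patch removes one bad outcome, but the score still says nothing about whether she survives, so the next-highest outcome can be just as unwanted.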

If only the Open-Source Wish Pro­ject had de­vel­oped a Wish To Get Your Mother Out Of A Burn­ing Build­ing:

“I wish to move my mother (defined as the woman who shares half my genes and gave birth to me) to out­side the bound­aries of the build­ing cur­rently clos­est to me which is on fire; but not by ex­plod­ing the build­ing; nor by caus­ing the walls to crum­ble so that the build­ing no longer has bound­aries; nor by wait­ing un­til af­ter the build­ing finishes burn­ing down for a res­cue worker to take out the body...”

All these special cases, the seemingly unlimited number of required patches, should remind you of the parable of Artificial Addition: programming an Arithmetic Expert System by explicitly adding ever more assertions like “fifteen plus fifteen equals thirty, but fifteen plus sixteen equals thirty-one instead”.

How do you ex­clude the out­come where the build­ing ex­plodes and flings your mother into the sky? You look ahead, and you fore­see that your mother would end up dead, and you don’t want that con­se­quence, so you try to for­bid the event lead­ing up to it.

Your brain isn’t hard­wired with a spe­cific, pre­re­corded state­ment that “Blow­ing up a burn­ing build­ing con­tain­ing my mother is a bad idea.” And yet you’re try­ing to pre­re­cord that ex­act spe­cific state­ment in the Out­come Pump’s fu­ture func­tion. So the wish is ex­plod­ing, turn­ing into a gi­ant lookup table that records your judg­ment of ev­ery pos­si­ble path through time.

You failed to ask for what you re­ally wanted. You wanted your mother to go on liv­ing, but you wished for her to be­come more dis­tant from the cen­ter of the build­ing.

Ex­cept that’s not all you wanted. If your mother was res­cued from the build­ing but was hor­ribly burned, that out­come would rank lower in your prefer­ence or­der­ing than an out­come where she was res­cued safe and sound. So you not only value your mother’s life, but also her health.

And you value not just her bod­ily health, but her state of mind. Be­ing res­cued in a fash­ion that trau­ma­tizes her—for ex­am­ple, a gi­ant pur­ple mon­ster roar­ing up out of nowhere and seiz­ing her—is in­fe­rior to a fire­man show­ing up and es­cort­ing her out through a non-burn­ing route. (Yes, we’re sup­posed to stick with physics, but maybe a pow­er­ful enough Out­come Pump has aliens co­in­ci­den­tally show­ing up in the neigh­bor­hood at ex­actly that mo­ment.) You would cer­tainly pre­fer her be­ing res­cued by the mon­ster to her be­ing roasted al­ive, how­ever.

How about a worm­hole spon­ta­neously open­ing and swal­low­ing her to a desert is­land? Bet­ter than her be­ing dead; but worse than her be­ing al­ive, well, healthy, un­trau­ma­tized, and in con­tinual con­tact with you and the other mem­bers of her so­cial net­work.

Would it be okay to save your mother’s life at the cost of the fam­ily dog’s life, if it ran to alert a fire­man but then got run over by a car? Clearly yes, but it would be bet­ter ce­teris paribus to avoid kil­ling the dog. You wouldn’t want to swap a hu­man life for hers, but what about the life of a con­victed mur­derer? Does it mat­ter if the mur­derer dies try­ing to save her, from the good­ness of his heart? How about two mur­der­ers? If the cost of your mother’s life was the de­struc­tion of ev­ery ex­tant copy, in­clud­ing the mem­o­ries, of Bach’s Lit­tle Fugue in G Minor, would that be worth it? How about if she had a ter­mi­nal ill­ness and would die any­way in eigh­teen months?

If your mother’s foot is crushed by a burning beam, is it worthwhile to extract the rest of her? What if her head is crushed, leaving her body? What if her body is crushed, leaving only her head? What if there’s a cryonics team waiting outside, ready to suspend the head? Is a frozen head a person? Is Terri Schiavo a person? How much is a chimpanzee worth?

Your brain is not infinitely complicated; there is only a finite Kolmogorov complexity / message length which suffices to describe all the judgments you would make. But just because this complexity is finite does not make it small. We value many things, and no, they are not reducible to valuing happiness or valuing reproductive fitness.

There is no safe wish smaller than an en­tire hu­man moral­ity. There are too many pos­si­ble paths through Time. You can’t vi­su­al­ize all the roads that lead to the des­ti­na­tion you give the ge­nie. “Max­i­miz­ing the dis­tance be­tween your mother and the cen­ter of the build­ing” can be done even more effec­tively by deto­nat­ing a nu­clear weapon. Or, at higher lev­els of ge­nie power, fling­ing her body out of the So­lar Sys­tem. Or, at higher lev­els of ge­nie in­tel­li­gence, do­ing some­thing that nei­ther you nor I would think of, just like a chim­panzee wouldn’t think of deto­nat­ing a nu­clear weapon. You can’t vi­su­al­ize all the paths through time, any more than you can pro­gram a chess-play­ing ma­chine by hard­cod­ing a move for ev­ery pos­si­ble board po­si­tion.

And real life is far more com­pli­cated than chess. You can­not pre­dict, in ad­vance, which of your val­ues will be needed to judge the path through time that the ge­nie takes. Espe­cially if you wish for some­thing longer-term or wider-range than res­cu­ing your mother from a burn­ing build­ing.

I fear the Open-Source Wish Pro­ject is fu­tile, ex­cept as an illus­tra­tion of how not to think about ge­nie prob­lems. The only safe ge­nie is a ge­nie that shares all your judg­ment crite­ria, and at that point, you can just say “I wish for you to do what I should wish for.” Which sim­ply runs the ge­nie’s should func­tion.

In­deed, it shouldn’t be nec­es­sary to say any­thing. To be a safe fulfiller of a wish, a ge­nie must share the same val­ues that led you to make the wish. Other­wise the ge­nie may not choose a path through time which leads to the des­ti­na­tion you had in mind, or it may fail to ex­clude hor­rible side effects that would lead you to not even con­sider a plan in the first place. Wishes are leaky gen­er­al­iza­tions, de­rived from the huge but finite struc­ture that is your en­tire moral­ity; only by in­clud­ing this en­tire struc­ture can you plug all the leaks.

With a safe ge­nie, wish­ing is su­perflu­ous. Just run the ge­nie.