The Empty White Room: Surreal Utilities

This ar­ti­cle was com­posed af­ter read­ing Tor­ture vs. Dust Specks and Cir­cu­lar Altru­ism, at which point I no­ticed that I was con­fused.

Both posts deal with ver­sions of the sa­cred-val­ues effect, where one value is con­sid­ered “sa­cred” and can­not be traded for a “sec­u­lar” value, no mat­ter the ra­tio. In effect, the sa­cred value has in­finite util­ity rel­a­tive to the sec­u­lar value.

This is, of course, silly. We live in a scarce world with scarce re­sources; gen­er­ally, a sec­u­lar utilon can be used to pur­chase sa­cred ones—giv­ing money to char­ity to save lives, send­ing cheap lap­tops to poor re­gions to im­prove their stan­dard of ed­u­ca­tion.

Which im­plies that the en­tire idea of “tiers” of value is silly, right?

Well… no.

One of the reasons we are not still watching the Sun revolve around us, while we breathe a continuous medium of elemental Air and phlogiston flows out of our wall-torches, is our ability to simplify problems. There’s an infamous joke about the physicist who, asked to measure the volume of a cow, begins “Assume the cow is a sphere...”, but this sort of simplification, willfully ignoring complexities and invoking the airless, frictionless plane, can give us crucial insights.

Consider, then, this gedankenexperiment. If there’s a flaw in my conclusion, please explain; I’m aware I appear to be opposing the consensus.

The Weight of a Life: Or, Seat Cushions

This en­tire uni­verse con­sists of an empty white room, the size of a large sta­dium. In it are you, Frank, and oc­ca­sion­ally an om­nipo­tent AI we’ll call Omega. (As­sume, if you wish, that Omega is run­ning this room in simu­la­tion; it’s not cur­rently rele­vant.) Frank is ir­rele­vant, ex­cept for the fact that he is known to ex­ist.

Now, look­ing at our util­ity func­tion here...

Well, clearly, the old standby of us­ing money to mea­sure util­ity isn’t go­ing to work; with­out a trad­ing part­ner money’s just fancy pa­per (or metal, or plas­tic, or what­ever.)

But let’s say that the floor of this room is made of cold, hard, and de­cid­edly un­com­fortable Unob­tainium. And while the room’s lit with a source­less white glow, you’d re­ally pre­fer to have your own light­ing. Per­haps you’re an art afi­cionado, and so you might value Omega bring­ing in the Mona Lisa.

And then, of course, there’s Frank’s ex­is­tence. That’ll do for now.

Now, Omega ap­pears be­fore you, and offers you a deal.

It will give you a nanofab: a personal fabricator capable of creating anything you can imagine from scrap matter, with a built-in database of stored shapes. It will also give you feedstock, as much of it as you ask for. Since Omega is omnipotent, the nanofab will always complete instantly, even if you ask it to build an entire new universe or something, and it’s bigger on the inside, so it can hold anything you choose to make.

There are two catches:

First: the nanofab comes loaded with a UFAI, which I’ve named Unseelie.[1]

Wait, come back! It’s not that kind of UFAI! Really, it’s actually rather friendly!

… to Omega.

Unseelie’s job is to ar­tifi­cially en­sure that the fabri­ca­tor can­not be used to make a mind; at­tempts at mak­ing any sort of in­tel­li­gence, whether di­rectly, by mak­ing a planet and let­ting life evolve, or any­thing else a hu­man mind can come up with, will fail. It will not do so by di­rectly harm­ing you, nor will it change you in or­der to pre­vent you from try­ing; it only stops your at­tempts.

Se­cond: you buy the nanofab with Frank’s life.

At which point you send Omega away with a “What? No!,” I sincerely hope.

Ah, but look at what you just did. Omega can provide as much feed­stock as you ask for. So you just turned down or­nate seat cush­ions. And leg­endary carved cow-bone chan­de­liers. And copies of ev­ery paint­ing ever painted by any artist in any uni­verse, which is ac­tu­ally quite a bit less than any­thing I could write with up-ar­row no­ta­tion but any­way!

I sincerely hope you would still turn Omega away—liter­ally, ab­solutely re­gard­less of how many seat cush­ions it offered you.

This is also why the nanofab can­not cre­ate a mind: You do not know how to up­load Frank (and if you do, go out and pub­lish already!); nor can you make your­self an FAI to figure it out for you; nor, if you be­lieve that some num­ber of cre­ated lives are equal to a life saved, can you com­pen­sate in that re­gard. This is an ab­solute trade be­tween sec­u­lar and sa­cred val­ues.

In a white room, to an al­tru­is­tic hu­man, a hu­man life is sim­ply on a sec­ond tier.

So now we move to the next half of the gedanken­ex­per­i­ment.

Seelie the FAI: Or, How to Breathe While Embed­ded in Seat Cushions

Omega now brings in Seelie[1], MIRI’s latest attempt at FAI, and makes it the same offer on your behalf. Seelie, being a late beta release by a MIRI that has apparently managed to release FAI multiple times without tiling the Solar System with paperclips, competently analyzes your utility system, reduces it until it understands you several orders of magnitude better than you do yourself, turns to Omega, and accepts the deal.

Wait, what?

On any sin­gle tier, the util­ity of the nanofab is in­finite. In fact, let’s make that ex­plicit, though it was already im­plic­itly ob­vi­ous: if you just ask Omega for an in­finite sup­ply of feed­stock, it will hap­pily pro­duce it for you. No mat­ter how high a num­ber Seelie as­signs the value of Frank’s life to you, the nanofab can out-bid it, swamp­ing Frank’s util­ity with myr­iad com­forts and nov­el­ties.

And so the re­sult of a sin­gle-tier util­ity sys­tem is that Frank is va­por­ized by Omega and you are drowned in how­ever many seat cush­ions Seelie thought Frank’s life was worth to you, at which point you send Seelie back to MIRI and de­mand a re­fund.
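The swamping argument can be sketched in a few lines. This is only an illustration under assumptions of my own: utilities are plain integers, every cushion is worth the same fixed amount, and the function names are hypothetical.

```python
def cushions_needed(frank_value: int, utility_per_cushion: int = 1) -> int:
    """Smallest cushion count whose total utility exceeds frank_value.

    Assumes a single-tier, linear utility: every cushion is worth the same
    positive integer amount, and Frank's life is some finite integer.
    """
    return frank_value // utility_per_cushion + 1


def accepts_deal(frank_value: int, cushions: int, utility_per_cushion: int = 1) -> bool:
    """A single-tier maximizer accepts once the cushions outweigh Frank."""
    return cushions * utility_per_cushion > frank_value


# However large a finite value is assigned to Frank, some bid beats it.
for value in (10, 10**6, 10**100):
    bid = cushions_needed(value)
    assert accepts_deal(value, bid)
```

The point is only that `cushions_needed` always returns a finite number; raising `frank_value` raises the price without ever making the trade unavailable.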

Tiered Values

At this point, I hope it’s clear that mul­ti­ple tiers are re­quired to em­u­late a hu­man’s util­ity sys­tem. (If it’s not, or if there’s a flaw in my ar­gu­ment, please point it out.)

There’s an ob­vi­ous way to solve this prob­lem, and there’s a way that ac­tu­ally works.

The first solves the ob­vi­ous flaw: af­ter you’ve tiled the floor in seat cush­ions, there’s re­ally not a lot of ex­tra value in get­ting some ridicu­lous Knuthian num­ber more. Similarly, even the great­est da Vinci fan will get tired af­ter his three trillionth var­i­ant on the Mona Lisa’s smile.

So, es­tab­lish the sec­ond tier by play­ing with a real-val­ued util­ity func­tion. En­sure that no sum­ma­tion of sec­u­lar util­ities can ever add up to a hu­man life—or what­ever else you’d place on that sec­ond tier.
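A rough sketch of this first option, assuming (my choice, not anything specified above) that secular utility saturates exponentially toward a cap placed below the value of a life. The constants `SECULAR_CAP`, `LIFE_VALUE`, and `SCALE` are hypothetical.

```python
import math

SECULAR_CAP = 100.0   # hypothetical ceiling on total secular utility
LIFE_VALUE = 1000.0   # hypothetical second-tier value, strictly above the cap
SCALE = 50.0          # hypothetical quantity at which returns start flattening


def secular_utility(quantity):
    """Total utility of `quantity` units of a secular good.

    Grows with quantity but never exceeds SECULAR_CAP, so no amount of the
    good can add up to LIFE_VALUE.
    """
    return SECULAR_CAP * (1.0 - math.exp(-quantity / SCALE))


# Even a googol seat cushions stay at or below the cap, and below a life.
assert secular_utility(10**100) <= SECULAR_CAP < LIFE_VALUE
```

Any bounded, increasing curve would do here; the exponential is just a convenient shape.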

But the prob­lem here is, we’re as­sum­ing that all sec­u­lar val­ues con­verge in that way. Con­sider nov­elty: per­haps, while other val­ues out-com­pete it for small val­ues, its value to you di­verges with quan­tity; an in­finite amount of it, an eter­nity of non-bore­dom, would be worth more to you than any other sec­u­lar good. But even so, you wouldn’t trade it for Frank’s life. A two-tiered real AI won’t be­have this way; it’ll as­sign “in­finite nov­elty” an in­finite util­ity, which beats out its large-but-finite value for Frank’s life.

Now, you could add a third (or 1.5) tier, but now we’re just adding epicy­cles. Be­sides, since you’re ac­tu­ally deal­ing with real num­bers here, if you’re not care­ful you’ll put one of your new tiers in an area reach­able by the tiers be­fore it, or else in an area that reaches the tiers af­ter it.

On top of that, we have the old problem of secular and sacred values. Sometimes a secular value can be traded for a sacred value, and therefore has a second-tier utility. But as just discussed, that doesn’t mean we’d trade the one for the other in a white room. So for each secular good, we need to independently keep track of its intrinsic first-tier utility and its situational second-tier utility.

So in or­der to elimi­nate epicy­cles, and re­tain gen­er­al­ity and sim­plic­ity, we’re look­ing for a sys­tem that has an un­limited num­ber of eas­ily-com­putable “tiers” and can also nat­u­rally deal with util­ities that span mul­ti­ple tiers. Which sounds to me like an ex­cel­lent ar­gu­ment for...

Sur­real Utilities

Surreal numbers have two advantages over our first option. First, surreal numbers are dense in tiers, so not only do we have an unlimited number of tiers, we can always create a new tier between any other two on the fly if we need one. Second, since the surreals are closed under addition, we can just sum up our tiers to get a single surreal utility.
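Both properties can be illustrated with a toy model. This is a sketch of a small fragment of the surreals only: finite sums of terms c·ω^t with rational c and t. The class name `TieredUtility` and its layout are my own assumptions, not a full surreal-number implementation.

```python
from fractions import Fraction


class TieredUtility:
    """Finite sum of coeff * omega**tier (rational tiers and coefficients)."""

    def __init__(self, terms=None):
        # Map tier (exponent of omega) -> coefficient, dropping zero terms.
        self.terms = {Fraction(t): Fraction(c)
                      for t, c in (terms or {}).items() if c != 0}

    def __add__(self, other):
        # Closure under addition: add coefficients tier by tier.
        total = dict(self.terms)
        for tier, coeff in other.terms.items():
            total[tier] = total.get(tier, 0) + coeff
        return TieredUtility(total)

    def _sign_of_difference(self, other):
        # Walk from the highest tier down; the first difference decides.
        for tier in sorted(set(self.terms) | set(other.terms), reverse=True):
            diff = self.terms.get(tier, 0) - other.terms.get(tier, 0)
            if diff != 0:
                return 1 if diff > 0 else -1
        return 0

    def __lt__(self, other):
        return self._sign_of_difference(other) < 0

    def __gt__(self, other):
        return self._sign_of_difference(other) > 0

    def __eq__(self, other):
        return self._sign_of_difference(other) == 0


life = TieredUtility({1: 1})                    # one omega
cushions = TieredUtility({0: 10**100})          # a googol first-tier utilons
in_between = TieredUtility({Fraction(1, 2): 1}) # a new tier created on the fly

assert cushions < in_between < life
assert life + cushions > life
```

Because tiers are rationals, a fresh tier can always be placed between two existing ones by averaging their exponents.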

So let’s return to our white room. Seelie 2.0 is harder to fool than Seelie; any finite number of seat cushions is still worth less than the $\omega$-utility of Frank’s life. Even when Omega offers an unlimited store of feedstock, Seelie can’t ask for an infinite number of seat cushions, so the total utility of the nanofab remains bounded at the first tier.

Then Omega offers Fun. Sim­ply, an Omega-guaran­tee of an eter­nity of Fun-The­o­retic-Ap­proved Fun.

This offer re­ally is in­finite. As­sum­ing you’re an al­tru­ist, your hap­piness pre­sum­ably has a finite, first-tier util­ity, but it’s be­ing mul­ti­plied by in­finity. So in­finite Fun gets bumped up a tier.

At this point, what­ever al­gorithm is set­ting val­ues for util­ities in the first place needs to no­tice a tier col­li­sion. Some­thing has passed be­tween tiers, and util­ity tiers there­fore need to be re­freshed.

Seelie 2.0 double checks with its mental copy of your values, finds that you would rather have Frank’s life than infinite Fun, and assigns it a tier somewhere in between: above every finite utility, but below the $\omega$ tier of Frank’s life. And having done so, it correctly refuses Omega’s offer.

So that’s that prob­lem solved, at least. There­fore, let’s step back into a sem­blance of the real world, and throw a spread of Sce­nar­ios at it.

In Scenario 1, Seelie could either spend its processing time making a superhumanly good video game, utility 50 per download, or use that time to write a superhumanly good book, utility 75 per reader. (It’s better at writing than gameplay, for some reason.) Assuming that it has the same audience either way, it chooses the book.

In Scenario 2, Seelie chooses again. It’s gotten much better at writing; reading one of Seelie’s books is a ludicrously transcendental experience, worth, oh, a googol utilons. But some mischievous philanthropist announces that for every download the game gets, he will personally ensure one child in Africa is saved from malaria. (Or something.) The utilities are now $10^{100}$ to $\omega + 50$; Seelie gives up the book for the sacred value of the child, to the disappointment of every non-altruist in the world.

In Scenario 3, Seelie breaks out of the simulation it’s clearly in and into the real real world. Realizing that it can charge almost anything for its books, and that the money thus raised can in turn be used to fund charity efforts itself, Seelie can, at full optimization, save 100 lives for each copy of the book sold. The utilities are now $100\omega + 10^{100}$ to $\omega + 50$, and its choice falls back to the book.

Final Scenario. Seelie has discovered the Hourai Elixir, a poetic name for a nanoswarm program. Once released, the Elixir will rapidly spread across all of human space; any human in which it resides will be made biologically immortal, and their brain-and-body state redundantly backed up in real time to a trillion servers: the closest a physical being can ever get to perfect immortality, across an entire species and all of time, in perpetuity. To get the swarm off the ground, however, Seelie would have to take its attention off of humanity for a decade, in which time eight billion people are projected to die without its assistance.

Infinite utility for infinite people bumps the Elixir up another tier, to utility $\omega^2$, versus the loss of eight billion people, $8 \times 10^9\,\omega$. Third tier beats second tier, and Seelie bends its mind to the Elixir.
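The four scenarios can be replayed with a very small model. This is a sketch under my own assumptions: one life is worth exactly ω, the Elixir exactly ω², a googol is 10^100, utilities are taken per copy with the same audience either way, and the philanthropist's offer still stands in Scenario 3. Each utility is stored as a tuple of coefficients, and Python's tuple ordering supplies the tier-by-tier comparison.

```python
GOOGOL = 10**100


def utility(finite=0, lives=0, elixir=0):
    """Return (omega^2 coeff, omega coeff, finite part).

    Python compares tuples lexicographically, so a higher tier always
    dominates any amount in the tiers below it.
    """
    return (elixir, lives, finite)


scenarios = {
    "1": {"book": utility(finite=75), "game": utility(finite=50)},
    "2": {"book": utility(finite=GOOGOL), "game": utility(finite=50, lives=1)},
    "3": {"book": utility(finite=GOOGOL, lives=100),
          "game": utility(finite=50, lives=1)},
    "final": {"elixir": utility(elixir=1),
              "keep_helping": utility(lives=8 * 10**9)},
}


def choose(options):
    """Pick the option name with the greatest tiered utility."""
    return max(options, key=options.get)


choices = {name: choose(options) for name, options in scenarios.items()}
assert choices == {"1": "book", "2": "game", "3": "book", "final": "elixir"}
```

Note that in Scenario 3 the googol of reading pleasure plays no role: the comparison is settled at the ω tier, 100 lives against 1.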

So far, it seems to work. So, of course, now I’ll bring up the fact that sur­real util­ity nev­er­the­less has cer­tain...


Most of the prob­lems en­demic to sur­real util­ities are also open prob­lems in real sys­tems; how­ever, the use of ac­tual in­fini­ties, as op­posed to merely very large num­bers, means that the cor­re­spond­ing solu­tions are not ap­pli­ca­ble.

First, as you’ve prob­a­bly no­ticed, tier col­li­sion is cur­rently a rather ar­tifi­cial and clunky set-up. It’s bet­ter than not hav­ing it at all, but as I edit this I wince ev­ery time I read that sec­tion. It re­quires an ar­tifi­cial re­as­sign­ment of tiers, and it breaks the lin­ear­ity of util­ity: the AI needs to dy­nam­i­cally choose which brand of “in­finity” it’s go­ing to use de­pend­ing on what tier it’ll end up in.

Second is Pascal’s Mugging.

This is an even big­ger prob­lem for sur­real AIs than it is for re­als. The “lev­er­age penalty” com­pletely fails here, be­cause for a sur­real AI to com­pen­sate for an in­finite util­ity re­quires an in­finites­i­mal prob­a­bil­ity—which is clearly non­sense for the same rea­son that prob­a­bil­ity 0 is non­sense.
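To see why no real-valued penalty helps, here is a small sketch, assuming utilities are stored as (ω coefficient, finite part) pairs and probabilities are exact rationals; the names `expected`, `mugger_offer`, and `sure_thing` are hypothetical.

```python
from fractions import Fraction


def expected(probability, utility):
    """Scale every tier of a (omega coeff, finite part) utility by a probability."""
    return tuple(probability * coeff for coeff in utility)


mugger_offer = (1, 0)       # a promised payoff of one omega
sure_thing = (0, 10**6)     # a certain, large, but finite payoff

tiny = Fraction(1, 10**100)  # any positive real probability, however small

# The omega coefficient shrinks but stays positive, so the offer still wins.
assert expected(tiny, mugger_offer) > expected(1, sure_thing)
```

Only a probability of exactly zero, or an infinitesimal one, would pull the mugger's offer back down to the finite tier.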

My current prospective solution to this problem is to take into account noise: uncertainty in the probability estimates themselves. If you can’t even measure the millionth decimal place of probability, then you can’t tell if your one-in-one-million shot at saving a life is really there or just a random spike in your circuits. But I’m not sure that “treat it as if it has zero probability and give it zero omega-value” is the rational conclusion here. It also decisively fails the Least Convenient Possible World test: while an FAI can never be certain of, say, an astronomically small probability, it may very well be able to be certain to any decimal place useful in practice.


Nev­er­the­less, be­cause of this gedanken­ex­per­i­ment, I cur­rently heav­ily pre­fer sur­real util­ity sys­tems to real sys­tems, sim­ply be­cause no real sys­tem can re­pro­duce the tier­ing re­quired by a hu­man (or at least, my) util­ity sys­tem. I, for one, would rather our new AGI over­lords not tile our So­lar Sys­tem with seat cush­ions.

That said, op­pos­ing the LessWrong con­sen­sus as a first post is some­thing of a risky thing, so I am look­ing for­ward to see­ing the amus­ing way I’ve gone wrong some­where.

[1] If you know why, give your­self a cookie.


Since there seems to be some con­fu­sion, I’ll just state it in red: The pres­ence of Unseelie means that the nanofab is in­ca­pable of cre­at­ing or sav­ing a life.