Singletons Rule OK

Reply to: Total Tech Wars

How does one end up with a persistent disagreement between two rationalist-wannabes who are both aware of Aumann’s Agreement Theorem and its implications?

Such a case is likely to turn around two axes: object-level incredulity (“no matter what AAT says, proposition X can’t really be true”) and meta-level distrust (“they’re trying to be rational despite their emotional commitment, but are they really capable of that?”).

So far, Robin and I have focused on the object level in trying to hash out our disagreement. Technically, I can’t speak for Robin; but at least in my own case, I’ve acted thus because I anticipate that a meta-level argument about trustworthiness wouldn’t lead anywhere interesting. Behind the scenes, I’m doing what I can to make sure my brain is actually capable of updating, and presumably Robin is doing the same.

(The linchpin of my own current effort in this area is to tell myself that I ought to be learning something while having this conversation, and that I shouldn’t miss any scrap of original thought in it—the Incremental Update technique. Because I can genuinely believe that a conversation like this should produce new thoughts, I can turn that feeling into genuine attentiveness.)

Yesterday, Robin inveighed hard against what he called “total tech wars”, and what I call “winner-take-all” scenarios:

Robin: “If you believe the other side is totally committed to total victory, that surrender is unacceptable, and that all interactions are zero-sum, you may conclude your side must never cooperate with them, nor tolerate much internal dissent or luxury.”

Robin and I both have emotional commitments and we both acknowledge the danger of that. There’s nothing irrational about feeling, per se; only failure to update is blameworthy. But Robin seems to be very strongly against winner-take-all technological scenarios, and I don’t understand why.

Among other things, I would like to ask if Robin has a Line of Retreat set up here—if, regardless of how he estimates the probabilities, he can visualize what he would do if a winner-take-all scenario were true.

Yesterday Robin wrote:

“Eliezer, if everything is at stake then ‘winner take all’ is ‘total war’; it doesn’t really matter if they shoot you or just starve you to death.”

We both have our emotional commitments, but I don’t quite understand this reaction.

First, to me it’s obvious that a “winner-take-all” technology should be defined as one in which, ceteris paribus, a local entity tends to end up with the option of becoming one kind of Bostromian singleton—the decisionmaker of a global order in which there is a single decision-making entity at the highest level. (A superintelligence with unshared nanotech would count as a singleton; a federated world government with its own military would be a different kind of singleton; or you can imagine something like a galactic operating system with a root account controllable by 80% majority vote of the populace, etcetera.)

The winner-take-all option is created by properties of the technology landscape; this is a description of the landscape, not a moral stance. Nothing is said about an agent with that option actually becoming a singleton, nor about using that power to shoot people, or reuse their atoms for something else, or grab all resources and let them starve (though “all resources” should include their atoms anyway).

Nothing is yet said about various patches that could try to avert a technological scenario that contains upward cliffs of progress—e.g. binding agreements enforced by source code examination or continuous monitoring, in advance of the event. (Or if you think that rational agents cooperate on the Prisoner’s Dilemma, perhaps not so much work would be required to coordinate.)

Superintelligent agents not in a humanish moral reference frame—AIs that are just maximizing paperclips or sorting pebbles—who happen on the option of becoming a Bostromian Singleton, and who have not previously executed any somehow-binding treaty, will ceteris paribus choose to grab all resources in service of their utility function, including the atoms now composing humanity. I don’t see how you could reasonably deny this! It’s a straightforward decision-theoretic choice between payoff 10 and payoff 1000!
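
To make that arithmetic explicit, here is a deliberately toy sketch in Python. The payoff numbers 10 and 1000 come straight from the sentence above; the option names and the payoff table are invented purely for illustration.

```python
# Toy illustration: an expected-utility maximizer with no humanish values,
# choosing between sharing the future and grabbing every resource.
# The payoffs (in paperclips produced) are stylized stand-ins for the
# "payoff 10" vs. "payoff 1000" comparison in the text.

expected_paperclips = {
    "cooperate_and_leave_humanity_its_atoms": 10,
    "become_singleton_and_grab_all_resources": 1000,
}

def choose(options: dict[str, float]) -> str:
    """Pick whichever option yields the most utility; nothing in this
    rule cares what the options do to anyone else."""
    return max(options, key=options.get)

print(choose(expected_paperclips))
# -> become_singleton_and_grab_all_resources
```

The only point being illustrated is the one in the paragraph above: ceteris paribus, a maximizer with no humanish terms in its utility function takes the larger payoff.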

But conversely, there are possible agents in mind design space who, given the option of becoming a singleton, will not kill you, starve you, reprogram you, tell you how to live your life, or even meddle in your destiny unseen. See Bostrom’s (short) paper on the possibility of good and bad singletons of various types.

If Robin thinks it’s impossible to have a Friendly AI, or maybe even any sort of benevolent superintelligence at all, even the descendants of human uploads—if Robin is assuming that superintelligent agents will act according to roughly selfish motives, and that economies of trade alone are necessary and sufficient to prevent holocaust—then Robin may have no Line of Retreat open, as I try to argue that AI has an upward cliff built in.

And in this case, it might be time well spent to first address the question of whether Friendly AI is a reasonable thing to try to accomplish, so as to create that line of retreat. Robin and I are both trying hard to be rational despite emotional commitments; but there’s no particular reason to needlessly place oneself in the position of trying to persuade, or trying to accept, that everything of value in the universe is certainly doomed.

For me, it’s particularly hard to understand Robin’s position in this, because for me the non-singleton future is the one that is obviously abhorrent.

If you have lots of entities with root permissions on matter, any of whom has the physical capability to attack any other, then you have entities spending huge amounts of precious negentropy on defense and deterrence. If there’s no centralized system of property rights in place for selling off the universe to the highest bidder, then you have a race to burn the cosmic commons, and the degeneration of the vast majority of all agents into rapacious hardscrapple frontier replicators.
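
To make the shape of that claim concrete, here is a toy model in Python. The ninety-percent defense burden, the linear accounting, and the names are all assumptions of mine for illustration; nothing here is an estimate anyone has actually made.

```python
# Toy model of the waste described above: negentropy left for anything the
# agents actually value, with and without an enforced system of property rights.
# All numbers are invented for illustration.

TOTAL_NEGENTROPY_PER_AGENT = 100.0

def usable_negentropy(defense_fraction: float) -> float:
    """Negentropy left over after spending on defense and deterrence."""
    return TOTAL_NEGENTROPY_PER_AGENT * (1.0 - defense_fraction)

# Multipolar frontier: any agent can attack any other, so each must keep
# matching its neighbors' spending on defense and deterrence.
multipolar = usable_negentropy(defense_fraction=0.9)

# Singleton, or some other stable global property, enforcing property rights:
# defense spending can drop toward zero.
singleton = usable_negentropy(defense_fraction=0.0)

print(f"multipolar frontier:      {multipolar:.1f} units left for complex fun")   # 10.0
print(f"enforced property rights: {singleton:.1f} units left for complex fun")    # 100.0
```

The stylized point is just the one in the text: without something enforcing property rights at the highest level, a large share of the negentropy goes to defense and deterrence rather than to anything the agents value.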

To me this is a vision of futility—one in which a future light cone that could have been full of happy, safe agents having complex fun is mostly wasted by agents trying to seize resources and defend them so they can send out seeds to seize more resources.

And it should also be mentioned that any future in which slavery or child abuse is successfully prohibited is a world that has some way of preventing agents from doing certain things with their computing power. There are vastly worse possibilities than slavery or child abuse opened up by future technologies, which I flinch from referring to even as much as I did in the previous sentence. There are things I don’t want to happen to anyone—including a population of a septillion captive minds running on a star-powered Matrioshka Brain that is owned, and defended against all rescuers, by the mind-descendant of Lawrence Bittaker (serial killer, aka “Pliers”). I want to win against the horrors that exist in this world and the horrors that could exist in tomorrow’s world—to have them never happen ever again, or, for the really awful stuff, never happen in the first place. And that victory requires the Future to have certain global properties.

But there are other ways to get singletons besides falling up a technological cliff. So that would be my Line of Retreat: If minds can’t self-improve quickly enough to take over, then try for the path of uploads setting up a centralized Constitutional operating system with a root account controlled by majority vote, or something like that, to prevent their descendants from having to burn the cosmic commons.

So for me, any satisfactory outcome seems to necessarily involve, if not a singleton, the existence of certain stable global properties upon the future—sufficient to prevent burning the cosmic commons, prevent life’s degeneration into rapacious hardscrapple frontier replication, and prevent supersadists torturing septillions of helpless dolls in private, obscure star systems.

Robin has written about burning the cosmic commons and rapacious hardscrapple frontier existences. This doesn’t imply that Robin approves of these outcomes. But Robin’s strong rejection even of winner-take-all language and concepts seems to suggest that our emotional commitments are something like 180 degrees opposed. Robin seems to feel the same way about singletons as I feel about ¬singletons.

But why? I don’t think our real values are that strongly opposed—though we may have verbally-described and attention-prioritized those values in different ways.