CEV: a utilitarian critique

I’m posting this article on behalf of Brian Tomasik, who authored it but is at present too busy to respond to comments.

Update from Brian: “As of 2013-2014, I have become more sympathetic to at least the spirit of CEV specifically and to the project of compromise among differing value systems more generally. I continue to think that pure CEV is unlikely to be implemented, though democracy and intellectual discussion can help approximate it. I also continue to feel apprehensive about the conclusions that a CEV might reach, but the best should not be the enemy of the good, and cooperation is inherently about not getting everything you want in order to avoid getting nothing at all.”


I’m often asked questions like the following: If wild-animal suffering, lab universes, sentient simulations, etc. are so bad, why can’t we assume that Coherent Extrapolated Volition (CEV) will figure that out and do the right thing for us?


Most of my knowledge of CEV is based on Yudkowsky’s 2004 paper, which he admits is obsolete. I have not yet read most of the more recent literature on the subject.

Reason 1: CEV will (almost certainly) never happen

CEV is like a dream for a certain type of moral philosopher: Finally, the ideal procedure for discovering what we really want upon reflection!

The fact is, the real world is not decided by moral philosophers. It’s decided by power politics, economics, and Darwinian selection. Moral philosophers can certainly have an impact through these channels, but they’re unlikely to convince the world to rally behind CEV. Can you imagine the US military, during its AGI development process, deciding to adopt CEV? No way. It would adopt something that ensures the continued military and political dominance of the US, driven by mainstream American values. Same goes for China or any other country. If AGI is developed by a corporation, the values will reflect those of the corporation or the small group of developers and supervisors who hold the most power over the project. Unless that group is extremely enlightened, CEV is not what we’ll get.

Anyway, this assumes that the developers of AGI can even keep it under control. Most likely AGI will turn into a paperclipper or else evolve into some other kind of Darwinian force over which we lose control.

Objection 1: “Okay. Future military or corporate developers of AGI probably won’t do CEV. But why do you think they’d care about wild-animal suffering, etc. either?”

Well, they might not, but if we make the wild-animal movement successful, then in ~50-100 years when AGI does come along, the notion of not spreading wild-animal suffering might be sufficiently mainstream that even military or corporate executives would care about it, at least to some degree.

If post-humanity does achieve astronomical power, it will only be through AGI, so there’s high value in influencing the future developers of an AGI. For this reason I believe we should focus our meme-spreading on those targets. However, this doesn’t mean they should be our only focus, for two reasons: (1) Future AGI developers will themselves be influenced by their friends, popular media, contemporary philosophical and cultural norms, etc., so if we can change those things, we will diffusely impact future AGI developers too. (2) We need to build our movement, and the lowest-hanging fruit for new supporters are those most interested in the cause (e.g., antispeciesists, environmental-ethics students, transhumanists). We should reach out to them to expand our base of support before going after the big targets.

Objection 2: “Fine. But just as we can advance values like preventing the spread of wild-animal suffering, couldn’t we also increase the likelihood of CEV by promoting that idea?”

Sure, we could. The problem is, CEV is not an optimal thing to promote, IMHO. It’s sufficiently general that lots of people would want it, so for ourselves, the higher leverage comes from advancing our particular, more idiosyncratic values. Promoting CEV is kind of like promoting democracy or free speech: It’s fine to do, but if you have a particular cause that you think is more important than other people realize, it’s probably going to be better to promote that specific cause than to jump on the bandwagon and do the same thing everyone else is doing, since the bandwagon’s cause may not be what you yourself prefer.

Indeed, for myself, it’s possible CEV could be a net bad thing if it would reduce the likelihood of paperclipping, a future which might (or might not) contain far less suffering than a future directed by humanity’s extrapolated values.

Reason 2: CEV would lead to values we don’t like

Some believe that morality is absolute, in which case a CEV’s job would be to uncover what that is. This view is mistaken, for the following reasons: (1) the existence of a separate realm of reality where ethical truths reside violates Occam’s razor, and (2) even if such truths did exist, why would we care what they were?

Yudkowsky and the LessWrong community agree that ethics is not absolute, so they have different motivations behind CEV. As far as I can gather, the following are two of them:

Motivation 1: Some believe CEV is genuinely the right thing to do

As Eliezer said in his 2004 paper (p. 29), “Implementing CEV is just my attempt not to be a jerk.” Some may believe that CEV is the ideal meta-ethical way to resolve ethical disputes.

I beg to differ. First, the set of minds included in CEV is totally arbitrary, and hence, so is the output. Why include only humans? Why not animals? Why not dead humans? Why not humans that weren’t born but might have been? Why not paperclip maximizers? Baby eaters? Pebble sorters? Suffering maximizers? Wherever you draw the line, you’re already inserting your values into the process.

And then once you’ve picked the set of minds to extrapolate, you still have astronomically many ways to do the extrapolation, each of which could give wildly different outputs. Humans have a thousand random shards of intuition about values that resulted from all kinds of little, arbitrary perturbations during evolution and environmental exposure. If the CEV algorithm happens to make some more salient than others, this will potentially change the outcome, perhaps drastically (butterfly effects).

Now, I would be in favor of a reasonable extrapolation of my own values. But humanity’s values are not my values. There are people who want to spread life throughout the universe regardless of suffering, people who want to preserve nature free from human interference, people who want to create lab universes because it would be cool, people who oppose utilitronium and support retaining suffering in the world, people who want to send members of other religions to eternal torture, people who believe sinful children should burn forever in red-hot ovens, and on and on. I do not want these values to be part of the mix.

Maybe (hopefully) some of these beliefs would go away once people learned more about what these wishes really implied, but some would not. Take abortion, for example: Some non-religious people genuinely oppose it, and not for trivial, misinformed reasons. They have thought long and hard about abortion and still find it to be wrong. Others have thought long and hard and still find it to be not wrong. At some point, we have to admit that human intuitions are genuinely in conflict in an irreconcilable way. Some human intuitions are irreconcilably opposed to mine, and I don’t want them in the extrapolation process.

Motivation 2: Some argue that even if CEV isn’t ideal, it’s the best game-theoretic approach because it amounts to cooperating on the prisoner’s dilemma

I think the idea is that if you try to promote your specific values above everyone else’s, then you’re timelessly causing other groups who want to push for their own values to make the same decision. But if you decide to cooperate with everyone, you timelessly influence others to do the same.

This seems worth considering, but I’m doubtful that the argument is compelling enough to act on. I can almost guarantee that if I decided to start cooperating by working toward CEV, everyone else working to shape the values of the future wouldn’t suddenly jump on board and do the same.
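The prisoner’s-dilemma framing above can be made concrete with a toy payoff matrix. This is only an illustrative sketch: the move labels, payoff numbers, and helper function are my own assumptions, chosen solely to satisfy the standard dilemma inequalities, not anything from the CEV literature.

```python
# Toy model of the cooperation argument.
# "C" (cooperate) = work toward CEV; "D" (defect) = push your own values.
# Payoff numbers are illustrative assumptions, not from the source.

PAYOFFS = {
    # (my move, their move): (my payoff, their payoff)
    ("C", "C"): (3, 3),  # everyone backs CEV: a decent compromise for all
    ("C", "D"): (0, 5),  # I compromise while others push their values
    ("D", "C"): (5, 0),  # my values dominate the future
    ("D", "D"): (1, 1),  # a value-pushing free-for-all
}

def best_response(their_move):
    """Return the move maximizing my payoff against a fixed opponent move."""
    return max("CD", key=lambda my: PAYOFFS[(my, their_move)][0])

# The classic dilemma: defecting is the best response to either move...
assert best_response("C") == "D" and best_response("D") == "D"
# ...yet mutual cooperation beats mutual defection for both players.
assert PAYOFFS[("C", "C")][0] > PAYOFFS[("D", "D")][0]
```

The timeless-cooperation argument amounts to claiming that one’s decision correlates with the decisions of other value-spreaders, so choosing C effectively selects the (C, C) cell over (D, D) even though D dominates when choices are independent. The author’s skepticism, in these terms, is that the correlation is too weak for that move to pay off.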

Objection 1: “Suppose CEV did happen. Then spreading concern for wild animals and the like might have little value, because the CEV process would realize that you had tried to rig the system ahead of time by making more people care about the cause, and it would attempt to neutralize your efforts.”

Well, first of all, CEV is (almost certainly) never going to happen, so I’m not too worried. Second of all, it’s not clear to me that such a scheme would actually be put in place. If you’re trying to undo pre-CEV influences that led to the distribution of opinions up to that point, you’re going to have a heck of a lot of undoing to do. Are you going to undo the abundance of Catholics because their religion discouraged birth control and so led to large numbers of supporters? Are you going to undo the over-representation of healthy humans because natural selection unfairly removed all those sickly ones? Are you going to undo the under-representation of dinosaurs because an arbitrary asteroid killed them off before CEV came around?

The fact is that who holds power at the time of AGI will probably matter a lot. If we can improve the values of those who will have power in the future, this will in expectation lead to better outcomes, regardless of whether the CEV fairy tale comes true.