Against the Linear Utility Hypothesis and the Leverage Penalty

[Roughly the second half of this is a reply to: Pascal’s Muggle]

There’s an assumption that people often make when thinking about decision theory, which is that utility should be linear with respect to amount of stuff going on. To be clear, I don’t mean linear with respect to amount of money/cookies/etc that you own; most people know better than that. The assumption I’m talking about is that the state of the rest of the universe (or multiverse) does not affect the marginal utility of there also being someone having certain experiences at some location in the uni-/multi-verse. For instance, if 1 util is the difference in utility between nothing existing, and there being a planet that has some humans and other animals living on it for a while before going extinct, then the difference in utility between nothing existing and there being n copies of that planet should be n utils. I’ll call this the Linear Utility Hypothesis. It seems to me that, despite its popularity, the Linear Utility Hypothesis is poorly motivated, and a very poor fit to actual human preferences.

The Linear Utility Hypothesis gets implicitly assumed a lot in discussions of Pascal’s mugging. For instance, in Pascal’s Muggle, Eliezer Yudkowsky says he “[doesn’t] see any way around” the conclusion that he must be assigning a probability at most on the order of 1/3↑↑↑3 to the proposition that Pascal’s mugger is telling the truth, given that the mugger claims to be influencing 3↑↑↑3 lives and that he would refuse the mugger’s demands. This implies that he doesn’t see any way that influencing 3↑↑↑3 lives could not have on the order of 3↑↑↑3 times as much utility as influencing one life, which sounds like an invocation of the Linear Utility Hypothesis.

One argument for something kind of like the Linear Utility Hypothesis is that there may be a vast multiverse that you can influence only a small part of, and unless your utility function is weirdly nondifferentiable and you have very precise information about the state of the rest of the multiverse (or if your utility function depends primarily on things you personally control), then your utility function should be locally very close to linear. That is, if your utility function is a smooth function of how many people are experiencing what conditions, then the utility from influencing 1 life should be 1/n times the utility of having the same influence on n lives, because n is inevitably going to be small enough that a linear approximation to your utility function will be reasonably accurate, and even if your utility function isn’t smooth, you don’t know what the rest of the universe looks like, so you can’t predict how the small changes you can make will interact with discontinuities in your utility function. This is a scaled-up version of a common argument that you should be willing to pay 10 times as much to save 20,000 birds as you would be willing to pay to save 2,000 birds. I am sympathetic to this argument, though not convinced of the premise that you can only influence a tiny portion of what is actually valuable to you. More importantly, this argument does not even attempt to establish that utility is globally linear, and counterintuitive consequences of the Linear Utility Hypothesis, such as Pascal’s mugging, often involve situations that seem especially likely to violate the assumption that all choices you make have tiny consequences.
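To illustrate the local-linearity point with the bird example, here is a minimal sketch of my own (the concave utility function and all the numbers are made-up stand-ins, not anything from Pascal’s Muggle): even a strongly concave utility function over the total amount of good stuff looks almost exactly linear over the tiny range you can personally influence.

```python
import math

# A sketch assuming a hypothetical concave utility over "total good stuff";
# the specific function and numbers are purely illustrative.
def utility(total_good_stuff):
    return math.log(total_good_stuff)

background = 10**9   # assumed amount of good stuff you have no control over
small = 2_000        # e.g. saving 2,000 birds
large = 20_000       # e.g. saving 20,000 birds

# utility(background + x) - utility(background) == log1p(x / background)
gain_small = math.log1p(small / background)
gain_large = math.log1p(large / background)

print(gain_large / gain_small)  # ≈ 10: locally, 10x the influence buys ~10x the utility
```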

I have never seen anyone provide a defense of the Linear Utility Hypothesis itself (actually, I think I’ve been pointed to the VNM theorem for this, but I don’t count that because it’s a non sequitur; the VNM theorem is just a reason to use a utility function in the first place, and does not place any constraints on what that utility function might look like), so I don’t know of any arguments for it available for me to refute, and I’ll just go ahead and argue that it can’t be right because actual human preferences violate it too dramatically. For instance, suppose you’re given a choice between the following two options: 1: Humanity grows into a vast civilization of 10^100 people living long and happy lives, or 2: a 10% chance that humanity grows into a vast civilization of 10^102 people living long and happy lives, and a 90% chance of going extinct right now. I think almost everyone would pick option 1, and would think it crazy to take a reckless gamble like option 2. But the Linear Utility Hypothesis says that option 2 is much better. Most of the ways people respond to Pascal’s mugger don’t apply to this situation, since the probabilities and ratios of utilities involved here are not at all extreme.
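The arithmetic behind that last claim, with utility measured in happy lives as the Linear Utility Hypothesis prescribes (a quick check of my own, nothing more):

```python
from fractions import Fraction

# Expected number of happy lives, assuming the Linear Utility Hypothesis
# (utility = number of happy lives, added up linearly).
eu_option_1 = Fraction(1) * 10**100                             # 10^100 lives for sure
eu_option_2 = Fraction(1, 10) * 10**102 + Fraction(9, 10) * 0   # 10% of 10^102 lives, 90% extinction

print(eu_option_2 / eu_option_1)  # 10: the reckless gamble comes out ten times better
```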

There are smaller-scale counterexamples to the Linear Utility Hypothesis as well. Suppose you’re offered the choice between: 1: continue to live a normal life, which lasts for n more years, or 2: live the next year of a normal life, but then instead of living a normal life after that, have all your memories from the past year removed, and experience that year again n more times (your memories getting reset each time). I expect pretty much everyone to take option 1, even if they expect the next year of their life to be better than the average of all future years of their life. If utility is just a naive sum of local utility, then there must be some year which has at least as much utility in it as the average year, and just repeating that year every year would thus increase total utility. But humans care about the relationship that their experiences have with each other at different times, as well as what those experiences are.
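To make the additivity argument concrete (the per-year numbers below are made up purely for illustration): under a naive sum of local utility, repeating an above-average year can never lose.

```python
# A sketch assuming utility is a naive sum over years; the yearly utilities
# are hypothetical stand-ins.
yearly_utility = [3.0, 7.0, 5.0, 2.0, 6.0]   # utilities of the n remaining years of a normal life
n = len(yearly_utility)

normal_life = sum(yearly_utility)            # live the years out as they come
best_year = max(yearly_utility)              # some year is at least as good as the average
looped_life = best_year * n                  # wipe memories and relive that year n times

print(normal_life, looped_life)              # 23.0 vs 35.0
print(looped_life >= normal_life)            # True for any list of yearly utilities
```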

Here’s another thought experiment that seems like a reasonable empirical test of the Linear Utility Hypothesis: take some event that is familiar enough that we understand its expected utility reasonably well (for instance, the amount of money in your pocket changing by $5), and some ludicrously unlikely event (for instance, the event in which some random person is actually telling the truth when they claim, without evidence, to have magic powers allowing them to control the fates of arbitrarily large universes, and saying, without giving a reason, that the way they use this power is dependent on some seemingly unrelated action you can take), and see if you become willing to sacrifice the well-understood amount of utility in exchange for the tiny chance of a large impact when the large impact becomes big enough that the tiny chance of it would be more important if the Linear Utility Hypothesis were true. This thought experiment should sound very familiar. The result of this experiment is that basically everyone agrees that they shouldn’t pay the mugger, not only at much higher stakes than the Linear Utility Hypothesis predicts should be sufficient, but even at arbitrarily large stakes. This result has even stronger consequences than that the Linear Utility Hypothesis is false, namely that utility is bounded. People have come up with all sorts of absurd explanations for why they wouldn’t pay Pascal’s mugger even though the Linear Utility Hypothesis is true about their preferences (I will address the least absurd of these explanations in a bit), but there is no better test for whether an agent’s utility function is bounded than how it responds to Pascal’s mugger. If you take the claim “My utility function is unbounded”, and taboo “utility function” and “unbounded”, it becomes “Given outcomes A and B such that I prefer A over B, for any probability p>0, there is an outcome C such that I would take B rather than A if it lets me control whether C happens instead with probability p.” (If you claim that one of these two statements is true and the other is false, then you’re just contradicting yourself, because that’s what “utility function” means.) That can be roughly translated into English as “I would do the equivalent of paying the mugger in Pascal’s mugging-like situations”. So in Pascal’s mugging-like situations, agents with unbounded utility functions don’t look for clever reasons not to do the equivalent of paying the mugger; they just pay up. The fact that this behavior is so counterintuitive is an indication that agents with unbounded utility functions are so alien that you have no idea how to empathize with them.
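As a concrete reading of that tabooed statement (my own formalization; the specific numbers are arbitrary): if utility is unbounded, then for any p > 0 you can always name a C big enough that the gamble beats the sure thing.

```python
# A sketch assuming an unbounded (here, linear-in-lives) utility function.
def prefers_gamble(u_A, u_B, u_C, p):
    # Take "B, plus probability p that C happens instead" over "A for sure"?
    return p * u_C + (1 - p) * u_B > u_A

u_A, u_B = 1.0, 0.0   # A preferred over B (say, keeping $5 vs handing it over)
p = 1e-30             # an absurdly small probability

# Unboundedness means some C exceeds (u_A - u_B) / p; pick one such C:
u_C = 2 * (u_A - u_B) / p

print(prefers_gamble(u_A, u_B, u_C, p))  # True: the tabooed claim says "pay the mugger"
```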

The “least absurd explanation” I referred to for why an agent satisfying the Linear Utility Hypothesis would reject Pascal’s mugger is, of course, the leverage penalty that Eliezer discusses in Pascal’s Muggle. The argument is that any hypothesis in which there are n people, one of whom has a unique opportunity to affect all the others, must imply that a randomly selected one of those n people has only a 1/n chance of being the one who has influence. So if a hypothesis implies that you have a unique opportunity to affect n people’s lives, then this fact is evidence against this hypothesis by a factor of 1:n. In particular, if Pascal’s mugger tells you that you are in a unique position to affect 3↑↑↑3 lives, the fact that you are the one in this position is 1 : 3↑↑↑3 evidence against the hypothesis that Pascal’s mugger is telling the truth. I have two criticisms of the leverage penalty: first, that it is not the actual reason that people reject Pascal’s mugger, and second, that it is not a correct reason for an ideal rational agent to reject Pascal’s mugger.
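To spell out how the penalty operates as a Bayes factor (a sketch of my own; 3↑↑↑3 is not representable, so an astronomically large stand-in is used, and the prior is assumed):

```python
# The leverage penalty as a 1:n likelihood ratio against the hypothesis.
def leverage_posterior(prior_prob, n_people_affected):
    prior_odds = prior_prob / (1 - prior_prob)
    posterior_odds = prior_odds / n_people_affected   # multiply odds by 1/n
    return posterior_odds / (1 + posterior_odds)

prior = 0.001   # assumed prior that someone making such a claim is honest
n = 10**100     # stand-in for the 3^^^3 people the mugger claims to affect

print(leverage_posterior(prior, n))  # ≈ 1e-103: the claim is crushed by its own scale
```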

The leverage penalty can’t be the actual reason people reject Pascal’s mugger because people don’t actually assign probability as low as 1/3↑↑↑3 to the proposition that Pascal’s mugger is telling the truth. This can be demonstrated with thought experiments. Consider what happens when someone encounters overwhelming evidence that Pascal’s mugger actually is telling the truth. The probability of the evidence being faked can’t possibly be less than 1 in 10^10^26 or so (this upper bound was suggested by Eliezer in Pascal’s Muggle), so an agent with a leverage prior will still be absolutely convinced that Pascal’s mugger is lying. Eliezer suggests two reasons that an agent might pay Pascal’s mugger anyway, given a sufficient amount of evidence: first, that once you update to a probability of something like 10^100 / 3↑↑↑3, and multiply by the stakes of 3↑↑↑3 lives, you get an expected utility of something like 10^100 lives, which is worth a lot more than $5, and second, that the agent might just give up on the idea of a leverage penalty and admit that there is a non-infinitesimal chance that Pascal’s mugger may actually be telling the truth. Eliezer concludes, and I agree, that the first of these explanations is not a good one. I can actually demonstrate this with a thought experiment. Suppose that after showing you overwhelming evidence that they’re telling the truth, Pascal’s mugger says “Oh, and by the way, if I was telling the truth about the 3↑↑↑3 lives in your hands, then X is also true,” where X is some (a priori fairly unlikely) proposition that you later have the opportunity to bet on with a third party. Now, I’m sure you’d be appropriately cautious in light of the fact that you would be very confused about what’s going on, so you wouldn’t bet recklessly, but you probably would consider yourself to have some special information about X, and if offered good enough odds, you might see a good opportunity for profit with an acceptable risk, which would not have looked appealing before being told X by Pascal’s mugger. If you were really as confident that Pascal’s mugger was lying as the leverage prior would imply, then you wouldn’t assume X was any more likely than you thought before for any purposes not involving astronomical stakes, since your reason for believing X is predicated on you having control over astronomical stakes, which is astronomically unlikely.
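For concreteness, the arithmetic behind the first of those two responses (which, as noted, neither Eliezer nor I find convincing) is just a cancellation; a symbolic sketch of my own, with a sympy symbol standing in for 3↑↑↑3 since the number itself is far too large to compute:

```python
import sympy

N = sympy.Symbol('N', positive=True)   # stand-in for 3^^^3
posterior = 10**100 / N                # probability after seeing the sky split open
stakes = N                             # lives claimed to be at stake

expected_lives = sympy.simplify(posterior * stakes)
print(expected_lives == 10**100)       # True: the 3^^^3 factors cancel, leaving ~10^100 lives
```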

So after seeing the overwhelming evidence, you shouldn’t have a leverage prior. And despite Eliezer’s protests to the contrary, this does straightforwardly imply that you never had a leverage prior in the first place. Eliezer’s excuse for using a leverage prior before but not after seeing observations that a leverage prior predicts are extremely unlikely is computational limitations. He compares this to the situation in which there is a theorem X that you aren’t yet aware you can prove, and a lemma Y that you can see is true and you can see implies X. If you’re asked how likely X is to be true, you might say something like 50%, since you haven’t thought of Y, and then when asked how likely X&Y is to be true, you see why X is probably true, and say something like 90%. This is not at all analogous to a “superupdate” in which you change priors because of unlikely observations, because in the case of assigning probabilities to mathematical claims, you only need to think about Y, whereas Eliezer is trying to claim that a superupdate can only happen when you actually observe that evidence, and just thinking hypothetically about such evidence isn’t enough. A better analogy to the situation with the theorem and lemma would be when you initially say that there’s a 1 in 3↑↑↑3 chance that Pascal’s mugger was telling the truth, and then someone asks what you would think if Pascal’s mugger tore a hole in the sky, showing another copy of the mugger next to a button, and repeating the claim that pushing the button would influence 3↑↑↑3 lives, and then you think “oh in that case I’d think it’s possible the mugger’s telling the truth; I’d still be pretty skeptical, so maybe I’d think there was about a 1 in 1000 chance that the mugger is telling the truth, and come to think of it, I guess the chance of me observing that evidence is around 10^-12, so I’m updating right now to a 10^-15 chance that the mugger is telling the truth.” Incidentally, if that did happen, then this agent would be very poorly calibrated, since if you assign a probability of 1 in 3↑↑↑3 to a proposition, you should assign a probability of at most 10^15 / 3↑↑↑3 to ever justifiably updating that probability to 10^-15 (a bound sketched below). If you want a well-calibrated probability for an absurdly unlikely event, you should already be thinking about less unlikely ways that your model of the world could be wrong, instead of waiting for strong evidence that your model of the world actually is wrong, and plugging your ears and shouting “LA LA LA I CAN’T HEAR YOU!!!” when someone describes a thought experiment that suggests that the overwhelmingly most likely way the event could occur is for your model to be incorrect. But Eliezer perplexingly suggests ignoring the results of these thought experiments unless they actually occur in real life, and doesn’t give a reason for this other than “computational limitations”, but, uh, if you’ve thought of a thought experiment and reasoned through its implications, then your computational limitations apparently aren’t strict enough to prevent you from doing that. Eliezer suggests that the fact that probabilities must sum to 1 might force you to assign near-infinitesimal probabilities to certain easy-to-state propositions, but this is clearly false. Complexity priors sum to 1. Those aren’t computable, but as long as we’re talking about computational limitations, by Eliezer’s own estimate, there are far fewer than 10^10^26 mutually disjoint hypotheses a human is physically capable of even considering, so the fact that probabilities sum to 1 cannot force you to assign a probability less than 1 in 10^10^26 to any of them (and you probably shouldn’t; I suggest a “strong Cromwell’s rule” that empirical hypotheses shouldn’t be given probabilities less than 10^-10^26 or so). And for the sorts of hypotheses that are easy enough to describe that we actually do so in thought experiments, we’re not going to get upper bounds anywhere near that tiny.
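The calibration bound mentioned above follows from conservation of expected evidence: P(H) ≥ P(E) · P(H|E), so P(E) ≤ P(H) / P(H|E). A numerical sketch with a stand-in prior, since 1/3↑↑↑3 itself is not representable as a float:

```python
# A sketch of the calibration bound; 1e-300 is an assumed stand-in for 1/3^^^3.
prior = 1e-300                    # P(H): claimed current probability of the mugger's story
posterior_after_evidence = 1e-15  # P(H | sky splits open, etc.)

# P(H) >= P(E) * P(H | E)  implies  P(E) <= P(H) / P(H | E)
max_prob_of_ever_seeing_such_evidence = prior / posterior_after_evidence
print(max_prob_of_ever_seeing_such_evidence)  # ≈ 1e-285, i.e. prior * 10^15
```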

And if you do assign a probability of 1/3↑↑↑3 to some proposition, what is the empirical content of this claim? One possible answer is that this means that the odds at which you would be indifferent to betting on the proposition are 1 : 3↑↑↑3, if the bet is settled with some currency that your utility function is close to linear with respect to across such scales. But the existence of such a currency is under dispute, and the empirical content of the claim that such a currency exists is that you would make certain bets with it involving arbitrarily extreme odds, so this is a very circular way to empirically ground the claim that you assign a probability of 1/3↑↑↑3 to some proposition. So a good empirical grounding for this claim is going to have to be in terms of preferences between more familiar outcomes. And in terms of payoffs at familiar scales, I don’t see anything else that the claim that you assign a probability of 1/3↑↑↑3 to a proposition could mean other than that you expect to continue to act as if the probability of the proposition is 0, even conditional on any observations that don’t give you a likelihood ratio on the order of 1/3↑↑↑3. If you claim that you would superupdate long before then, it’s not clear to me what you could mean when you say that your current probability for the proposition is 1/3↑↑↑3.

There’s another way to see that bounded utility functions, not leverage priors, are Eliezer’s (and also pretty much everyone’s) true rejection of paying Pascal’s mugger, and that is the following quote from Pascal’s Muggle: “I still feel a bit nervous about the idea that Pascal’s Muggee, after the sky splits open, is handing over five dollars while claiming to assign probability on the order of 10^9/3↑↑↑3 that it’s doing any good.” This is an admission that Eliezer’s utility function is bounded (even though Eliezer does not admit that he is admitting this) because the rational agents whose utility functions are bounded are exactly (and tautologically) those for which there exists a probability p>0 such that the agent would not spend [fixed amount of utility] for probability p of doing any good, no matter what the good is. An agent satisfying the Linear Utility Hypothesis would spend $5 for a 10^9/3↑↑↑3 chance of saving 3↑↑↑3 lives. Admitting that it would do the wrong thing if it was in that situation, but claiming that that’s okay because you have an elaborate argument that the agent can’t be in that situation even though it can be in situations in which the probability is lower and can also be in situations in which the probability is higher, strikes me as an exceptionally flimsy argument that the Linear Utility Hypothesis is compatible with human values.
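A sketch of that characterization (my own illustration, with made-up numbers): once utility is bounded, there is a strictly positive probability below which no conceivable prize is worth a fixed cost.

```python
# A sketch assuming a bounded utility function; all values are hypothetical.
u_max = 100.0    # the bound on utility
u_now = 50.0     # utility of the status quo
cost = 1.0       # the fixed utility cost (e.g. of handing over $5)

p_threshold = cost / (u_max - u_now)   # below this, nothing can justify the cost

def worth_paying(p, u_prize):
    # Pay the cost for a probability p of swinging the outcome to u_prize?
    return p * (u_prize - u_now) > cost

print(worth_paying(p_threshold / 2, u_max))  # False, even for the best prize the bound allows
```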

I also promised a reason that the leverage penalty argument is not a correct reason for rational agents (regardless of computational constraints) satisfying the Linear Utility Hypothesis to not pay Pascal’s mugger. This is that in weird situations like this, you should be using updateless decision theory, figuring out which policy has the best a priori expected utility and implementing that policy, instead of trying to make sense of weird anthropic arguments before updatefully coming up with a strategy. Now consider the following hypothesis: “There are 3↑↑↑3 copies of you, and a Matrix Lord will approach one of them while disguised as an ordinary human, inform that copy about his powers and intentions without offering any solid evidence to support his claims, and then kill the rest of the copies iff this copy declines to pay him $5. None of the other copies will experience or hallucinate anything like this.” Of course, this hypothesis is extremely unlikely, but there is no assumption that some randomly selected copy coincidentally happens to be the one that the Matrix Lord approaches, and thus no way for a leverage penalty to force the probability of the hypothesis below 1/3↑↑↑3. This hypothesis and the Linear Utility Hypothesis suggest that having a policy of paying Pascal’s mugger would have consequences 3↑↑↑3 times as important as not dying, which is worth well over $5 in expectation, since the probability of the hypothesis couldn’t be as low as 1/3↑↑↑3. The fact that actually being approached by Pascal’s mugger can be seen as overwhelming evidence against this hypothesis does nothing to change that.
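To put rough numbers on that updateless comparison (all values below are stand-ins I chose; 3↑↑↑3 and a prior floor like 10^-10^26 are far outside floating-point range):

```python
# A toy sketch of the policy comparison under the Linear Utility Hypothesis.
p_hypothesis = 1e-300    # assumed lower bound on P(the 3^^^3-copies hypothesis)
lives_at_stake = 1e308   # stand-in for 3^^^3 (wildly understated)
u_per_life = 1.0         # linear utility: value adds across lives
u_of_keeping_5_dollars = 1e-9

eu_policy_pay = p_hypothesis * lives_at_stake * u_per_life - u_of_keeping_5_dollars
eu_policy_refuse = 0.0

print(eu_policy_pay > eu_policy_refuse)  # True: a priori, the "pay" policy dominates
```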

Edit: I have written a follow-up to this.