# Extreme risks: when not to use expected utility

Would you prefer a 50% chance of gaining €10, one chance in a million of gaining €5 million, or a guaranteed €5? The standard position on Less Wrong is that the answer depends solely on the difference between cash and utility. If your utility scales less-than-linearly with money, you are risk-averse and should choose the last option; if it scales more-than-linearly, you are risk-loving and should choose the second one. If we replaced €’s with utils in the example above, then it would simply be irrational to prefer one option over the others.

There are mathematical proofs of that result, but there are also strong intuitive arguments for it. What’s the best way of seeing this? Imagine that X1 and X2 are two probability distributions, with means u1 and u2 and variances v1 and v2. If the two distributions are independent, then the sum X1 + X2 has mean u1 + u2 and variance v1 + v2.

Now if we multiply the returns of any distribution by a constant r, the mean scales by r and the variance scales by r^2. Consequently, if we have n probability distributions X1, X2, …, Xn representing n equally expensive investments, the expected average return is (u1 + … + un)/n, while the variance of this average is (v1 + … + vn)/n^2. If the vi are bounded, then once we make n large enough, that variance must tend to zero. So if you have many investments, your averaged actual returns will be, with high probability, very close to your expected returns.
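The shrinking variance is easy to check numerically. Here is a minimal sketch (the bet paying 10 with probability 0.5, and the sample sizes, are invented for illustration):

```python
import random
import statistics

def average_returns(n, trials=1000):
    """Sample the average return of n independent bets,
    each paying 10 with probability 0.5 and 0 otherwise."""
    return [sum(10 if random.random() < 0.5 else 0 for _ in range(n)) / n
            for _ in range(trials)]

random.seed(0)
for n in (1, 100, 2500):
    spread = statistics.stdev(average_returns(n))
    print(n, round(spread, 3))  # the spread shrinks roughly as 5/sqrt(n)
```

A single bet has a standard deviation of 5; averaged over 2500 bets, the spread drops to about 0.1, which is why a large portfolio behaves almost like a sure thing.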

Thus there is no better strategy than to always follow expected utility. There is no such thing as sensible risk-aversion under these conditions, as there is no actual risk: you expect your returns to be your expected returns. Even if you yourself do not have enough investment opportunities to smooth out the uncertainty in this way, you could always aggregate your own money with others, through insurance or index funds, and achieve the same result. Buying a triple-rollover lottery ticket may be unwise; but being part of a consortium that buys up every ticket for a triple-rollover lottery is just a dull, safe investment. If you have altruistic preferences, you can even aggregate results across the planet simply by encouraging more people to follow expected returns. So, case closed it seems; departing from expected returns is irrational.

But the devil’s detail is the condition ‘once we make n large enough’. Because there are risk distributions so skewed that no-one will ever be confronted with enough of them to reduce the variance to manageable levels. Extreme risks to humanity are an example; killer asteroids, rogue stars going supernova, unfriendly AI, nuclear war: even totalling all these risks together, throwing in a few more exotic ones, and generously adding every single other decision of our existence, we are nowhere near a neat probability distribution tightly bunched around its mean.

To consider an analogous situation, imagine having to choose between a project that gave one util to each person on the planet, and one that handed slightly over twelve billion utils to a randomly chosen human and took away one util from everyone else. If there were trillions of such projects, then it wouldn’t matter what option you chose. But if you only had one shot, it would be peculiar to argue that there are no rational grounds to prefer one over the other, simply because the trillion-iterated versions are identical. In the same way, our decision when faced with a single planet-destroying event should not be constrained by the behaviour of a hypothetical being who confronts such events trillions of times over.

So where does this leave us? The independence axiom of the von Neumann-Morgenstern utility formalism should be ditched, as it implies that large-variance distributions are identical to sums of low-variance distributions. This axiom should be replaced by a weaker version which reproduces expected utility in the limiting case of many distributions. Since there is no single rational path available, we need to fill the gap with other axioms – values – that reflect our genuine tolerance towards extreme risk. As when we first discovered probability distributions in childhood, we may need to pay attention to medians, modes, variances, skewness, kurtosis or the overall shapes of the distributions. Pascal’s mugger and his whole family can be confronted head-on rather than hoping the probabilities neatly cancel out.

In these extreme cases, exclusively following the expected value is an arbitrary decision rather than a logical necessity.

• “To consider an analogous situation, imagine having to choose between a project that gave one util to each person on the planet, and one that handed slightly over twelve billion utils to a randomly chosen human and took away one util from everyone else. If there were trillions of such projects, then it wouldn’t matter what option you chose. But if you only had one shot, it would be peculiar to argue that there are no rational grounds to prefer one over the other, simply because the trillion-iterated versions are identical.”

That’s not the way expected utility works. Utility is simply a way of assigning numbers to our preferences; states with bigger numbers are better than states with smaller numbers by definition. If outcome A has six billion plus a few utilons, and outcome B has six billion plus a few utilons, then, under whichever utility function we’re using, we are indifferent between A and B by definition. If we are not indifferent between A and B, then we must be using a different utility function.

To take one example, suppose we were faced with the choice between A, giving one dollar’s worth of goods to every person in the world, or B, taking one dollar’s worth of goods from every person in the world and handing thirteen billion dollars’ worth of goods to one randomly chosen person. The amount of goods in the world is the same in both cases. However, if I prefer A to B, then U(A) must be larger than U(B), as this is just a different way of saying the exact same thing.

Now, if each person has a different utility function, and we must find a way to aggregate them, that is indeed an interesting problem. However, in that case, one must be careful to refer to the utility functions of persons A, B, C, etc., rather than just saying “utility”, as this is an exceedingly easy way to get confused.

• To take one example, suppose we were faced with the choice between A, giving one dollar’s worth of goods to every person in the world, or B, taking one dollar’s worth of goods from every person in the world and handing thirteen billion dollars’ worth of goods to one randomly chosen person. The amount of goods in the world is the same in both cases. However, if I prefer A to B, then U(A) must be larger than U(B), as this is just a different way of saying the exact same thing.

Precisely. However, I noted that if you had to make the same decision a trillion trillion times, the utility of both options is essentially the same. So it means that your utility does not simply sum in the naive way once you allow distribution or variance issues into the equation.

• You are right that utility does not sum linearly, but there are much less confusing ways of demonstrating this. E.g., the law of decreasing marginal utility: one million dollars is not a million times as useful as one dollar, if you are an average middle-class American, because you start to run out of high-utility-to-cost-ratio things to buy.

• Standard utility does sum linearly. If I offer you two chances at one util, it’s implicit that the second util may have a higher dollar value if you got the first.

This argument shows that utilities that care about fairness or about variance do not sum linearly.

• If you hold lottery A once, and it has utility B, that does not imply that if you hold lottery A X times, it must have a total utility of X times B. In most cases, if you want to perform X lotteries such that every lottery has the same utility, you will have to perform X different lotteries, because each lottery changes the initial conditions for the subsequent lottery. E.g., if I randomly give some person a million dollars’ worth of stuff, this probably has some utility Q. However, if I hold the lottery a second time, it no longer has utility Q; it now has utility Q − epsilon, because there’s slightly more stuff in the world, so adding a fixed amount of stuff matters less. If I want another lottery with utility Q, I must give away slightly more stuff the second time, and even more stuff the third time, and so on and so forth.

• This sounds like equivocation; yes, the amount of money or stuff needed to be equally desirable may change over time, but that’s precisely why we try to talk of utils. If there are X lotteries delivering Y utils, why is the total value not X*Y?

• If you define your utility function such that each lottery has identical utility, then sure. However, your utility function also includes preferences based on fairness. If you think that a one-billionth chance of doing lottery A a billion times is better than doing lottery A once on grounds of fairness, then your utility function must assign a different utility to lottery #658,168,192 than lottery #1. You cannot simultaneously say that the two are equivalent in terms of utility and that one is preferable to the other on grounds of X; that is like trying to make A = 3 and A = 4 at the same time.

• Can you translate your complaint into a problem with the independence axiom in particular?

Your second example is not a problem of variance in final utility, but aggregation of utility. Utility theory doesn’t force “Giving 1 util to N people” to be equivalent to “Giving N utils to 1 person”. That is, it doesn’t force your utility U to be equal to U1 + U2 + … + UN where Ui is the “utility for person i”.

• To be concrete, suppose you want to maximise the average utility people have, but you also care about fairness, so, all things equal, you prefer the utility to be clustered about its average. Then maybe your real utility function is not

U = (U[1] + … + U[n])/n

but

U’ = U − ((U[1]-U)^2 + … + (U[n]-U)^2)/n

which is in some sense a mean minus a variance.
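To see what such a fairness-adjusted utility does, here is a minimal sketch (the population size of 1000 and the payoffs, which echo the earlier one-util-per-person example, are invented for illustration):

```python
def fairness_adjusted_utility(utils):
    """Mean utility across people minus the variance across people
    (a sketch of the U' above, with the variance subtracted)."""
    n = len(utils)
    mean = sum(utils) / n
    variance = sum((u - mean) ** 2 for u in utils) / n
    return mean - variance

n = 1000
equal = [1.0] * n                           # one util for everyone
lottery = [-1.0] * (n - 1) + [2.0 * n - 1]  # 2n-1 utils to one person, -1 to the rest

# Both projects have the same mean utility per person (exactly 1.0),
# but the lottery's huge variance makes it far worse under U'.
print(fairness_adjusted_utility(equal))    # 1.0
print(fairness_adjusted_utility(lottery))  # large and negative
```

The two projects are identical to a pure averager, yet the fairness-adjusted function sharply separates them.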

• Precisely the model I often have in mind (except I use the standard deviation, not the variance, as it is in the same units as the mean).

But let us now see the problem with the independence axiom. Replace expected utility with phi = “expected utility minus half the standard deviation”.

Then if A and B are two independent probability distributions, phi(A+B) >= phi(A) + phi(B) by Jensen’s inequality, as the square root is a concave function. Equality happens only if the variance of A or B is zero.

Now imagine that B and C are identical distributions with non-zero variances, and that A has no variance, with phi(A) = phi(B) = phi(C). Then phi(A+B) = phi(A) + phi(B) = phi(B) + phi(C) < phi(B+C), violating independence.

(If we use variance rather than standard deviation, we get phi(2B) < 2phi(B), giving similar results.)
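This violation can be checked directly with small finite lotteries. A sketch (the particular payoffs 0.5, 0 and 2 are invented so that phi(A) = phi(B)):

```python
import itertools
import math

def phi(lottery):
    """lottery: list of (probability, payoff) pairs.
    phi = expected payoff minus half the standard deviation."""
    mean = sum(p * x for p, x in lottery)
    variance = sum(p * (x - mean) ** 2 for p, x in lottery)
    return mean - 0.5 * math.sqrt(variance)

def sum_of(X, Y):
    """Distribution of X + Y for independent lotteries X and Y."""
    totals = {}
    for (p, x), (q, y) in itertools.product(X, Y):
        totals[x + y] = totals.get(x + y, 0.0) + p * q
    return [(p, x) for x, p in totals.items()]

A = [(1.0, 0.5)]              # sure 0.5: no variance, phi(A) = 0.5
B = [(0.5, 0.0), (0.5, 2.0)]  # fair coin: mean 1, sd 1, phi(B) = 0.5
C = list(B)                   # an identical, independent lottery

print(phi(sum_of(A, B)))  # 1.0, equal to phi(A) + phi(B)
print(phi(sum_of(B, C)))  # about 1.29, strictly greater: independence fails
```

Although A, B and C all have the same phi on their own, pairing them up breaks the tie, which is exactly the violation described above.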

• A and B are supposed to be distributions on possible outcomes, right? What is A+B supposed to mean here? A distribution with equal mixture of A and B (i.e. 50% chance of A happening and 50% chance of B happening), or A happening followed by B happening? It doesn’t seem to make sense either way.

If it’s supposed to be a 50/50 mixture of A and B, then phi(A+B) could be less than phi(A) + phi(B). If it’s A happening followed by B happening, then independence/expected utility maximization doesn’t apply, because it’s about aggregating utility between possible worlds, not utility of events within a possible world.

• To be technical, A and B are random variables, though you can usefully think of them as generalised lotteries. A+B represents you being entered in both lotteries.

Hum, if this is causing confusion, it is no surprise that my overall post is obscure. I’ll try taking it apart to rewrite it more clearly.

• To be technical, A and B are random variables, though you can usefully think of them as generalised lotteries. A+B represents you being entered in both lotteries.

That has nothing to do with the independence axiom, which is about Wei Dai’s first suggestion of a 50% chance of A and a 50% chance of B (and about unequal mixtures). I think your entire post is based on this confusion.

• I did wonder what Stuart meant when he started talking about adding probability distributions together. In the usual treatment, a single probability distribution represents all possible worlds, yes?

• Yes, the axioms are about preferences over probability distributions over all possible worlds, and are enough to produce a utility function whose expectation produces those preferences.

• I think your entire post is based on this confusion.

That’s how it looks to me as well.

• No, it isn’t. I’ll write another post that makes my position clearer, as it seems I’ve spectacularly failed with this one :-)

• This post could use some polish: it’s not clear what the message is (not that it’s impossible to discern it, but...), and how the paragraphs are related.

Also, “It would be peculiar to argue” is a poor argument.

• Can you give any advice on improving it?

• I don’t like utility theory at all except for making small, fairly immediate choices; it is too much like the old joke about the physicist who says, “Assume a spherical cow...”. If anyone could direct me to something that isn’t vague and handwavey about converting real goals and desires to “utils” I would be interested. Until then, I am getting really tired of it.

• In the same way, it’s hopeless to try to assign probabilities to events and do a Bayesian update on everything. But you can still take advice from theorems like “Conservation of expected evidence” and the like. Formalisations might not be good for specifics, but they’re good for telling you if you’re going wrong in some more general manner.

• I believe von Neumann and Morgenstern showed that you could ask people questions about ordinal preferences (would you prefer x to y) and, from a number of such questions (if they’re consistent), construct cardinal preferences – which would be turning real goals and desires into utils.

• Haven’t various psychological experiments shown that such self-reported preferences are usually inconsistent? (I’ve seen various refs and examples here on LW, although I can’t remember one offhand...)

• Oh, sure. (Eliezer has a post on specific human inconsistencies from the OB days.) But this is a theoretical result, saying we can go from specific choices – ‘revealed preferences’ – to a utility function/set of cardinal preferences which will satisfy those choices, if those choices are somewhat rational. Which is exactly what billswift asked for.

(And I’d note the issue here is not what humans actually use when assessing small probabilities, but what they should do. If we scrap expected utility, it’s not clear what the right thing is; which is what my other comment is about.)

• The 12-billion-utils example is similar to one I mention on this page under “What about Isolated Actions?” I agree that our decision here is ultimately arbitrary and up to us. But I also agree with the comments by others that this choice can be built into the standard expected-utility framework by changing the utilities. That is, unless your complaint is, as Nick suggests, with the independence axiom’s constraint on rational preference orderings in and of itself (for instance, if you agreed – as I don’t – that the popular choices in the Allais paradox should count as “rational”).

• No, I don’t agree that the Allais paradox should count as rational – but I don’t need to use the independence axiom to get to this. I’ll re-explain in a subsequent post.

• For an alternative to expected utility maximization that better describes the decisions actual humans make, see prospect theory by Kahneman and Tversky.

• Ah, but I’m not looking for a merely descriptive theory, but for one that is also rational and logically consistent. And using prospect theory for every small decision in your life will leave you worse off than using expected utility for every small decision.

There’s nothing wrong I can see about using prospect theory for the mega-risk decisions, though – I wouldn’t do so, but there seems to be no logical flaw in the idea.

• “As when we first discovered probability distributions in childhood, we may need to pay attention to medians, modes, variances, skewness, kurtosis or the overall shapes of the distributions. Pascal’s mugger and his whole family can be confronted head-on rather than hoping the probabilities neatly cancel out.”

I would love to see the mugger dispelled, but part of the attraction of standard utility theory is that it seems very clean and optimal; is there any replacement axiom which convincingly deals with the low-probability pathologies? Just going on current human psychology doesn’t seem very good.

• A thought on Pascal’s Mugging:

One source of “the problem” seems to be a disguised version of unbounded payoffs.

Mugger: I can give you any finite amount of utility.

Victim: I find that highly unlikely.

Mugger: How unlikely?

Victim: 1/(really big number)

Mugger: Well, if you give me $1, I’ll give you (really big number)^2 times the utility of one dollar. Then your expected utility is positive, so you should give me the money.

The problem here is that whatever probability you give, the Mugger can always just make a better promise. Trying to assign “I can give you any finite amount of utility” a fixed non-zero probability is equivalent to assigning “I can give you an infinite amount of utility” a fixed non-zero probability. It’s sneaking an infinity in through the back door, so to speak.

It’s also very hard for any decision theory to deal with the problem “Name any rational number, and you get that much utility.” That’s because there is no largest rational number; no matter what number you name, there is another number that it is better to name. We can even come up with a version that even someone with a bounded utility function can be stumped by: “Name any rational number less than ten, and you get that much utility.” 9.9 is dominated by 9.99, which is dominated by 9.999, and so on. As long as you’re being asked to choose from a set that doesn’t contain its least upper bound, every choice is strictly dominated by some other choice. Even if all the numbers involved are finite, being given an infinite number of options can be enough to give decision theories fits.

• “It’s sneaking an infinity in through the back door, so to speak.”

Yes, this is precisely my own thinking – in order to give any assessment of the probability of the mugger delivering on any deal, you are in effect giving an assessment on an infinite number of deals (from 0 to infinity), and if you assign a non-zero probability to all of them (no matter how low), then you wind up with nonsensical results.

Giving the probability beforehand looks even worse if you ignore the deal aspect and simply ask: what is the probability that anything the mugger says would be true? (Since this includes as a subset any promises to deliver utils.) Since he could make statements about Turing machines or Chaitin’s Omega etc., now you’re into areas of intractable or undecidable questions!

As it happens, 2 or 3 days ago I emailed Bostrom about this. There was a followup paper to Bostrom’s “Pascal’s Mugging”, also published in Analysis, by a Baumann, who likewise rejected the prior probability, but Baumann didn’t have a good argument against it other than to say that any such probability is ‘implausible’. Showing how infinities and undecidability get smuggled into the mugging shores up Baumann’s dismissal.

But once we’ve dismissed the prior probability, we still need to do something once the mugger has made a specific offer. If our probability doesn’t shrink at least as quickly as his offer increases, then we can still be mugged; if it shrinks exactly as quickly or even more quickly, we need to justify our specific shrinkage rate. And that is the perplexity: how fast do we shrink, and why?

(We want the Right theory & justification, not just one that is modeled after fallible humans or that ad hocly makes the mugger go away. That is what I am asking for in the toplevel comment.)

• Interesting thoughts on the mugger. But you still need a theory able to deal with it, not just an understanding of the problems.

For the second part, you can get a good decision theory for “Name any rational number less than ten, and you get that much utility” by giving you a certain fraction of negutility for each digit of your definition; there comes a time when the time wasted adding extra ’9’s dwarfs the gain in utility. See Tolstoy’s story How Much Land Does a Man Need for a traditional literary take on this problem.
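A minimal sketch of this fix, with an invented per-digit cost of 0.001 utils: naming 9.99…9 with d nines yields 10 − 10^(−d), and the penalty makes the net utility peak at a finite d, so the problem regains a best choice.

```python
def net_utility(d, cost_per_digit=0.001):
    """Utility of naming the number with d nines (10 - 10**-d),
    minus a hypothetical penalty for each digit written."""
    return 10 - 10 ** (-d) - cost_per_digit * d

best = max(range(1, 50), key=net_utility)
print(best)  # 3: a fourth '9' would gain 0.0009 utils but cost 0.001
```

Any positive per-digit cost works the same way; it just moves the point where writing more nines stops paying off.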

The “Name any rational number, and you get that much utility” problem is more tricky, and would be a version of the “it is rational to spend infinity in hell” problem. Basically, if your action (staying in hell, or specifying your utility) gives you more ultimate utility than you lose by doing so, you will spend eternity doing your utility-losing action, and never cash in on your gained utility.

• “I can give you any finite amount of utility.”

All I want for Christmas is an arbitrarily large chunk of utility.

Do you maybe see a problem with this concept?

• Replace expected utility by expected utility minus some multiple of the standard deviation, making that “some multiple” go to zero for oft-repeated situations.

The mugger won’t be able to stand against that, as the standard deviation of his setup is huge.

• Then you would turn down free money. Suppose you try to maximize EU − k*SD.

I’ll pick p < 1/2 * min(1, k^2), and offer you a bet in which you can receive 1 util with probability p, or 0 utils with probability (1-p). This bet has mean payout p and standard deviation sqrt[p(1-p)]. You have nothing to lose, but you would turn down this bet.

Proof:

p < 1/2, so (1-p) > 1/2, so p < k^2/2 < k^2(1-p)

Divide both sides by (1-p): p / (1-p) < k^2

Take the square root of both sides: sqrt[p / (1-p)] < k

Multiply both sides by sqrt[p(1-p)]: p < k*sqrt[p(1-p)]

Which is equivalent to: EU < k * SD

So EU − k*SD < 0
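A quick numerical check of this argument (the particular k values, and the choice p = 0.4 · min(1, k^2), safely below the 1/2 · min(1, k^2) threshold, are arbitrary):

```python
import math

def turns_down_free_money(k):
    """Check EU - k*SD < 0 for a bet paying 1 util with probability p,
    where p = 0.4 * min(1, k**2) lies below the 1/2 * min(1, k**2) threshold."""
    p = 0.4 * min(1.0, k ** 2)
    expected_utility = p                          # mean payout of the bet
    standard_deviation = math.sqrt(p * (1 - p))   # sd of a Bernoulli payout
    return expected_utility - k * standard_deviation < 0

print(all(turns_down_free_money(k) for k in (0.01, 0.1, 0.5, 1.0, 10.0)))  # True
```

However small or large k is, some strictly advantageous bet gets rejected, which is the point of the comment above.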

• If k is tiny, this is only a minute chance of free money. I agree that it seems absurd to turn down that deal, but if the only cost of solving Pascal’s mugger is that we avoid advantageous lotteries with such minute payoffs, it seems a cost worth paying.

But recall – k is not a constant; it is a function of how often the “situation” is repeated. In this context, “repeated situation” means another lottery with larger standard deviation. I’d guess I’ve faced over a million implicit lotteries with SD higher than k = 0.1 in my life so far.

We can even get more subtle about the counting. For any SD we have faced that is n times greater than the SD of this lottery, we add n to 1/k.

In that setup, it may be impossible for you to actually propose that free-money deal to me (I’ll have to check the maths – it certainly is impossible if we add n^3 to 1/k). Basically, the problem is that k depends on the SD, and the SD depends on k. As you diminish the SD to catch up with k, you further decrease k, and hence p, and hence the SD, and hence k, etc...

Interesting example, though; and I’ll try to actually formalise an example of a sensible “SD-adjusted EU” so we can have proper debates about it.

• That seems pretty arbitrary. You can make the mugging go away by simply penalizing his promise of n utils with a probability of 1/n (or less); but just making him go away is not a justification for such a procedure – what if you live in a universe where an eccentric god will give you that many utilons if you win his cosmic lottery?