VNM expected utility theory: uses, abuses, and interpretation

When interpreted conservatively, the von Neumann-Morgenstern rationality axioms and utility theorem are an indispensable tool for the normative study of rationality, deserving of many thought experiments and attentive decision theory. It’s one more reason I’m glad to be born after the 1940s. Yet there is apprehension about the theory’s validity, aside from merely confusing it with Bentham utilitarianism (as highlighted by Matt Simpson). I want to describe not only what VNM utility is really meant for, but a contextual reinterpretation of its meaning, so that it may hopefully be used more frequently, confidently, and appropriately.

  1. Pre­limi­nary dis­cus­sion and precautions

  2. Shar­ing de­ci­sion util­ity is shar­ing power, not welfare

  3. Con­tex­tual Strength (CS) of prefer­ences, and VNM-prefer­ence as “strong” prefer­ence

  4. Haus­ner (lex­i­co­graphic) de­ci­sion util­ity

  5. The independence axiom isn’t so bad

  6. Ap­pli­ca­tion to ear­lier LessWrong dis­cus­sions of util­ity

1. Pre­limi­nary dis­cus­sion and precautions

The idea of John von Neumann and Oskar Morgenstern is that, if you behave a certain way, then it turns out you’re maximizing the expected value of a particular function. Very cool! And their description of “a certain way” is very compelling: a list of four reasonable-seeming axioms. If you haven’t already, check out the Von Neumann-Morgenstern utility theorem, a mathematical result which makes their claim rigorous, and true.

VNM util­ity is a de­ci­sion util­ity, in that it aims to char­ac­ter­ize the de­ci­sion-mak­ing of a ra­tio­nal agent. One great fea­ture is that it im­plic­itly ac­counts for risk aver­sion: not risk­ing $100 for a 10% chance to win $1000 and 90% chance to win $0 just means that for you, util­ity($100) > 10%util­ity($1000) + 90%util­ity($0).
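To make the risk-aversion point concrete, here is a minimal sketch. The square-root utility over dollars is purely an illustrative assumption (any concave function would do), and the names `expected_utility` and `u` are hypothetical helpers, not part of the theory.

```python
import math

def expected_utility(lottery, utility):
    """Expected utility of a lottery given as (probability, outcome) pairs."""
    return sum(p * utility(x) for p, x in lottery)

def u(dollars):
    # Assumed concave utility over money; chosen only for illustration.
    return math.sqrt(dollars)

sure_thing = [(1.0, 100)]               # keep the $100
gamble = [(0.10, 1000), (0.90, 0)]      # 10% chance of $1000, 90% chance of $0

eu_sure = expected_utility(sure_thing, u)
eu_gamble = expected_utility(gamble, u)
expected_dollars = sum(p * x for p, x in gamble)

print(eu_sure, eu_gamble, expected_dollars)
```

Under this assumed utility, the gamble has the same expected dollar value as the sure $100 but a lower expected utility, so declining it is exactly what maximizing expected utility recommends.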

But as the Wikipe­dia ar­ti­cle ex­plains nicely, VNM util­ity is:

  1. not de­signed to pre­dict the be­hav­ior of “ir­ra­tional” in­di­vi­d­u­als (like real peo­ple in a real econ­omy);

  2. not de­signed to char­ac­ter­ize well-be­ing, but to char­ac­ter­ize de­ci­sions;

  3. not de­signed to mea­sure the value of items, but the value of out­comes;

  4. only defined up to a scalar mul­ti­ple and ad­di­tive con­stant (act­ing with util­ity func­tion U(X) is the same as act­ing with a·U(X)+b, if a>0);

  5. not de­signed to be added up or com­pared be­tween a num­ber of in­di­vi­d­u­als;

  6. not some­thing that can be “sac­ri­ficed” in fa­vor of oth­ers in a mean­ingful way.

[ETA] Ad­di­tion­ally, in the VNM the­o­rem the prob­a­bil­ities are un­der­stood to be known to the agent as they are pre­sented, and to come from a source of ran­dom­ness whose out­comes are not sig­nifi­cant to the agent. Without these as­sump­tions, its proof doesn’t work.

Be­cause of (4), one of­ten con­sid­ers marginal util­ities of the form U(X)-U(Y), to can­cel the am­bi­guity in the ad­di­tive con­stant b. This is to­tally le­gi­t­i­mate, and faith­ful to the math­e­mat­i­cal con­cep­tion of VNM util­ity.
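Point (4) can be checked numerically. The sketch below uses made-up outcomes and utility values (all hypothetical) to show that rescaling U to a·U+b with a>0 changes neither the preferred lottery nor the ratio of utility differences.

```python
def expected_utility(lottery, utility):
    """Expected utility of a lottery given as (probability, outcome) pairs."""
    return sum(p * utility[x] for p, x in lottery)

def best(lotteries, utility):
    """Name of the lottery with the highest expected utility."""
    return max(lotteries, key=lambda name: expected_utility(lotteries[name], utility))

# Hypothetical utilities over three outcomes.
U = {"X": 0.0, "Y": 3.0, "Z": 10.0}

# A positive affine transformation of U (a > 0).
a, b = 2.5, -7.0
V = {outcome: a * value + b for outcome, value in U.items()}

lotteries = {
    "safe": [(1.0, "Y")],
    "risky": [(0.25, "Z"), (0.75, "X")],
}

# The additive constant b cancels in differences, and a cancels in their ratio.
ratio_U = (U["Z"] - U["Y"]) / (U["Y"] - U["X"])
ratio_V = (V["Z"] - V["Y"]) / (V["Y"] - V["X"])

print(best(lotteries, U), best(lotteries, V), ratio_U, ratio_V)
```

The individual numbers in `V` look nothing like those in `U`, yet both describe the same agent: the same lottery wins, and the ratio of differences agrees.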

Because of (5), people often “normalize” VNM utility to eliminate ambiguity in both constants, so that utilities are unique numbers that can be added across multiple agents. One way is to declare that every person in some situation values $1 at 1 utilon (a fictional unit of measure of utility), and $0 at 0. I think a more meaningful and applicable normalization is to fix mean and variance with respect to certain outcomes (next section).

Because of (6), characterizing the altruism of a VNM-rational agent by how he sacrifices his own VNM utility is the wrong approach. Indeed, such a sacrifice is a contradiction. Kahneman suggests¹, and I agree, that something else should be added or subtracted to determine the total, comparative, or average well-being of individuals. I’d call it “welfare”, to avoid confusing it with VNM utility. Kahneman calls it E-utility, for “experienced utility”, a connotation I’ll avoid. Intuitively, this is certainly something you could sacrifice for others, or have more of compared to others. True, a given person’s VNM utility is likely highly correlated with her personal “welfare”, but I wouldn’t consider it an accurate approximation.

So if not col­lec­tive welfare, then what could cross-agent com­par­i­sons or sums of VNM util­ities in­di­cate? Well, they’re meant to char­ac­ter­ize de­ci­sions, so one mean­ingful ap­pli­ca­tion is to col­lec­tive de­ci­sion-mak­ing:

2. Shar­ing de­ci­sion util­ity is shar­ing power, not welfare

Sup­pose de­ci­sions are to be made by or on be­half of a group. The de­ci­sion could equally be about the welfare of group mem­bers, or some­thing else. E.g.,

  • How much va­ca­tion each mem­ber gets, or

  • Which char­ity the group should in­vest its funds in.

Say each mem­ber ex­presses a VNM util­ity value—a de­ci­sion util­ity—for each out­come, and the de­ci­sion is made to max­i­mize the to­tal. Over time, man­dat­ing or ad­just­ing each mem­ber’s ex­pressed VNM util­ities to have a given mean and var­i­ance could en­sure that no one per­son dom­i­nates all the de­ci­sions by shout­ing gi­ant num­bers all the time. In­ci­den­tally, this is a way of nor­mal­iz­ing their util­ities: it will elimi­nate am­bi­guity in the con­stants ″a″ and ″b″ in (4) of sec­tion 1, which is ex­actly what we need for cross-agent com­par­i­sons and sums to make sense.

Without thought as to whether this is a good sys­tem, the two de­ci­sion ex­am­ples illus­trate how al­lot­ment of nor­mal­ized VNM util­ity sig­nifies shar­ing power in a col­lec­tive de­ci­sion, rather than shar­ing well-be­ing. As such, the lat­ter is bet­ter de­scribed by other met­rics, in my opinion and in Kah­ne­man’s.
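As a rough sketch of the mean-and-variance normalization, the code below rescales each member's reported utilities to mean 0 and variance 1 before summing. It simplifies by normalizing over the outcomes of a single decision rather than over time, and the members, outcomes, and numbers are all invented for illustration.

```python
from statistics import mean, pstdev

def normalize(utilities):
    """Rescale one member's utilities to mean 0 and variance 1 over the outcomes."""
    values = list(utilities.values())
    m, s = mean(values), pstdev(values)
    if s == 0:
        raise ValueError("member is indifferent among all outcomes; cannot normalize")
    return {outcome: (v - m) / s for outcome, v in utilities.items()}

def group_choice(reports):
    """Outcome with the largest total reported utility across members."""
    outcomes = next(iter(reports.values())).keys()
    totals = {o: sum(r[o] for r in reports.values()) for o in outcomes}
    return max(totals, key=totals.get)

# Hypothetical reports: Bob "shouts" numbers a thousand times larger.
reports = {
    "Alice": {"A": 0, "B": 1, "C": 2},
    "Carol": {"A": 0, "B": 2, "C": 1},
    "Bob": {"A": 3000, "B": 1000, "C": 2000},
}

raw_winner = group_choice(reports)
normalized_winner = group_choice({name: normalize(r) for name, r in reports.items()})

print(raw_winner, normalized_winner)
```

With raw numbers Bob's favourite wins outright; after normalization each member carries the same weight, which is the sense in which normalized decision utility is a share of power.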

3. Contextual Strength (CS) of preferences, and VNM-preference as “strong” preference

As a normative theory, I think VNM utility’s biggest shortcoming is in its Archimedean (or “Continuity”) axiom, which, as we’ll see, actually isn’t very limiting. In its harshest interpretation, it says that if you won’t sacrifice a small chance at X in order to get Y over Z, then you’re not allowed to prefer Y over Z. For example, if you prefer green socks over red socks, then you must be willing to sacrifice some small, real probability of fulfilling immortality to favor that outcome. I wouldn’t say this is necessary to be considered rational. Eliezer has noted implicitly in this post (excerpt below) that he also has a problem with the Archimedean requirement.

I think this can be fixed directly with reinterpretation. For a given context C of possible outcomes, let’s intuitively define a “strong preference” in that context to be one which is comparable in some non-zero ratio to the strongest preferences in the context. For example, other things being equal, you might consistently prefer green socks to red socks, but this may be completely undetectable on a scale that includes immortal happiness, making it not a “strong preference” in that context. You might think of the socks as “infinitely less significant”, but infinity is confusing. Perhaps less daunting is to think of them as a “strictly secondary concern” (see next section).

I sug­gest that the four VNM ax­ioms can work more broadly as ax­ioms for strong prefer­ence in a given con­text. That is, we con­sider VNM-prefer­ence and VNM-utility

  1. to be defined only for a given con­text C of vary­ing pos­si­ble out­comes, and

  2. to in­tu­itively only in­di­cate those prefer­ences finitely-com­pa­rable to the strongest ones in the given con­text.

Then VNM-in­differ­ence, which they de­note by equal­ity, would sim­ply mean a lack of strong prefer­ence in the given con­text, i.e. not car­ing enough to sac­ri­fice like­li­hoods of im­por­tant things. This is a Con­tex­tual Strength (CS) in­ter­pre­ta­tion of VNM util­ity the­ory: in big­ger con­texts, VNM-prefer­ence in­di­cates stronger prefer­ences and weaker in­differ­ences.

(CS) Hence­forth, I ex­plic­itly dis­t­in­guish the terms VNM-prefer­ence and VNM-in­differ­ence as those ax­io­m­a­tized by VNM, in­ter­preted as above.

4. Haus­ner (lex­i­co­graphic) de­ci­sion utility

[ETA] To see the broad ap­pli­ca­bil­ity of VNM util­ity, let’s ex­am­ine the flex­i­bil­ity of a the­ory with­out the Archimedean ax­iom, and see that they differ only mildly in re­sult:

In the socks vs. immortality example, we could suppose that context “Big” includes such possible outcomes as immortal happiness, human extinction, getting socks, and ice-cream, and context “Small” includes only getting socks and ice-cream. You could have two VNM-like utility functions: U_Small for evaluating gambles in the Small context, and U_Big for the Big context. You could act to maximize EU_Big whenever possible (EU = expected utility), and when two gambles have the same EU_Big, you could default to choosing between them by their EU_Small values. This is essentially acting to maximize the pair (EU_Big, EU_Small), ordered lexicographically, meaning that a difference in the former value EU_Big trumps a difference in the latter value. We thus have a sensible numerical way to treat EU_Big as “infinitely more valuable” without really involving infinities in the calculations; there is no need for that interpretation if you don’t like it, though.
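The lexicographic rule is easy to sketch, since Python compares tuples lexicographically. The utility numbers below are invented for illustration, and assigning 0 in the Big context to socks and ice cream (and 0 in the Small context to Big-only outcomes) is an assumption of this sketch.

```python
def expected_utility(lottery, utility):
    """Expected utility; outcomes missing from the table are assumed to score 0."""
    return sum(p * utility.get(x, 0.0) for p, x in lottery)

# Hypothetical utilities for the two contexts.
U_big = {"immortal happiness": 1.0, "human extinction": -1.0}
U_small = {"green socks": 1.0, "red socks": 0.0, "ice cream": 2.0}

def hausner_key(lottery):
    """Pair (EU_Big, EU_Small); tuples compare by the first entry, then the second."""
    return (expected_utility(lottery, U_big), expected_utility(lottery, U_small))

gambles = {
    "green": [(1.0, "green socks")],
    "red": [(1.0, "red socks")],
    # Better by the Small measure, but with a tiny risk of extinction.
    "ice cream, small risk": [(0.999999, "ice cream"), (0.000001, "human extinction")],
}

choice = max(gambles, key=lambda name: hausner_key(gambles[name]))
print(choice, {name: hausner_key(g) for name, g in gambles.items()})
```

The two sock gambles tie on the first coordinate, so the second coordinate decides between them; the ice-cream gamble loses despite its higher Small value because any difference in the first coordinate dominates. Relying on exact floating-point ties is fragile in general, so a real implementation would need a deliberate notion of "indistinguishable".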

Since we have the VNM ax­ioms to im­ply when some­one is max­i­miz­ing one ex­pec­ta­tion value, you might ask, can we give some nice weaker ax­ioms un­der which some­one is max­i­miz­ing a lex­i­co­graphic tu­ple of ex­pec­ta­tions?

Hearteningly, this has been taken care of, too. By weakening (indeed, effectively eliminating) the Archimedean axiom, Melvin Hausner² developed this theory in 1952 for Rand Corporation, and Peter Fishburn³ provides a nice exposition of Hausner’s axioms. So now we have Hausner-rational agents maximizing Hausner utility.

[ETA] But the difference between Hausner and VNM utility comes into effect only in the rare event when you know you can’t distinguish EU_Big values; otherwise the Hausner-rational behavior is to “keep thinking” to make sure you’re not sacrificing EU_Big. The most plausible scenario I can imagine where this might actually happen to a human is when making a decision on a precisely known time limit, like, say, sniping on one of two simultaneous eBay auctions for socks. CronoDAS might say the time limit creates “noise in your expectations”. If the time runs out and you have failed to distinguish which sock color results in higher chances of immortality or other EU_Big concerns, then I’d say it wouldn’t be irrational to make the choice according to some secondary utility EU_Small that any detectable difference in EU_Big would otherwise trump.

Moreover, it turns out³ that the primary, i.e. most dominant, function in the Hausner utility tuple behaves almost exactly like VNM utility, and has the same uniqueness property (up to the constants ″a″ and ″b″). So except in rare circumstances, you can just think in terms of VNM utility and get the same answer, and even the rare exceptions involve considerations that are necessarily “unimportant” relative to the context.

Thus, a lot of apparent flexibility in Hausner utility theory might simply demonstrate that VNM utility is more applicable to you than it first appeared. This situation favors the (CS) interpretation: even when the Archimedean axiom isn’t quite satisfied, we can use VNM utility liberally as indicating “strong” preferences in a given context.

5. The in­de­pen­dence ax­iom isn’t so bad

“A va­ri­ety of gen­er­al­ized ex­pected util­ity the­o­ries have arisen, most of which drop or re­lax the in­de­pen­dence ax­iom.” (Wikipe­dia) But I think the in­de­pen­dence ax­iom (which Haus­ner also as­sumes) is a non-is­sue if we’re talk­ing about “strong prefer­ences”. The fol­low­ing, in var­i­ous forms, is what seems to be the best ar­gu­ment against it:

Suppose a parent has no VNM preference between S: her son gets a free car, and D: her daughter gets it. In the original VNM formulation, this is written “S=D”. She is also presented with a third option, F=.5S+.5D. Descriptively, a fair coin would be flipped, and her son or daughter gets a car accordingly.

By writing S=.5S+.5S and D=.5D+.5D, the original independence axiom says that S=D implies S=F=D, so she must be VNM-indifferent between F and the others. However, a desire for “fair chances” might result in preferring F, which we might want to allow as “rational”.

[ETA] I think the most natural fix within the VNM theory is to just say S’ and D’ are the events “car is awarded to son/daughter based on a coin toss”, which are slightly better than S and D themselves, and that F is really 0.5S’ + 0.5D’. Unfortunately, such modifications undermine the applicability of the VNM theorem, which implicitly assumes that the source of probabilities itself is insignificant to the outcomes for the agent. Luckily, Bolker⁴ has devised an axiomatic theory whose theorems will apply without such assumptions, at the expense of some uniqueness results. I’ll have another occasion to post on this later.

Anyway, under the (CS) interpretation, the requirement “S=F=D” just means the parent lacks a VNM-preference, i.e. a strong preference, so it’s not too big of a problem. Assuming she’s VNM-rational just means that, in the implicit context, she is unwilling to make certain probabilistic sacrifices to favor F over S and D.

  • If the con­text is Big and in­cludes some­thing like death, the VNM-in­differ­ence “S=D” is a weak claim: it might just in­di­cate an un­will­ing­ness to in­crease risk of things finitely-com­pa­rable to death in or­der to ob­tain F over S or D. She is still al­lowed to pre­fer F if no such sac­ri­fice is in­volved.

  • If the context is Small, and say only includes her kids getting cars, then “S=D” is a strong claim: it indicates an unwillingness to risk her kids not getting cars to favor S over D in a gamble. Then she can still prefer F, but she couldn’t prefer F’=.49S+.49D+.02(no car) over S or D, since it would contradict what “S=D” means in terms of car-sacrifice. I think that’s reasonable, since if she simply flips a mental coin and gives her son the car, she can prefer to favor her daughter in later circumstances.
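The Small-context case can be worked through with numbers. In this sketch the utilities (1 for either child getting the car, 0 for no car) are an assumed normalization, which is harmless given that VNM utility is only defined up to the constants a and b.

```python
def expected_utility(lottery, utility):
    """Expected utility of a lottery given as (probability, outcome) pairs."""
    return sum(p * utility[x] for p, x in lottery)

# Assumed Small-context utilities: S=D means both children's outcomes score the same.
U = {"S": 1.0, "D": 1.0, "no car": 0.0}

S = [(1.0, "S")]
D = [(1.0, "D")]
F = [(0.5, "S"), (0.5, "D")]                            # fair coin flip
F_prime = [(0.49, "S"), (0.49, "D"), (0.02, "no car")]  # coin flip with a 2% car-sacrifice

eu = {name: expected_utility(lottery, U)
      for name, lottery in {"S": S, "D": D, "F": F, "F'": F_prime}.items()}
print(eu)
```

The coin flip F comes out exactly level with S and D, which is the VNM-indifference the independence axiom requires, while F’ comes out strictly lower: any preference for fairness she has must be too weak to pay for with a chance of no car at all.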

You might say VNM tells you to “Be the fair­ness that you want to see in the world.”

6. Application to earlier LessWrong discussions of utility

This con­tex­tual strength in­ter­pre­ta­tion of VNM util­ity is di­rectly rele­vant to re­solv­ing Eliezer’s point linked above:

″… The util­ity func­tion is not up for grabs. I love life with­out limit or up­per bound: There is no finite amount of life lived N where I would pre­fer a 80.0001% prob­a­bil­ity of liv­ing N years to an 0.0001% chance of liv­ing a googol­plex years and an 80% chance of liv­ing for­ever.”

This could just indicate that Eliezer ranks immortality on a scale that trumps finite lifespan preferences, à la Hausner utility theory. In a context of differing positive likelihoods of immortality, these other factors are not strong enough to constitute VNM-preferences.

As well, Stu­art Arm­strong has writ­ten a thought­ful ar­ti­cle “Ex­treme risks: when not to use ex­pected util­ity”, and ar­gues against In­de­pen­dence. I’d like to re­cast his ideas con­text-rel­a­tively, which I think alle­vi­ates the difficulty:

In his paragraph 5, he considers various existential disasters. In my view, this is a case for a “Big” context utility function, not a case against independence. If you were gambling only between existential disasters, then you might have an “existential-context utility function”, U_Existential. For example, would you prefer

  • 90%(ex­tinc­tion by nu­clear war) + 10%(noth­ing), or

  • 60%(ex­tinc­tion by nu­clear war) + 30%(ex­tinc­tion by as­ter­oids) + 10%(noth­ing)?

If you prefer the latter enough to make some comparable sacrifice in the «nothing» term, contextual VNM just says you assign a higher U_Existential to «extinction by asteroids» than to «extinction by nuclear war».⁵ There’s no need to be freaked out by assigning finite numbers here, since for example Hausner would allow the value of U_Existential to completely trump the value of U_Everyday if you started worrying about socks or ice cream. You could be both extremely risk averse regarding existential outcomes, and absolutely unwilling to gamble with them for more trivial gains.
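To spell out why the comparison comes down to those two utilities, here is the arithmetic, writing U for the existential-context utility function and taking the two gambles exactly as listed above:

```latex
\begin{aligned}
EU(\text{second}) - EU(\text{first})
  &= \bigl[\,0.6\,U(\text{nuclear war}) + 0.3\,U(\text{asteroids}) + 0.1\,U(\text{nothing})\,\bigr] \\
  &\quad - \bigl[\,0.9\,U(\text{nuclear war}) + 0.1\,U(\text{nothing})\,\bigr] \\
  &= 0.3\,\bigl[\,U(\text{asteroids}) - U(\text{nuclear war})\,\bigr].
\end{aligned}
```

The «nothing» terms cancel, so the second gamble has the higher expected utility exactly when U(asteroids) > U(nuclear war).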

In his para­graph 6, Stu­art talks about giv­ing out (nec­es­sar­ily nor­mal­ized) VNM util­ity to peo­ple, which I de­scribed in sec­tion 2 as a model for shar­ing power rather than well-be­ing. I think he gives a good ar­gu­ment against blindly max­i­miz­ing the to­tal nor­mal­ized VNM util­ity of a col­lec­tive in a one-shot de­ci­sion:

″...imag­ine hav­ing to choose be­tween a pro­ject that gave one util to each per­son on the planet, and one that handed slightly over twelve billion utils to a ran­domly cho­sen hu­man and took away one util from ev­ery­one else. If there were trillions of such pro­jects, then it wouldn’t mat­ter what op­tion you chose. But if you only had one shot, it would be pe­cu­liar to ar­gue that there are no ra­tio­nal grounds to pre­fer one over the other, sim­ply be­cause the trillion-iter­ated ver­sions are iden­ti­cal.”

(Indeed, practically, the mean and variance normalization I described doesn’t provide the same “fairness” in a one-shot deal.)

I’d call the lat­ter of Stu­art’s pro­jects an un­fair dis­tri­bu­tion of power in a col­lec­tive de­ci­sion pro­cess, some­thing you might per­son­ally as­sign a low VNM util­ity to, and there­fore avoid. Thus I wouldn’t con­sider it an ar­gu­ment not to use ex­pected util­ity, but an ar­gu­ment not to blindly fa­vor to­tal nor­mal­ized VNM util­ity of a pop­u­la­tion in your own de­ci­sion util­ity func­tion. The same ar­gu­ment—Parfit’s Repug­nant Con­clu­sion—is made against to­tal nor­mal­ized welfare.

The ex­pected util­ity model of ra­tio­nal­ity is al­ive and nor­ma­tively kick­ing, and is highly adapt­able to mod­el­ling very weak as­sump­tions of ra­tio­nal­ity. I hope this post can serve to marginally per­suade oth­ers in that di­rec­tion.

Refer­ences, notes, and fur­ther read­ing:

1 Kahneman, Wakker and Sarin, 1997, Back to Bentham? Explorations of experienced utility, The Quarterly Journal of Economics.

2 Haus­ner, 1952, Mul­tidi­men­sional util­ities, Rand Cor­po­ra­tion.

3 Fish­burn, 1971, A Study of Lex­i­co­graphic Ex­pected Utility, Man­age­ment Science.

4 Bolker, 1967, A si­mul­ta­neous ax­iom­a­ti­za­tion of util­ity and prob­a­bil­ity, Philos­o­phy of Science As­so­ci­a­tion.

5 As wedrifid pointed out, you might instead just prefer uncertainty in your impending doom. Just as in section 5, neither VNM nor Hausner can model this usefully (i.e. in a way that allows calculating utilities), though I don’t consider this much of a limitation. In fact, I’d consider it a normative step backward to admit “rational” agents who actually prefer uncertainty in itself.