Normalising utility as willingness to pay

I’ve thought of a framework that puts most of the methods of intertheoretic utility normalisation and bargaining on the same footing. See this first post for a reminder of the different types of utility function normalisation.

Most of the normalisation techniques can be conceived of as a game with two outcomes, in which each player can pay a certain amount of their utility to flip from one outcome to the other. We can then use the maximal amount of utility they are willing to pay as the common measuring stick for normalisation.

Consider, for example, the min-max normalisation: this assigns utility 0 to the expected utility if the agent makes the worst possible decisions, and 1 if they make the best possible ones.

So, if your utility function is $u$, the question is: how much utility would you be willing to pay to prevent your nemesis (a $-u$ maximiser) from controlling the decision process, and to take it over yourself instead? Dividing $u$ by that amount[1] will give you the min-max normalisation (up to the addition of a constant).
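As a minimal sketch of this game, assume the agent faces a finite list of policies and that `utilities[i]` is the expected utility of policy `i`. The function names and numbers are illustrative assumptions, not part of the original argument. The price of taking control from the nemesis is then the gap between the best and the worst entry:

```python
def min_max_price(utilities):
    """Most utility the agent would pay to replace its nemesis
    (who picks the worst policy) with itself (who picks the best)."""
    return max(utilities) - min(utilities)


def min_max_normalise(utilities):
    """Rescale so the worst policy scores 0 and the best scores 1."""
    price = min_max_price(utilities)
    if price == 0:
        raise ValueError("a constant utility function cannot be normalised")
    worst = min(utilities)
    # Subtracting `worst` is the "addition of a constant" step.
    return [(u - worst) / price for u in utilities]


# Hypothetical expected utilities for three policies.
example = [3.0, 7.0, 5.0]
print(min_max_price(example))      # 4.0
print(min_max_normalise(example))  # [0.0, 1.0, 0.5]
```

Dividing by the price alone would fix only the scale; the shift by the worst value is included here just to land on the usual 0-to-1 range.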

Now consider the mean-max normalisation. For this, the game is as follows: how much would you be willing to pay to prevent a policy from choosing randomly amongst the outcomes (“mean”), and let you take over the decision process instead?
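A rough sketch of the mean-max price, assuming a finite list of outcomes and assuming that "choosing randomly" means a uniform draw (the text does not fix the distribution; the names below are hypothetical):

```python
def mean_max_price(utilities):
    """Most utility the agent would pay to replace a uniformly random
    choice of outcome with its own best choice."""
    mean = sum(utilities) / len(utilities)
    return max(utilities) - mean


def mean_max_normalise(utilities):
    """Rescale so the random policy scores 0 and the best outcome scores 1."""
    price = mean_max_price(utilities)
    if price == 0:
        raise ValueError("a constant utility function cannot be normalised")
    mean = sum(utilities) / len(utilities)
    return [(u - mean) / price for u in utilities]


# Hypothetical utilities of three outcomes.
example = [0.0, 3.0, 6.0]
print(mean_max_price(example))      # 3.0
print(mean_max_normalise(example))  # [-1.0, 0.0, 1.0]
```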

Conversely, the min-mean normalisation asks how much you would be willing to pay to prevent your nemesis from controlling the decision process, shifting to a random process instead.
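For the min-mean game, a sketch under the same kind of assumptions (a finite list of outcomes, a uniform random process, illustrative names) might look like this:

```python
def min_mean_price(utilities):
    """Most utility the agent would pay to replace its nemesis
    (who picks the worst outcome) with a uniformly random choice."""
    mean = sum(utilities) / len(utilities)
    return mean - min(utilities)


def min_mean_normalise(utilities):
    """Rescale so the worst outcome scores 0 and the random policy scores 1."""
    price = min_mean_price(utilities)
    if price == 0:
        raise ValueError("a constant utility function cannot be normalised")
    worst = min(utilities)
    return [(u - worst) / price for u in utilities]


# Hypothetical utilities of three outcomes; their mean is 6.0.
example = [2.0, 4.0, 12.0]
print(min_mean_price(example))      # 4.0
print(min_mean_normalise(example))  # [0.0, 0.5, 2.5]
```

Note that the best outcome can score above 1 here, since only the worst outcome and the mean are pinned down.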

The mean difference method is a bit different: here, two outcomes are chosen at random, and you are asked how much you are willing to pay to shift from the worse outcome to the better. The expectation of that amount is used for normalisation.
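One way to compute that expectation is sketched below. It assumes a finite list of outcomes and assumes the two outcomes are drawn independently and uniformly, so the same outcome can be drawn twice; drawing two distinct outcomes would give a different constant. The function names are illustrative.

```python
from itertools import product


def mean_difference_price(utilities):
    """Expected amount the agent would pay to move from the worse to the
    better of two outcomes drawn independently and uniformly at random."""
    pairs = list(product(utilities, repeat=2))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


def mean_difference_normalise(utilities):
    """Divide by the expected price; only the scale is fixed, not the zero."""
    price = mean_difference_price(utilities)
    if price == 0:
        raise ValueError("a constant utility function cannot be normalised")
    return [u / price for u in utilities]


# Hypothetical utilities of three outcomes.
example = [0.0, 9.0, 18.0]
print(mean_difference_price(example))      # 8.0
print(mean_difference_normalise(example))  # [0.0, 1.125, 2.25]
```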

The mutual worth bargaining solution has a similar interpretation: how much would you be willing to pay to move from the default option to one where you controlled all decisions?
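A sketch of that price, assuming a finite list of feasible outcomes and a separately given utility for the default option (the `default` parameter and the function names are hypothetical). It covers only the normalisation step, not how the bargaining solution then picks an outcome:

```python
def mutual_worth_price(utilities, default):
    """Most utility the agent would pay to move from the default option
    to a world where it controls every decision (its best outcome)."""
    return max(utilities) - default


def mutual_worth_normalise(utilities, default):
    """Rescale so the default scores 0 and full control scores 1."""
    price = mutual_worth_price(utilities, default)
    if price <= 0:
        raise ValueError("full control must beat the default to normalise")
    return [(u - default) / price for u in utilities]


# Hypothetical utilities of three outcomes; the default is worth 5.0.
example = [2.0, 5.0, 8.0]
print(mutual_worth_price(example, default=5.0))      # 3.0
print(mutual_worth_normalise(example, default=5.0))  # [-1.0, 0.0, 1.0]
```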

A few normalisations don’t seem to fit into this framework, most especially those that depend on the square of the utility, such as variance normalisation or the Nash Bargaining solution. The Kalai–Smorodinsky bargaining solution uses a similar normalisation to the mutual worth bargaining solution, but chooses the outcome differently: if the default point is at the origin, it will pick the point $(x, x)$ with largest $x$.

  1. This, of course, would incentivise you to lie, but that problem is unavoidable in bargaining anyway.