Bayesian Utility: Representing Preference by Probability Measures

This is a simple transformation of the standard expected utility formula that I found conceptually interesting.

For simplicity, let’s consider a finite discrete probability space with non-zero probability p(x) at each point, and a utility function u(x) defined on its sample space. The expected utility of an event A (a set of points of the sample space) is the average value of the utility function over the event, weighted by probability, and is written as

EU(A)=\frac{\sum_{x\in A}{p(x)\cdot u(x)}}{\sum_{x\in A}{p(x)}}

Expected utility is a way of comparing events (sets of possible outcomes) that correspond to, for example, available actions. Event A is said to be preferable to event B when EU(A)>EU(B). The preference relation doesn’t change when the utility function is transformed by a positive affine transformation. Since the sample space is assumed finite, we can assume without loss of generality that u(x)>0 for all x. Such a utility function can additionally be rescaled so that, over the whole sample space,

\sum_{x}{p(x)\cdot u(x)}=1
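
As a concrete illustration, here is a minimal sketch with made-up numbers (the three-point sample space and the values of p and u are invented for this example) showing the rescaling and the weighted-average form of expected utility:

```python
# A minimal sketch with made-up numbers: a three-point sample space,
# strictly positive probabilities p and strictly positive utilities u.
p = {"x1": 0.2, "x2": 0.3, "x3": 0.5}
u = {"x1": 3.0, "x2": 1.0, "x3": 2.0}

# Rescale u so that sum_x p(x) * u(x) = 1; this is a positive scaling,
# so the preference relation it defines is unchanged.
scale = sum(p[x] * u[x] for x in p)
u = {x: u[x] / scale for x in u}
assert abs(sum(p[x] * u[x] for x in p) - 1.0) < 1e-12

def expected_utility(event):
    """Average of u over the event, weighted by probability."""
    return sum(p[x] * u[x] for x in event) / sum(p[x] for x in event)

print(expected_utility({"x1", "x2"}))
```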

Now, if we define

q(x)=p(x)\cdot u(x)

the expected utility can be rewritten as

EU(A)=\frac{\sum_{x\in A}{q(x)}}{\sum_{x\in A}{p(x)}}

or

EU(A)=\frac{Q(A)}{P(A)}

Here, P and Q are two probability measures. It’s easy to see that this form of the expected utility formula has the same expressive power, so a preference relation can be defined directly by a pair of probability measures on the same sample space, instead of using a utility function.
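
To make the equivalence concrete, the same made-up numbers give a sketch of the construction: define q(x)=p(x)·u(x) and compare the two ways of writing expected utility.

```python
# Same made-up numbers as in the previous sketch.
p = {"x1": 0.2, "x2": 0.3, "x3": 0.5}
u = {"x1": 3.0, "x2": 1.0, "x3": 2.0}
scale = sum(p[x] * u[x] for x in p)
u = {x: u[x] / scale for x in u}

# Shouldness: q(x) = p(x) * u(x).  Because u was rescaled, q sums to 1,
# so Q is itself a probability measure on the same sample space.
q = {x: p[x] * u[x] for x in p}
assert abs(sum(q.values()) - 1.0) < 1e-12

def P(event):
    return sum(p[x] for x in event)

def Q(event):
    return sum(q[x] for x in event)

def eu_from_utility(event):
    return sum(p[x] * u[x] for x in event) / P(event)

def eu_from_measures(event):
    # Uses only the measures of the whole event, not individual points.
    return Q(event) / P(event)

A = {"x1", "x2"}
assert abs(eu_from_utility(A) - eu_from_measures(A)) < 1e-12
```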

Expected utility written in this form uses only the measures of the whole event under P and Q, without looking at the individual points. I tentatively call the measure Q “shouldness”, with P being “probability”. The conceptual advantage of this form is that probability and utility are now on equal footing, and it’s possible to work with both of them using familiar Bayesian updating, in exactly the same way. To compute the expected utility of an event given additional information, just use the posterior shouldness and probability:

EU(A|B)=\frac{Q(A|B)}{P(A|B)}
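
A small sketch of this updating step, reusing the made-up measures from the earlier sketches (the events below are arbitrary). Since Q(A|B)/P(A|B) equals Q(A∩B)/P(A∩B) times P(B)/Q(B), a factor that doesn’t depend on A, conditioning both measures ranks events given B the same way the unconditional ratio over A∩B does:

```python
# Made-up measures from the earlier sketches (q = p * u, already normalized).
p = {"x1": 0.2, "x2": 0.3, "x3": 0.5}
q = {"x1": 0.6 / 1.9, "x2": 0.3 / 1.9, "x3": 1.0 / 1.9}

P = lambda event: sum(p[x] for x in event)
Q = lambda event: sum(q[x] for x in event)

def given(measure, A, B):
    """Ordinary Bayesian conditioning: measure(A | B) = measure(A & B) / measure(B)."""
    return measure(A & B) / measure(B)

B = {"x1", "x2"}                      # the additional information
A1, A2 = {"x1", "x3"}, {"x2", "x3"}   # two events to compare given B

def eu_given(A, B):
    return given(Q, A, B) / given(P, A, B)

# Given B, ranking by posterior shouldness over posterior probability agrees
# with ranking by the unconditional ratio restricted to A & B.
posterior_order = eu_given(A1, B) > eu_given(A2, B)
restricted_order = Q(A1 & B) / P(A1 & B) > Q(A2 & B) / P(A2 & B)
assert posterior_order == restricted_order
```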

If events are drawn as points (vectors) in (P,Q) coordinates, expected utility is monotone in the polar angle of the vectors. Since the coordinates are the measures of events, the vector depicting a union of non-intersecting events equals the sum of the vectors depicting those events:

(P(A\cup B),Q(A\cup B)) = (P(A),Q(A))+(P(B),Q(B)),\ A\cap B=\emptyset

This makes it possible to see graphically some of the structure of simple sigma-algebras on the sample space, together with a preference relation defined by a pair of measures. See also this comment for some examples of applying this geometric representation of preference.
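
As a numeric sketch of this picture (made-up measures again): an event becomes the vector of its two measures, disjoint events add as vectors, and sorting events by expected utility gives the same order as sorting them by polar angle.

```python
import math

# Made-up measures from the earlier sketches.
p = {"x1": 0.2, "x2": 0.3, "x3": 0.5}
q = {"x1": 0.6 / 1.9, "x2": 0.3 / 1.9, "x3": 1.0 / 1.9}

def vec(event):
    """An event as a point in (P, Q) coordinates."""
    return (sum(p[x] for x in event), sum(q[x] for x in event))

# Disjoint events add as vectors.
A, B = {"x1"}, {"x2", "x3"}
union, a, b = vec(A | B), vec(A), vec(B)
assert abs(union[0] - (a[0] + b[0])) < 1e-12
assert abs(union[1] - (a[1] + b[1])) < 1e-12

def eu(event):
    P_A, Q_A = vec(event)
    return Q_A / P_A

def angle(event):
    P_A, Q_A = vec(event)
    return math.atan2(Q_A, P_A)   # polar angle of the event's vector

# Expected utility is monotone in the polar angle: both keys sort identically.
events = [{"x1"}, {"x2"}, {"x3"}, {"x1", "x2"}, {"x2", "x3"}]
assert sorted(events, key=eu) == sorted(events, key=angle)
```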

A preference relation defined by expected utility in this way also doesn’t depend on constant factors in the measures, so it’s unnecessary to require the measures to sum to 1.

Since P and Q are just devices for representing the preference relation, there is nothing inherently “epistemic” about P. Indeed, it’s possible to mix P and Q together without changing the preference relation. A pair (p’,q’) defined by

\begin{matrix} \left\{\begin{matrix} p' &=& \alpha\cdot p + (1-\beta)\cdot q\\ q' &=& \beta\cdot q + (1-\alpha)\cdot p \end{matrix}\right.\\ \alpha+\beta>1 \end{matrix}

gives the same preference relation (the key identity is Q’(A)P’(B) − Q’(B)P’(A) = (α+β−1)·(Q(A)P(B) − Q(B)P(A)), so the sign of the comparison is preserved exactly when α+β>1):

\frac{Q(A)}{P(A)}>\frac{Q(B)}{P(B)} \Leftrightarrow \frac{Q'(A)}{P'(A)}>\frac{Q'(B)}{P'(B)}

(The coefficients can be negative or greater than 1, but the values of p’ and q’ must remain positive.)
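
Here is a direct numeric check of the mixing claim, a sketch reusing the made-up measures from the earlier sketches with one arbitrary choice of coefficients satisfying α+β>1:

```python
from itertools import combinations

# Made-up measures from the earlier sketches.
p = {"x1": 0.2, "x2": 0.3, "x3": 0.5}
q = {"x1": 0.6 / 1.9, "x2": 0.3 / 1.9, "x3": 1.0 / 1.9}

alpha, beta = 0.7, 0.6   # arbitrary, but alpha + beta > 1 and p2, q2 stay positive
p2 = {x: alpha * p[x] + (1 - beta) * q[x] for x in p}
q2 = {x: beta * q[x] + (1 - alpha) * p[x] for x in p}

def eu(pm, qm, event):
    return sum(qm[x] for x in event) / sum(pm[x] for x in event)

# Compare every pair of nonempty events: the strict preference order defined
# by (p, q) agrees with the one defined by the mixed pair (p2, q2).
events = [set(c) for r in (1, 2, 3) for c in combinations(p, r)]
for A, B in combinations(events, 2):
    assert (eu(p, q, A) > eu(p, q, B)) == (eu(p2, q2, A) > eu(p2, q2, B))
```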

Conversely, given a fixed measure P, it isn’t possible to define an arbitrary preference relation by varying only Q (or the utility function). For example, for a sample space of three elements a, b and c, if p(a)=p(b)=p(c), then EU(a)>EU(b)>EU(c) implies EU(a+c)>EU(b+c), so it isn’t possible to choose q such that EU(a+c)<EU(b+c). If we are free to choose p as well, however, an example with these properties (allowing zero values for simplicity) is, in (P,Q) coordinates, a=(0,1/4), b=(1/2,3/4), c=(1/2,0), with a+c=(1/2,1/4) and b+c=(1,3/4), so EU(a+c)=1/2<3/4=EU(b+c).
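
A minimal check of the numbers in this example (the P-measure of a alone is zero, so expected utility is only evaluated on events with positive P):

```python
# The (P, Q) values quoted above.
P = {"a": 0.0, "b": 0.5, "c": 0.5}
Q = {"a": 0.25, "b": 0.75, "c": 0.0}

def eu(event):
    return sum(Q[x] for x in event) / sum(P[x] for x in event)

assert eu({"b"}) > eu({"c"})                # EU(b) = 1.5 > EU(c) = 0
assert eu({"a", "c"}) < eu({"b", "c"})      # 0.25 / 0.5  <  0.75 / 1.0
```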

The prior is an integral part of preference, and it works in exactly the same way as shouldness. Manipulations with probabilities, or Bayesian “levels of certainty”, are manipulations with “half of preference”. The problem of choosing Bayesian priors is, in general, the problem of formalizing preference; it can’t be solved completely without considering utility, without formalizing values, and values are very complicated. No simple morality, no simple probability.