Probability is Real, and Value is Complex

(This post idea is due en­tirely to Scott Garrabrant, but it has been sev­eral years and he hasn’t writ­ten it up.)

In 2009, Vladimir Nesov ob­served that prob­a­bil­ity can be mixed up with util­ity in differ­ent ways while still ex­press­ing the same prefer­ences. The ob­ser­va­tion was con­cep­tu­ally similar to one made by Jeffrey and Bolker in the book The Logic of De­ci­sion, so I give them in­tel­lec­tual pri­or­ity, and re­fer to the re­sult as “Jeffrey-Bolker ro­ta­tion”.

Based on Nesov’s post, Scott came up with a way to rep­re­sent prefer­ences as vec­tor-val­ued mea­sures, which makes the re­sult ge­o­met­ri­cally clear and math­e­mat­i­cally el­e­gant.

Vec­tor Valued Preferences

As usual, we think of a space of events which form a sigma algebra. Each event $A$ has a probability $P(A)$ and an expected utility $V(A)$ associated with it. However, rather than dealing with $V$ directly, we define $Q(A) = P(A)V(A)$. Vladimir Nesov called $Q$ “shouldness”, but that’s fairly meaningless. Since it is graphed on the y-axis, represents utility times probability, and is otherwise fairly meaningless, a good name for it is “up”. Here is a graph of probability and upness for some events, represented as vectors:

(The post ti­tle is a pun on the fact that this looks like the com­plex plane: events are com­plex num­bers with real com­po­nent P and imag­i­nary com­po­nent Q. How­ever, it is bet­ter to think of this as a generic 2D vec­tor space rather than the com­plex plane speci­fi­cally.)

If we assume $A$ and $B$ are mutually exclusive events (that is, $A \cap B = \emptyset$), then calculating the P and Q of their union $A \cup B$ is simple. The probability of the union of two mutually exclusive events is just the sum:

$$P(A \cup B) = P(A) + P(B)$$

The expected utility is the weighted sum of the component parts, normalized by the sum of the probabilities:

$$V(A \cup B) = \frac{P(A)V(A) + P(B)V(B)}{P(A) + P(B)}$$

The numerator is just the sum of the shouldnesses, and the denominator is just the probability of the union:

$$V(A \cup B) = \frac{Q(A) + Q(B)}{P(A \cup B)}$$

But, we can multiply both sides by the denominator to get a relationship on shouldness alone:

$$Q(A \cup B) = Q(A) + Q(B)$$

Thus, we know that both coordinates of $A \cup B$ are simply the sum of the component parts. This means union of disjoint events is vector addition in our vector space, as illustrated in my diagram earlier.
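The vector-addition rule is easy to check numerically. The following is a minimal sketch, not part of the original construction: the `Event` class, the `from_utility` helper, and the sample numbers are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """An event as a 2D vector: probability p and "up" q = p * v."""
    p: float
    q: float

    @property
    def v(self) -> float:
        """Expected utility, recovered as the slope q / p (needs p != 0)."""
        return self.q / self.p

    def __add__(self, other: "Event") -> "Event":
        """Union of two disjoint events: add the coordinates."""
        return Event(self.p + other.p, self.q + other.q)


def from_utility(p: float, v: float) -> Event:
    """Build an event from a probability and an expected utility."""
    return Event(p, p * v)


# Two hypothetical disjoint events.
a = from_utility(0.2, 3.0)
b = from_utility(0.3, -1.0)
union = a + b

# The slope of the summed vector agrees with the probability-weighted
# average of the two expected utilities.
weighted = (a.p * a.v + b.p * b.v) / (a.p + b.p)
```

Here `union.v` and `weighted` agree up to floating-point error, which is the relationship derived above.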

Lin­ear Transformations

When we rep­re­sent prefer­ences in a vec­tor space, it is nat­u­ral to think of them as ba­sis-in­de­pen­dent: the way we drew the axes was ar­bi­trary; all that mat­ters is the sys­tem of prefer­ences be­ing rep­re­sented. What this ends up mean­ing is that we don’t care about lin­ear trans­for­ma­tions of the space, so long as the prefer­ences don’t get re­flected (which re­verses the prefer­ence rep­re­sented). This is a gen­er­al­iza­tion of the usual “util­ity is unique up to af­fine trans­for­ma­tions with pos­i­tive co­effi­cient”: util­ity is no longer unique in that way, but the com­bi­na­tion of prob­a­bil­ity and util­ity is unique up to non-re­flect­ing lin­ear trans­for­ma­tions.

Let’s look at that vi­su­ally. Mul­ti­ply­ing all the ex­pected util­ities by a pos­i­tive con­stant doesn’t change any­thing:

Ad­ding a con­stant to ex­pected util­ity doesn’t change any­thing:

Slightly weird, but not too weird… multiplying all the probabilities by a positive constant (and the same for Q, since Q is V*P) doesn’t change anything (meaning we don’t care if probabilities are normalized):
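The three transformations so far can be written as 2×2 matrices acting on (P, Q) vectors. This is a rough sketch under that reading; the function names and the sample events are illustrative assumptions, and the ranking by Q/P only makes sense here because every P stays positive.

```python
from typing import List, Tuple

Vec = Tuple[float, float]                        # (P, Q)
Mat = Tuple[Tuple[float, float], Tuple[float, float]]


def apply(m: Mat, e: Vec) -> Vec:
    """Apply a 2x2 matrix to an event vector (P, Q)."""
    (a, b), (c, d) = m
    p, q = e
    return (a * p + b * q, c * p + d * q)


def scale_utility(k: float) -> Mat:
    """V -> k * V, so Q -> k * Q (k > 0)."""
    return ((1.0, 0.0), (0.0, k))


def shift_utility(c: float) -> Mat:
    """V -> V + c, so Q = P * V -> Q + c * P (a shear)."""
    return ((1.0, 0.0), (c, 1.0))


def scale_measure(k: float) -> Mat:
    """P -> k * P and Q -> k * Q together (k > 0); V is unchanged."""
    return ((k, 0.0), (0.0, k))


def ranking(events: List[Vec]) -> List[int]:
    """Indices of events sorted from worst to best by V = Q / P."""
    return sorted(range(len(events)), key=lambda i: events[i][1] / events[i][0])


# Hypothetical events, all with positive probability.
events = [(0.5, 0.125), (0.5, -0.125), (0.25, 0.25)]
baseline = ranking(events)
for m in (scale_utility(3.0), shift_utility(10.0), scale_measure(0.2)):
    # None of the three transformations changes the preference ordering.
    assert ranking([apply(m, e) for e in events]) == baseline
```

All three matrices have positive determinant, which is the "no reflection" condition mentioned earlier.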

Here’s the really new transformation, which can combine with the other three to create all the valid transformations: the Jeffrey-Bolker rotation, which changes what parts of our preferences are represented in probabilities vs utilities:

Let’s pause for a bit on this one, since it is re­ally the whole point of the setup. What does it mean to ro­tate our vec­tor-val­ued mea­sure?

A sim­ple ex­am­ple: sup­pose that we can take a left path, or a right path. There are two pos­si­ble wor­lds, which are equally prob­a­ble: in Left World, the left path leads to a golden city overflow­ing with wealth and char­ity, which we would like to go to with V=+1. The right path leads to a dan­ger­ous bad­lands full of ban­dits, which we would like to avoid, V=-1. On the other hand, Right World (so named be­cause we would pre­fer to go right in this world) has a some­what nice village on the right path, V=+.5, and a some­what nasty swamp on the left, V=-.5. Sup­pos­ing that we are (strangely enough) un­cer­tain about which path we take, we calcu­late the events as fol­lows:

  • Go left in left-world:

    • P=.25

    • V=1

    • Q=.25

  • Go left in right-world:

    • P=.25

    • V=-.5

    • Q=-.125

  • Go right in left-world:

    • P=.25

    • V=-1

    • Q=-.25

  • Go right in right-world:

    • P=.25

    • V=.5

    • Q=.125

  • Go left (union of the two left-go­ing cases):

    • P=.5

    • Q=.125

    • V=Q/​P=.25

  • Go right:

    • P=.5

    • Q=-.125

    • V=Q/​P=-.25

We can calcu­late the V of each ac­tion and take the best. So, in this case, we sen­si­bly de­cide to go left, since the Left-world is more im­pact­ful to us and both are equally prob­a­ble.
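The table above can be reproduced in a few lines. This is only a sketch of the bookkeeping: the dictionary layout and function names are illustrative choices, while the P and V values are the ones from the example.

```python
# Each atomic event: (action, world) -> (P, V), taken from the example.
atoms = {
    ("left", "L-world"): (0.25, 1.0),
    ("left", "R-world"): (0.25, -0.5),
    ("right", "L-world"): (0.25, -1.0),
    ("right", "R-world"): (0.25, 0.5),
}


def action_vector(action):
    """Sum the (P, Q) vectors of the disjoint events that make up an action."""
    p_total = sum(p for (a, _), (p, v) in atoms.items() if a == action)
    q_total = sum(p * v for (a, _), (p, v) in atoms.items() if a == action)
    return p_total, q_total


def expected_utility(action):
    """V = Q / P for the union of events sharing this action."""
    p, q = action_vector(action)
    return q / p


best = max(("left", "right"), key=expected_utility)
```

With these numbers `best` comes out as `"left"`, matching the decision described above.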

Now, let’s ro­tate 30°. (Hope­fully I get the math right here.)

  • Left in L-world:

    • P=.09

    • Q=.34

    • V=3.7

  • Left in R-world:

    • P=.28

    • Q=.02

    • V=.06

  • Right in L-world:

    • P=.34

    • Q=-.09

    • V=-.27

  • Right in R-world:

    • P=.15

    • Q=.23

    • V=1.5

  • Left over­all:

    • P=.37

    • Q=.36

    • V=.97

  • Right over­all:

    • P=.50

    • Q=.14

    • V=.29

Now, it looks like going left is evidence for being in R-world, and going right is evidence for being in L-world! The disparity between the worlds has also gotten larger; L-world now has a difference of almost 4 utility between the different paths, rather than 2. R-world now evaluates both paths as positive, with a difference between the two of only about 1.4. Also note that our probabilities have stopped summing to one (but as mentioned already, this doesn’t matter much; we could normalize the probabilities if we want).

In any case, the fi­nal de­ci­sion is ex­actly the same, as we ex­pect. I don’t have a good in­tu­itive ex­pla­na­tion of what the agent is think­ing, but roughly, the de­creased con­trol the agent has over the situ­a­tion due to the cor­re­la­tion be­tween its ac­tions and which world it is in seems to be com­pen­sated for by the more ex­treme pay­off differ­ences in L-world.
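For anyone who wants to check the rotated numbers, here is a small sketch. It assumes the rotation is the standard counterclockwise one applied to (P, Q), which is the convention that reproduces the values listed above; the helper names are illustrative.

```python
import math


def rotate(vec, degrees):
    """Rotate a (P, Q) vector counterclockwise by `degrees`."""
    t = math.radians(degrees)
    p, q = vec
    return (p * math.cos(t) - q * math.sin(t),
            p * math.sin(t) + q * math.cos(t))


# (P, Q) vectors of the four atomic events before rotation, from the example.
atoms = {
    ("left", "L-world"): (0.25, 0.25),
    ("left", "R-world"): (0.25, -0.125),
    ("right", "L-world"): (0.25, -0.25),
    ("right", "R-world"): (0.25, 0.125),
}

rotated = {name: rotate(vec, 30) for name, vec in atoms.items()}


def action_value(events, action):
    """V = Q / P for the union of all atomic events with this action."""
    p = sum(vec[0] for (a, _), vec in events.items() if a == action)
    q = sum(vec[1] for (a, _), vec in events.items() if a == action)
    return q / p


for events in (atoms, rotated):
    values = {a: action_value(events, a) for a in ("left", "right")}
    print({a: round(v, 2) for a, v in values.items()},
          "-> best:", max(values, key=values.get))
```

Because rotation is linear, rotating the atomic events and then summing gives the same vectors as rotating the unions directly, and both passes of the loop pick left.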

Ra­tional Preferences

Alright, so prefer­ences can be rep­re­sented as vec­tor-val­ued mea­sures in two di­men­sions. Does that mean ar­bi­trary vec­tor-val­ued mea­sures in two di­men­sions can be in­ter­preted as prefer­ences?

No.

The re­stric­tion that prob­a­bil­ities be non-nega­tive means that events can only ap­pear in quad­rants I and IV of the graph. We want to state this in a ba­sis-in­de­pen­dent way, though, since it is un­nat­u­ral to have a preferred ba­sis in a vec­tor space. One way to state the re­quire­ment is that there must be a line pass­ing through the (0,0) point, such that all of the events are strictly to one side of the line, ex­cept per­haps events at the (0,0) point it­self:

As illustrated, there may be a single such line, or there may be multiple, depending on how closely preferences hug the (0,0) point. The normal vector of this line (drawn in red) can be interpreted as the $P$ dimension, if you want to pull out probabilities in a way which guarantees that they are non-negative. There may be a unique direction corresponding to probability, and there may not. Since $V = Q/P$, we get a unique probability direction if and only if we have events with both arbitrarily high utilities and arbitrarily low. So, Jeffrey-Bolker rotation is intrinsically tied up in the question of whether utilities are bounded.
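For a finite set of events, the separating-line condition can be tested directly. The sketch below is one possible approach rather than anything from the original argument: it sorts the events by angle and looks for an empty gap wider than 180°. The function name `probability_direction` and the choice to return the normal pointing away from the middle of that gap are assumptions made for illustration.

```python
import math
from typing import Iterable, Optional, Tuple

Vec = Tuple[float, float]  # (P, Q)


def probability_direction(events: Iterable[Vec]) -> Optional[Vec]:
    """Return a unit vector n with n . e > 0 for every nonzero event e,
    or None if no line through (0, 0) has all events strictly on one side."""
    angles = sorted(math.atan2(q, p) for p, q in events if (p, q) != (0.0, 0.0))
    if not angles:
        return (1.0, 0.0)  # only (0, 0) events: any direction works
    # Circular gaps between consecutive angles, including the wrap-around.
    gaps = [(b - a, a) for a, b in zip(angles, angles[1:])]
    gaps.append((angles[0] + 2 * math.pi - angles[-1], angles[-1]))
    width, start = max(gaps)
    if width <= math.pi:
        return None  # the events wrap too far around the origin
    # Point n directly away from the middle of the largest empty gap.
    mid = start + width / 2
    return (-math.cos(mid), -math.sin(mid))


# Events in quadrants I and IV admit a probability direction ...
n = probability_direction([(0.25, 0.25), (0.25, -0.25), (0.0, 0.0)])
# ... while events spread all the way around the origin do not.
bad = probability_direction([(1.0, 0.0), (-1.0, 0.1), (-1.0, -0.1)])
```

When the largest gap is much wider than 180°, many normals would work; the code simply returns one of them, which mirrors the non-uniqueness discussed above.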

Ac­tu­ally, Scott prefers a differ­ent con­di­tion on vec­tor-val­ued mea­sures: that they have a unique (0,0) event. This al­lows for ei­ther in­finite pos­i­tive util­ities (not merely un­bounded—in­finite), or in­finite nega­tive util­ities, but not both. I find this less nat­u­ral. (Note that we have to have an empty event in our sigma-alge­bra, and it has to get value (0,0) as a ba­sic fact of vec­tor-val­ued mea­sures. Whether any other event is al­lowed to have that value is an­other ques­tion.)

How do we use vector-valued preferences to optimize? The expected value of a vector is the slope, $Q/P$. This runs into trouble for probability zero events, though, which we may create as we rotate. Instead, we can prefer events which are less clockwise:

(Note, how­ever, that the prefer­ence of a (0,0) event is un­defined.)

This gives the same answers for positive-x-value, but keeps making sense as we rotate into other quadrants. More and less clockwise always makes sense as a notion since we assumed that the vectors always stay to one side of some line; we can’t spin around in a full circle looking for the best option, because we will hit the separating line. This allows us to define a preference relation between two events $A$ and $B$, based on the angle of $A$ being within 180° of $B$’s.
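One way to implement "less clockwise" without dividing by P is the sign of a 2D cross product. This is a sketch under the assumption that all events lie strictly on one side of some line through (0,0), as required above; the function name `prefers` is made up for illustration.

```python
def prefers(a, b):
    """True if event a is counterclockwise of event b by less than 180 degrees.

    a and b are (P, Q) vectors. The sign of the 2D cross product b x a
    says which way we turn when going from b to a.
    """
    if a == (0, 0) or b == (0, 0):
        raise ValueError("preference involving a (0, 0) event is undefined")
    return b[0] * a[1] - b[1] * a[0] > 0


# "Go left" vs "go right" from the example, before rotation.
left, right = (0.5, 0.125), (0.5, -0.125)
left_is_better = prefers(left, right)

# A probability-zero event with positive Q is still comparable.
zero_prob_is_better = prefers((0.0, 1.0), (1.0, 0.0))
```

For events with positive P, the cross-product test reduces to comparing Q/P, since $Q_A/P_A > Q_B/P_B$ exactly when $P_B Q_A - Q_B P_A > 0$.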

Conclusion

This is a fun pic­ture of how prob­a­bil­ities and util­ities re­late to each other. It sug­gests that the two are in­ex­tri­ca­bly in­ter­twined, and mean­ingless in iso­la­tion. View­ing them in this way makes it some­what more nat­u­ral to think that prob­a­bil­ities are more like “car­ing mea­sure” ex­press­ing how much the agent cares about how things go in par­tic­u­lar wor­lds, rather than sub­jec­tive ap­prox­i­ma­tions of an ob­jec­tive “mag­i­cal re­al­ity fluid” which de­ter­mines what wor­lds are ex­pe­rienced. (See here for an ex­am­ple of this de­bate.) More prac­ti­cally, it gives a nice tool for vi­su­al­iz­ing the Jeffrey-Bolker ro­ta­tion, which helps us think about prefer­ence re­la­tions which are rep­re­sentable via mul­ti­ple differ­ent be­lief dis­tri­bu­tions.

A down­side of this frame­work is that it re­quires agents to be able to ex­press a prefer­ence be­tween any two events, which might be a lit­tle ab­surd. Let me know if you figure out how to con­nect this to com­plete-class style foun­da­tions which only re­quire agents to have prefer­ences over things which they can con­trol.