# Categorial preferences and utility functions

This post is mo­ti­vated by a re­cent post of Stu­art Arm­strong on go­ing from prefer­ences to a util­ity func­tion. It was origi­nally planned as a com­ment, but seems to have de­vel­oped a bit of a life of its own. The ideas here came up in a dis­cus­sion with Owen Bie­sel; all mis­takes in this ex­po­si­tion are mine. I’m not very good with the type­set­ting en­g­ine here, so apolo­gies for la­tex and other prob­lems.

The ba­sic ideas is as fol­lows. Sup­pose we have a set of ob­jects, and we are given some in­for­ma­tion on which ob­jects are preferred to which other ob­jects. Then we are in­ter­ested in whether and in how many ways this data can be cap­tured by a util­ity func­tion. Our key in­no­va­tion is that we as­sume not only the di­rec­tion of prefer­ences is given, but also some in­for­ma­tion on the strength of the prefer­ences, in a man­ner which we will make pre­cise be­low (weak prefer­ences).

Ba­sic on or­ders vs util­ity functions

We re­fer to the Order The­ory page on Wikipe­dia for the defi­ni­tions of re­flex­ive, anti-sym­met­ric, tran­si­tive and con­nex­ive bi­nary re­la­tions. If is a set and is a func­tion (util­ity’), this in­duces a re­flex­ive, tran­si­tive and con­nected bi­nary re­la­tion (not anti-sym­met­ric in gen­eral, un­less is in­jec­tive).

Con­versely, any re­flex­ive, tran­si­tive, an­ti­sym­met­ric and con­nex­ive bi­nary re­la­tion (a.k.a. to­tal or­der) on a countable set , this is in­duced by a util­ity func­tion tak­ing val­ues in the ra­tio­nal num­bers (link to proof); there is a more gen­eral dis­cus­sion here.

Strength of prefer­ences

In what fol­lows, we fix a to­tally or­dered abelian group . To ex­press a prefer­ence be­tween to ob­jects , of our set , one should give an el­e­ment of which ex­presses how strongly is preferred to . The most nat­u­ral ex­am­ple is to take , then:

Say­ing is preferred to with strength 1 means we slightly pre­fer to ;

Say­ing s1 is preferred to s2 with strength 0 means have no prefer­ence be­tween s1 and s2;

Say­ing s1 is preferred to s2 with strength 2 means we pre­fer s1 to s2 more strongly;

Say­ing s1 is preferred to s2 with strength −1 means we slightly pre­fer s2 to s1.

# Ex­pres­sions of preference

We con­sider three ways of de­scribing prefer­ences among ob­jects in a set :

1. Weak Prefer­ence

Defi­ni­tion. A weak prefer­ence among el­e­ments of con­sists of a col­lec­tion of triples with and .

A triple (s1,s2,g) is in­ter­preted as mean­ing that is preferred to with strength . This is the kind of data one might be pro­vided with in prac­tise; for ex­am­ple, some­one tells you that they slightly pre­fer cats to dogs, but strongly prefers dogs to snakes (as­sign­ing num­bers/​el­e­ments of to the strengths of the prefer­ences).

2. Cat­e­gor­i­cal preference

Defi­ni­tion. To give a cat­e­gor­i­cal prefer­ence for el­e­ments of is to give a cat­e­gory with ob­jects , to­gether with an en­rich­ment over

This defi­ni­tion may seem a bit cryp­tic, but is close to a stan­dard way of think­ing of an or­der as an en­rich­ment. For ev­ery two ob­jects we as­sign an el­e­ment of (which we think of as tel­ling us the strength of the prefer­ence for over ), sub­ject to a bunch of com­pat­i­bil­ity con­di­tions (for ex­am­ple it will im­ply that the prefer­ence for over is given by the unit of ). More de­tails and ex­pan­sion can be found in Law­vere’s Tak­ing Cat­e­gories Se­ri­ously.

3. Utility function

This is just a func­tion , and is in­ter­preted in the usual way.

# Goal

In prac­tise, in­for­ma­tion is likely to be sup­plied in the form of a weak prefer­ence. Ul­ti­mately one wants a util­ity func­tion to feed to some op­ti­mi­sa­tion pro­ce­dure. We will de­scribe how to pass nat­u­rally from a weak to a cat­e­gor­i­cal prefer­ence, and then see (un­der some hy­pothe­ses) that a cat­e­go­rial prefer­ences in­duces an es­sen­tially-unique util­ity func­tion.

We sup­pose through­out that S and G are fixed, with finite for sim­plic­ity.

Weak prefer­ences to cat­e­go­rial preferences

First, given a cat­e­gor­i­cal prefer­ence, choos­ing a sub­set of ar­rows yields a weak prefer­ence. On the other hand, given a weak prefer­ence, we are in­ter­ested in

Q1. whether this weak prefer­ence can arise from a cat­e­gor­i­cal prefer­ence, and

Q2. if so, then from how many dis­tinct cat­e­gor­i­cal prefer­ences can this weak prefer­ence arise.

Sup­pose we are given a weak prefer­ence . First, for ev­ery triple , we ad­join to the triple ; call the re­sult­ing weak prefer­nce A cy­cle in W’ is a finite or­dered se­quence of triples , and such a cy­cle is closed if

Claim 1: The weak prefer­ence can be ex­tended to a cat­e­gor­i­cal prefer­ence if and only if ev­ery cy­cle is closed.

Sketch of proof. Sup­pose we are given and , and want to as­sign an el­e­ment of If there is no path from to we can as­sign any el­e­ment we want (we will use this ob­ser­va­tion later). If there is a path, then the group law in de­ter­mines a value of the prefer­ence of over . The only way a prob­lem might arise is if there are two (or more) paths from to , but then the cy­cles are closed con­di­tion means that these paths will in­duce the same prefer­ence for over . QED

We move on the the unique­ness ques­tion. As­sume that satis­fies the cy­cles are closed’ con­di­tion, and write for the set of con­nected com­po­nents of .

Claim 2. The set of cat­e­gor­i­cal prefer­ences re­strict­ing to is nat­u­rally in bi­jec­tion with

In par­tic­u­lar, if is con­nected then there is ex­actly one way to as­so­ci­ate a cat­e­gor­i­cal prefer­ence.

Sketch of proof. In the proof of claim 1, the only choice we made was when there was no path from to In that case we chose an el­e­ment of . QED

From cat­e­gor­i­cal prefer­ences to util­ity func­tions

Sup­pose we are given a cat­e­gor­i­cal prefer­ence on , i.e. the struc­ture of a cat­e­gory en­riched over with ob­ject set . Analo­gously to the above, we are in­ter­ested in whether and in how many ways this can be in­duced by a util­ity func­tion.

From now on, we as­sume that is Archimedean , equiv­a­lently that it is iso­mor­phic (as an or­dered group) to a sub­group of . This iso­mor­phism is then nec­es­sar­ily unique up to scal­ing (Holder’s the­o­rem).

Sup­pose first that we have a util­ity func­tion . Let be a sub­group con­tain­ing for ev­ery . Then by as­sign­ing to the pair the el­e­ment we give a cat­e­gor­i­cal prefer­ence struc­ture, with en­rich­ment over .

Again, we are in­ter­ested in two ques­tions:

Q1. given a cat­e­gor­i­cal prefer­ence on , does it arise from some util­ity func­tion in the above fash­ion?

Q2. If the an­swer to Q1 is yes’, then from how many util­ity func­tions can out cat­e­gor­i­cal prefer­ence arise?

The an­swer to Q1 is always yes. First choose an em­bed­ding of in as a to­tally or­dered group. Then choose some el­e­ment , and define The val­ues of on the other el­e­ments of are then uniquely de­ter­mined by the en­riched cat­e­gory struc­ture.

Q2 is al­most as easy. The em­bed­ding of in is unique up to scal­ing, and the choice of is just a trans­la­tion. Hence the util­ity func­tion is unique up to trans­la­tion and scal­ing.

# Con­clu­sion

In prac­tise, we might be given the data of a weak prefer­ence. We have seen an easy way to check whether it can be ex­tended to a cat­e­gor­i­cal prefer­ence, and a sim­ple de­scrip­tion of all the re­sult­ing pos­si­ble cat­e­gor­i­cal prefer­ences. A cat­e­gor­i­cal prefer­ence always comes from a util­ity func­tion, in a way which is unique up to trans­la­tion and scal­ing.

In ac­tual prac­tise, it is not un­likely that the weak prefer­ence data will not come from a cat­e­gor­i­cal prefer­ence (and thus not from a util­ity func­tion). Then we should prob­a­bly look for the most rea­son­able’ as­so­ci­ated cat­e­gor­i­cal prefer­ence, how to do this is not so clear to me yet.

I find the fact that util­ity func­tions are only unique up to trans­la­tion and scal­ing a bit awk­ward; maybe this no­tion of cat­e­gor­i­cal prefer­ence cap­tures the im­por­tant data in a more canon­i­cal fash­ion?!

• I think the most “na­tive” rep­re­sen­ta­tion of util­ity func­tions is ac­tu­ally as a func­tion from or­dered triples of out­comes to real num­bers. Rather than hav­ing an ar­bi­trary (af­fine sym­me­try break­ing) scale for strength of prefer­ence, set the scale of a prefer­ence by com­par­ing to a third pos­si­ble out­come.

The func­tion is the “how much bet­ter?” func­tion. Given pos­si­ble out­comes A, B, and X, how many times bet­ter is A (rel­a­tive to X) than B (rel­a­tive to X).

If A is choco­late cake, and B is ice cream, and X is go­ing hun­gry, maybe the choco­late cake prefer­ence is 1.25 times stronger, so the func­tion Bet­ter­ness(choco­late cake, ice cream, go­ing hun­gry) = 1.25.

This is the sort of prefer­ence that you would elicit from a gam­ble (at least from a ra­tio­nal agent, not nec­es­sar­ily from a hu­man). If I am in­differ­ent to a gam­ble with a prob­a­bil­ity 1 of ice cream, and a prob­a­bil­ity 0.8 of choco­late cake and 0.2 of go­ing hun­gry, this tells you that bet­ter­ness-value above.

Any­how, in­ter­est­ing post, I’m just idly com­ment­ing.

• Thanks for the com­ment Char­lie.

If I am in­differ­ent to a gam­ble with a prob­a­bil­ity of ice cream, and a prob­a­bil­ity 0.8 of choco­late cake and 0.2 of go­ing hungry

To check I un­der­stand cor­rectly, you mean the agent is in­differ­ent be­tween the gam­bles (prob­a­bil­ity of ice cream) and (prob­a­bil­ity 0.8 of choco­late cake, prob­a­bil­ity 0.2 of go­ing hun­gry)?

If I un­der­stand cor­rectly, you’re de­scribing a var­i­ant of Von Neu­mann–Mor­gen­stern where in­stead of giv­ing prefer­ences among all lot­ter­ies, you’re spec­i­fy­ing a cer­tain col­lec­tion of spe­cial type of pairs of lot­ter­ies be­tween which the agent is in­differ­ent, to­gether with a sign to say in which di­rec­tion’ things be­come preferred? It seems then likely to me that the data you give can be used to re­con­struct prefer­ences be­tween all lot­ter­ies...

If one is given in­for­ma­tion in the form you pro­pose but only for an in­com­plete’ set of spe­cial triples (c.f.weak prefer­ences’ above), then one can again ask whether and in how many ways it can be ex­tended to a com­plete set of prefer­ences. It feels to me as if there is an ex­tra am­bi­guity com­ing in with your de­scrip­tion, for ex­am­ple if the set of pos­si­ble out­comes has 6 el­e­ments and I am given the value of the Bet­ter­ness func­tion on two dis­joint triples, then to gen­er­ate a util­ity func­tion I have to not only choose a trans­la­tion’ be­tween the two triples, but also a scal­ing. But maybe this is bet­ter/​more re­al­is­tic!

. By spe­cial types’, I mean in­differ­ence be­tween pairs of gam­bles of the form
(prob­a­bil­ity of A) vs (prob­a­bil­ity of B and prob­a­bil­ity of C)
for some , and pos­si­ble out­comes A, B, C. Then the sign says that I pre­fer higher prob­a­bil­ity of B (say).

• I think a rea­son­able ques­tion to ask is what value hav­ing rel­a­tive strength of the prefer­ence or­der­ing mat­ters to ques­tions we want to an­swer. That is, I sus­pect the rea­son we nor­mally only con­sider a prefer­ence or­der­ing and not a prefer­ence mea­sure is that it’s not rele­vant to ob­served be­hav­ior, since in that case we pre­sume the most preferred pos­si­ble ac­tion will always be taken, thus more in­for­ma­tion than the or­der is not rele­vant to the de­ci­sion.

I can imag­ine we might care about mea­sure in cases like mod­el­ing how hu­man prefer­ences are ac­tu­ally man­i­fested where they might have rel­a­tive weights and can be up­dated, al­though per­son­ally I pre­fer the idea of avoid­ing this by mak­ing up­dat­ing ap­pear im­mutable by con­di­tion­ing each prefer­ence on the en­tire causal his­tory that came prior to its re­al­iza­tion, al­though this has its own prob­lems.

• Sure, in the end we only re­ally care about what comes top, as that’s the thing we choose. My feel­ing is that in­for­ma­tion on (rel­a­tive) strengths of prefer­ences is of­ten available, and when it is available it seems to make sense to use it (e.g. al­low­ing cir­cum­ven­tion of Ar­row’s the­o­rem).

In par­tic­u­lar, I worry that, when we only have or­di­nal prefer­ences, the out­come of at­tempts to com­bine var­i­ous prefer­ences will de­pend heav­ily on how finely we di­vide up the world; by us­ing in­for­ma­tion on strengths of prefer­ences we can miti­gate this.

• Neat!

Though I should men­tion that my cur­rent ver­sion of par­tial prefer­ences does not as­sume all cy­cles are closed—the con­strained op­ti­mi­sa­tion can be seen as try­ing to get “as close as pos­si­ble” to that, given non-closed cy­cles.

• Thanks! I like the way your op­ti­mi­sa­tion prob­lem han­dles non-closed cy­cles.

I think I’m less com­fortable with how it treats dis­con­nected com­po­nents—as I un­der­stand it you just trans­late each sep­a­rately to have cen­tre of mass’ at 0. If one wants to get a util­ity func­tion out at the end one has to make some kind of choice in this situ­a­tion, and the choice you make is prob­a­bly the best one, so in that sense it seems very good.

But for ex­am­ple it seems vuln­er­a­ble to cre­at­ing ‘vir­tual copies’ of wor­lds in or­der to shift the cen­tre of mass and push con­nected com­po­nents one way or the other. That was what started me think­ing about in­clud­ing strength of prefer­ence—if one adds to your setup a bunch of vir­tual copies of a world be­tween which one is `al­most in­differ­ent’ then it seems it will shift the cen­tre of mass, and thus the util­ity rel­a­tive to come other chain. Of course, if one is ac­tu­ally in­differ­ent then the ‘vir­tual copies’ will be col­lapsed to a sin­gle point in your , but if they are just ex­tremely close then it seems it will af­fect the util­ity rel­a­tive to some other chain. I’ll try to ex­plain this more clearly in a com­ment to your post.