Toy model piece #5: combining partial preferences

My previous approach to combining preferences went like this: from the partial preferences P_i, create a normalised utility function u_i that is defined over all worlds (and which is indifferent to the information that didn't appear in the partial model). Then simply add these utilities, weighted according to the weight/strength of the preference.

But this method fails. Consider for example the following partial preferences, all weighted with the same weight of 1:

  • P_1: w_1 < w_2.

  • P_2: w_2 < w_3.

  • P_3: w_1 < w_3.

  • P_4: w_3 < w_4.

  • P_5: w_3 < w_5.

If we follow the standard normalisation approach, then the normalised utility u_1 will be defined[1] as:

  • u_1(w_1) = −1/2, u_1(w_2) = 1/2, and otherwise u_1 = 0.

Then adding together all five utility functions would give:

  • U(w_1) = −1, U(w_2) = U(w_3) = 0, U(w_4) = U(w_5) = 1/2, and U = 0 otherwise.
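To make the failure concrete, here is a minimal sketch of this normalise-and-add approach in Python (assuming, as reconstructed above, five pairwise preferences over worlds w_1–w_5, each of weight 1):

```python
# Naive approach: each partial preference x < y becomes a normalised
# utility (-1/2 on x, +1/2 on y, 0 elsewhere); then we just add them up.
partial_prefs = [("w1", "w2"), ("w2", "w3"), ("w1", "w3"),
                 ("w3", "w4"), ("w3", "w5")]  # (worse, better), each weight 1
worlds = ["w1", "w2", "w3", "w4", "w5"]

U = {w: 0.0 for w in worlds}
for worse, better in partial_prefs:
    U[worse] -= 0.5   # normalised utility of the worse world
    U[better] += 0.5  # normalised utility of the better world

print(U)  # → {'w1': -1.0, 'w2': 0.0, 'w3': 0.0, 'w4': 0.5, 'w5': 0.5}
```

Note that w_2 and w_3 tie, which is the first problem discussed below.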

There are several problems with this utility. Firstly, the utility of w_2 and the utility of w_3 are the same, even though in the only case where there is a direct comparison between them (P_2), w_3 is ranked higher. We might say that we are missing the comparisons between w_2 and w_4 and w_5, and could elicit these preferences using one-step hypotheticals. But what if comparing w_2 to w_4 is a complex preference, and all that happens is that the agent combines P_2 and P_4? If we added another partial preference P_6 that said w_2 < w_4, then w_4 would end up ranked above w_5!

Another, more subtle, point is that the difference between U(w_1) and U(w_3) is too large. Simply having P_1 and P_2 would give U(w_3) − U(w_1) = 1. Adding in P_3 moves this difference to 2. But note that w_1 < w_3 is already implicit in P_1 and P_2, so adding it shouldn't make the difference larger.

In fact, if the difference in utility between w_1 and w_3 were larger than 1, adding in P_3 should make the difference between U(w_1) and U(w_3) smaller: because having w_1 < w_3 weighted at 1 means that the agent's preference of w_3 over w_1 is not that strong.

Energy minimising between utilities

So, how should we combine these preferences otherwise? Well, if I have a preference P_i, of weight α_i, that ranks outcome x below outcome y (write this as x <_i y), then, if these outcomes appear nowhere else in any partial preference, U(y) − U(x) will be α_i.

So in a sense, that partial preference is trying to set the distance between those two outcomes to α_i. Call this the energy-minimising condition for P_i.

Then for a utility function U, we can define the energy of U, as compared with the (partially defined) normalised utility u_i corresponding to P_i. It is:

  • E(U, P_i) = Σ_{x <_i y} ( α_i·(u_i(y) − u_i(x)) − (U(y) − U(x)) )².

This is the difference between the weighted distance between the outcomes that P_i mandates, and the one that U actually gives (squared, and summed over every pair that P_i compares).

Because different partial preferences have different numbers of elements to compare, we can compute the average energy of U:

  • Ē(U, P_i) = E(U, P_i) / |P_i|, where |P_i| is the number of pairs that P_i compares.
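These two definitions can be sketched in code (assuming, as above, that a partial preference is given as a list of (worse, better) pairs with a single weight, and that the normalised utility puts each compared pair at distance 1, so each term reduces to (weight − (U(y) − U(x)))²):

```python
def energy(U, pairs, weight):
    """Energy of utility U against one partial preference: the sum over
    compared pairs (x, y) of (weight - (U[y] - U[x]))**2, i.e. how far U
    is from putting each pair at the weighted distance."""
    return sum((weight - (U[y] - U[x])) ** 2 for x, y in pairs)

def average_energy(U, pairs, weight):
    """Average energy: divide by the number of compared pairs."""
    return energy(U, pairs, weight) / len(pairs)

# Candidate utility for the triangle w1 < w2, w2 < w3, w1 < w3 (weight 1).
U = {"w1": 0.0, "w2": 2 / 3, "w3": 4 / 3}
print(average_energy(U, [("w1", "w3")], 1))  # (1 - 4/3)² = 1/9 ≈ 0.111
```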

Global energy minimising condition

But weights have another role to play here; they measure not only how much y is preferred to x, but how important it is to reach that preference. So, for humans, “x < y with weight ε” (for small ε) means both:

  • y is not much preferred to x.

  • The human isn't too fussed about the ordering of x and y.

For general agents, these two could be separate phenomena; but for humans, they generally seem to be the same thing. So we can reuse the weights to compute the global energy for U as compared to all partial preferences, which is just the weighted sum of its average energy for each partial preference:

  • E(U) = Σ_i α_i · Ē(U, P_i).

Then the actual ideal U is defined to be the U that minimises this energy term.


Now, it's clear this expression is convex. But it need not be strictly convex (which would imply a single solution): for example, if (w_1 < w_2) and (w_3 < w_4) were the only partial preferences, then there would be no conditions on the relative utilities of w_1, w_2 on the one hand and w_3, w_4 on the other.

Say that w is linked to w′, by defining a link as “there exists a P_i with w <_i w′ or w′ <_i w”, and then making this definition transitive and reflexive (it's automatically symmetric). In the example above, with P_1, …, P_5, all of w_1, …, w_5 are linked.

Being linked is an equivalence relation. And within a class of linked worlds, if we fix the utility of one world, then the energy minimisation equation becomes strictly convex (and hence has a single solution). Thus, within a class of linked worlds, the energy minimisation equation has a single solution, up to translation.

So if we want a single U, translate the solution for each linked class so that the average utility in that class is equal to the average of every other linked class. And this would then define U uniquely (up to translation).

For example, if we only had (w_1 < w_2) and (w_3 < w_4), this could set U to be:

  • U(w_1) = −1/2, U(w_2) = 1/2, U(w_3) = −1/2, U(w_4) = 1/2.

Here, the average utility in each linked class ({w_1, w_2} and {w_3, w_4}) is 0.
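The linking and re-centring steps can be sketched as follows; `linked_classes` is a hypothetical helper using a small union–find over the compared pairs, and `recentre` translates each class so that all class averages agree (here, are 0):

```python
def linked_classes(worlds, pairs):
    """Group worlds into linked classes: two worlds are linked if some
    partial preference compares them; transitive closure via union-find."""
    parent = {w: w for w in worlds}
    def find(w):
        while parent[w] != w:
            parent[w] = parent[parent[w]]  # path compression
            w = parent[w]
        return w
    for x, y in pairs:
        parent[find(x)] = find(y)
    classes = {}
    for w in worlds:
        classes.setdefault(find(w), []).append(w)
    return list(classes.values())

def recentre(U, classes):
    """Translate the utility of each linked class so its average is 0."""
    for cls in classes:
        avg = sum(U[w] for w in cls) / len(cls)
        for w in cls:
            U[w] -= avg
    return U

# Two unlinked preferences, w1 < w2 and w3 < w4, each wanting distance 1.
U = {"w1": 0.0, "w2": 1.0, "w3": 5.0, "w4": 6.0}
classes = linked_classes(["w1", "w2", "w3", "w4"], [("w1", "w2"), ("w3", "w4")])
recentre(U, classes)
print(U)  # → {'w1': -0.5, 'w2': 0.5, 'w3': -0.5, 'w4': 0.5}
```

The arbitrary offset between the two classes (here, 5) disappears, matching the ±1/2 assignment in the text.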

Applying this to the example

So, applying this approach to the full set of the P_i above (and fixing U(w_1) = 0), we'd get:

  • U(w_1) = 0, U(w_2) = 2/3, U(w_3) = 4/3, and U(w_4) = U(w_5) = 7/3.

Here w_2 is in the middle of w_1 and w_3, as it should be, while the utilities of w_4 and w_5 are defined by their distance from w_3 only. The distance between w_1 and w_3 is 4/3. This is between 2 (which would be given by P_1 and P_2 only) and 1 (which would be given by P_3 only).
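As a check on the whole construction, the global energy can be minimised numerically. This is a sketch by gradient descent (the optimum could equally be found by solving the linear first-order conditions), assuming the five weight-1 preferences as reconstructed above, and pinning U(w1) = 0 to remove the translation freedom:

```python
# Minimise E(U) = sum_i weight_i * average_energy_i by gradient descent.
prefs = [([("w1", "w2")], 1), ([("w2", "w3")], 1), ([("w1", "w3")], 1),
         ([("w3", "w4")], 1), ([("w3", "w5")], 1)]  # (pairs, weight)
worlds = ["w1", "w2", "w3", "w4", "w5"]
U = {w: 0.0 for w in worlds}

for _ in range(2000):
    grad = {w: 0.0 for w in worlds}
    for pairs, weight in prefs:
        for x, y in pairs:
            # derivative of (weight/|P|) * (weight - (U[y] - U[x]))**2
            g = 2 * weight * (weight - (U[y] - U[x])) / len(pairs)
            grad[x] += g
            grad[y] -= g
    for w in worlds:
        U[w] -= 0.1 * grad[w]
    U["w1"] = 0.0  # pin one world to fix the translation freedom

print({w: round(U[w], 3) for w in worlds})
# w2 → 2/3, w3 → 4/3, w4 = w5 → 7/3, as in the text
```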

  1. I've divided the normalisation from that post by a constant, to fit better with the methods of this post. Dividing everything in a sum by the same constant gives the same equivalence class of utility functions. ↩︎