Toy model piece #2: Combining short and long range partial preferences

I’m working towards a toy model that will illustrate all the steps in the research agenda. It will start with some algorithmic stand-in for the “human”, and proceed to construct the synthesised utility function, following all the steps in that research agenda. So I’ll be posting a series of “toy model pieces”, which will ultimately be combined into a full toy model. Along the way, I hope to get a better understanding of how to do the research agenda in practice, and maybe even modify that agenda based on insights from making the toy model.

For this post, I’ll look in more detail into how to combine different types of (partial) preferences.

Short-distance, long-distance, and other preferences

I normally use population ethics as my go-to example for a tension between different types of preferences. You can get a lot of mileage by contrasting the repugnance of the repugnant conclusion with the seeming intuitiveness of the mere addition argument.

However, many people who read this will have strong opinions about population ethics, or at least some opinions. Since I’m not trying to convince anyone of my particular population ethics here, I thought it best to shift to another setting where we could see similar tensions at work, without the baggage.

Living in a world of smiles

Suppose you have two somewhat contradictory ethical intuitions. Or rather, in the formulation of my research agenda, two somewhat contradictory partial preferences.

The first is that any world would be better if people smiled more ($P_1$). The second is that if almost everyone smiles all the time, it gets really creepy ($P_2$).

Now, the proper way of resolving those preferences is to appeal to meta-preferences, or to cut them up into their web of connotations: why do we value smiles? Is it because people are happy? Why do we find universal smiling creepy? Is it because we fear that something unnatural is making them smile that way? That’s the proper way of resolving those preferences.

However, let’s pretend there are no meta-preferences, and no connotations, and just try to combine the preferences as given.

Smiles and worlds

Fix the population to a hundred people, and let $W$ be the set of worlds. This set will contain one hundred and one different worlds, described by $w_i$, where $i$ is an integer with $0 \le i \le 100$, denoting the number of people smiling in that world.

We can formalise the preferences as follows:

  • $P_1 = \{w_i <_1 w_{i+1} \mid 0 \le i \le 99\}$.

  • $P_2 = \{w_j <_2 w_i \mid i \le 90$ and $j > 90\}$.

These give rise to the following utility functions (for simplicity of the formula, I’ve translated the definition of $u_2$; translations don’t matter when combining utilities; I’ve also written $u_{P_i}$ as $u_i$):

  • $u_1(w_i) = i$.

  • $u_2(w_i) = 0$ for $i \le 90$, and $u_2(w_i) = -1$ for $i > 90$.

But before being combined, these utilities have to be normalised. There are multiple ways we could do this, and I’ll somewhat arbitrarily choose the “mean-max” method, which normalises the utility difference between the top world and the average world to $1$[1].

Given that normalisation, we have:

  • $\|u_1\| = 100 - 50 = 50$.

  • $\|u_2\| = 0 - (-10/101) = 10/101$.

Thus we send the $u_i$ to their normalised counterparts:

  • $u_1 \to u_1/50$.

  • $u_2 \to (101/10)\, u_2$.

Now consider what happens when we do the weighted sum of these utilities, weighted by the intensity of the human feeling on the subject:

  • $u = \alpha_1\, u_1/50 + \alpha_2\, (101/10)\, u_2$.

If the weights $\alpha_1$ and $\alpha_2$ are equal, we get the following behaviour: the utility of the world grows slowly with the number of smiles, until it reaches its maximum at $w_{90}$, and then drops precipitously.

Thus $u_1$ is dominant most of the time when comparing worlds, but $u_2$ is very strong on the few worlds it really wants to avoid.
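To make the construction concrete, here is a minimal Python sketch of it; the function and variable names are purely illustrative. It defines the two utilities over the 101 worlds, applies the mean-max normalisation, combines them at equal weights, and confirms that the optimum sits at $w_{90}$.

```python
# Minimal sketch of the smile-worlds example; names are purely illustrative.
worlds = range(101)  # world w_i has exactly i people smiling

def u1(i):
    # "more smiles are better": utility grows with the number of smilers
    return float(i)

def u2(i):
    # "near-universal smiling is creepy": flat up to 90 smilers, then a penalty
    return 0.0 if i <= 90 else -1.0

def mean_max_normalise(u):
    # Rescale so that the gap between the top world and the average world is 1.
    values = [u(i) for i in worlds]
    scale = max(values) - sum(values) / len(values)
    return lambda i: u(i) / scale

n1 = mean_max_normalise(u1)  # u1 / 50
n2 = mean_max_normalise(u2)  # (101/10) * u2

def combined(i, a1=1.0, a2=1.0):
    # Weighted sum of the normalised utilities.
    return a1 * n1(i) + a2 * n2(i)

print(max(worlds, key=combined))    # 90: the compromise world
print(combined(90), combined(100))  # 1.8 versus roughly -8.1
```

At equal weights the maximiser lands on $w_{90}$, matching the behaviour described above.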

But what if $u_2$ (a seemingly odd choice) is weighted less than $u_1$ (a more “natural” choice)?

Well, setting $\alpha_1 = 1$ for the moment, if $\alpha_2 = 2/101$, then the utilities of the worlds $w_{90}$ and $w_{100}$ are the same:

$u(w_{90}) = \frac{90}{50} = \frac{100}{50} - \frac{2}{101} \cdot \frac{101}{10} = u(w_{100})$.

Thus if $\alpha_2 > 2/101$, $u_2$ will force the optimal world to have $i \le 90$ (and $u_1$ will select $w_{90}$ from these options). If $\alpha_2 < 2/101$, then $u_1$ will dominate completely, setting the optimum to $w_{100}$.
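As a quick sanity check of that threshold, the same kind of sketch can be run with weights on either side of $2/101$ (again, the code is only an illustration):

```python
# Check the critical weight a2 = 2/101 (with a1 fixed to 1).
def u(i, a2):
    base = i / 50                            # normalised u1
    penalty = -101 / 10 if i > 90 else 0.0   # normalised u2
    return base + a2 * penalty

for a2 in (1 / 101, 4 / 101):
    print(a2, max(range(101), key=lambda i: u(i, a2)))
# Below 2/101 the optimum is w_100 (u1 dominates);
# above 2/101 the optimum is w_90 (u2 vetoes the creepy worlds).
```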

This seems like it could be extended to solve population ethics considerations in various ways (where $u_1$ might be total utilitarianism, with $u_2$ being average utilitarianism or just a dislike of worlds with everyone at very low utility). To go back to my old post about differential versus integral ethics, $P_1$ is a differential constraint, $P_2$ is an integral one, and $w_{90}$ is the compromise point between them.

Inverting the utilities

If we invert the utilities, things behave differently: suppose we had $-u_1$ (smiles are bad) and $-u_2$ (only lots of smiles are good) instead[2]. Under mean-max, the norms of these would be:

  • $\|-u_1\| = 0 - (-50) = 50$.

  • $\|-u_2\| = 1 - 10/101 = 91/101$.

So the normalised version of $-u_1$ is just $-u_1/50$, the negation of the normalised $u_1$; but the normalised version of $-u_2$, namely $(101/91)(-u_2)$, is different from $-(101/10)\,u_2$.

Then, at equal weights, we get the combined utility $-u_1/50 + (101/91)(-u_2)$, which is maximised at $w_0$ and negative for every other world.

Thus $-u_2$ fails at having any influence, and $w_0$ is the optimum.

To get the break-even point, we need the $\alpha_2$ at which $w_0$ and $w_{91}$ are equally valued:

$u(w_0) = 0 = -\frac{91}{50} + \alpha_2 \cdot \frac{101}{91} = u(w_{91})$, i.e. $\alpha_2 = \frac{91^2}{50 \cdot 101} \approx 1.64$.

For $\alpha_2$ greater than that, $-u_2$ dominates completely, and forces $i > 90$ (with $-u_1$ then selecting $w_{91}$).
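And the same kind of numerical check for the inverted case, with the break-even weight $91^2/(50 \cdot 101) \approx 1.64$ (again purely illustrative):

```python
# Inverted case: -u1 (normalised to -i/50) and -u2 (normalised with factor 101/91).
def u_inv(i, a2):
    base = -i / 50                          # normalised -u1
    bonus = 101 / 91 if i > 90 else 0.0     # normalised -u2
    return base + a2 * bonus

breakeven = 91 * 91 / (50 * 101)            # about 1.64
for a2 in (1.0, 0.9 * breakeven, 1.1 * breakeven):
    print(a2, max(range(101), key=lambda i: u_inv(i, a2)))
# At equal weights, and anywhere below the break-even, w_0 is optimal;
# just above it, the optimum jumps to w_91.
```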

It’s clear that $u_1$ and $u_2$ are less “antagonistic” than $-u_1$ and $-u_2$ are (compare the single peak of the combined utility in the first case with the two peaks in the second).


  1. Why choose the mean-max normalisation? Well, it has some nice formal properties, as the intertheoretic utility comparison post demonstrates. But it also, to some extent, boosts utility functions to the extent that they do not interfere much with other utility functions.

    What do I mean by this? Well, consider two utility functions over $n$ different worlds. The first one, $v_1$, ranks one world ($w_+$) as above all others (the other ones being equal). The second one, $v_2$, ranks one world ($w_-$) as below all others (the other ones being equal).

    Under the mean-max normalisation, $v_1(w_+) = n/(n-1) \approx 1$ and $v_1 = 0$ for the other worlds. Under the same normalisation, $v_2(w_-) = -n$ while $v_2 = 0$ for the other worlds.

    Thus $v_2$ has a much wider “spread” than $v_1$, meaning that, in a normalised sum of utilities, $v_2$ affects the outcome much more strongly than $v_1$ (“outcome” meaning the outcome of maximising the summed utility). This is acceptable, even desirable: $v_2$ dominating the outcome just rules out one universe ($w_-$), while $v_1$ dominating the outcome rules out all-but-one universe (everything except $w_+$). So, in a sense, their ability to focus the outcome is comparable: $v_1$ almost never focuses the outcome, but when it does, it narrows it down to a single universe; while $v_2$ almost always focuses the outcome, but barely narrows it down. (A quick numerical check of these spreads is sketched after the footnotes.) ↩︎

  2. There is no point in considering the pairs $(u_1, -u_2)$ or $(-u_1, u_2)$, since those pairs agree on the ordering of the worlds, up to ties. ↩︎
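As mentioned in the first footnote, here is a quick numerical check of those spreads for, say, $n = 101$ worlds (again just an illustrative sketch):

```python
# Spread of the two footnote utilities under mean-max normalisation, for n = 101.
n = 101
v1 = [1.0 if i == 0 else 0.0 for i in range(n)]    # one world strictly better
v2 = [-1.0 if i == 0 else 0.0 for i in range(n)]   # one world strictly worse

def mean_max(values):
    scale = max(values) - sum(values) / len(values)
    return [v / scale for v in values]

print(max(mean_max(v1)) - min(mean_max(v1)))   # spread of v1: n/(n-1), about 1.01
print(max(mean_max(v2)) - min(mean_max(v2)))   # spread of v2: n, i.e. 101
```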
