# DavidHolmes

Karma: 32
• Sure, in the end we only really care about what comes top, as that’s the thing we choose. My feeling is that information on (relative) strengths of preferences is often available, and when it is available it seems to make sense to use it (e.g. allowing circumvention of Arrow’s theorem).

In particular, I worry that, when we only have ordinal preferences, the outcome of attempts to combine various preferences will depend heavily on how finely we divide up the world; by using information on strengths of preferences we can mitigate this.

• (actually, my formula doubles the numbers you gave)

Are you sure? Suppose we take with , , then , so the values for should be as I gave them. And similarly for , giving values . Or else I have misunderstood your definition?

> I’d simply see that as two separate partial preferences

Just to be clear, by “separate partial preference” you mean a separate preorder, on a set of objects which may or may not have some overlap with the objects we considered so far? Then somehow the work is just postponed to the point where we try to combine partial preferences?

EDIT (in reply to your edit): I guess e.g. keeping conditions 1, 2, 3 the same and instead minimising

where is proportional to the reciprocal of the strength of the preference? Of course there are lots of variants on this!
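In case it helps to have the lost display written out: the objective might look something like the following (this is purely my guess at a variant consistent with the description above; the notation $s(x,y)$ for the strength of the preference $x \prec y$ is mine, not from the post):

$$\min_u \; \sum_{x \prec y} \frac{1}{s(x,y)} \bigl(u(y) - u(x) - 1\bigr)^2,$$

so that strongly held preferences are penalised less for stretching the gap $u(y)-u(x)$ beyond one unit.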

• This seems really neat, but it seems quite sensitive to how one defines the worlds under consideration, and to whether one counts slightly different worlds as actually distinct. Let me try to illustrate this with an example.

Suppose we have a consisting of 7 worlds, , with preferences
and no other non-trivial preferences. Then (from the ‘sensible case’), I think we get the following utilities:

.

Suppose now that I create two new copies , of the world which each differ by the position of a single atom, so as to give me (extremely weak!) preferences , so all the non-trivial preferences in the new are now summarised as

Then the resulting utilities are (I think):

.

In particular, before adding in these ‘trivial copies’ we had , and now we get . Is this a problem? It depends on the situation, but to me it suggests that, if using this approach, one needs to be careful about how the worlds are specified, and the ‘fine-grainedness’ needs to be roughly the same everywhere.

• Thanks! I like the way your optimisation problem handles non-closed cycles.

I think I’m less comfortable with how it treats disconnected components: as I understand it, you just translate each separately to have ‘centre of mass’ at 0. If one wants to get a utility function out at the end, one has to make some kind of choice in this situation, and the choice you make is probably the best one, so in that sense it seems very good.

But for example it seems vulnerable to creating ‘virtual copies’ of worlds in order to shift the centre of mass and push connected components one way or the other. That was what started me thinking about including strength of preference: if one adds to your setup a bunch of virtual copies of a world between which one is ‘almost indifferent’, then it seems it will shift the centre of mass, and thus the utility relative to some other chain. Of course, if one is actually indifferent then the ‘virtual copies’ will be collapsed to a single point in your , but if they are just extremely close then it seems it will affect the utility relative to some other chain. I’ll try to explain this more clearly in a comment to your post.
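To make the ‘virtual copies’ worry concrete, here is a toy sketch (my own reconstruction, not the formulas from the post: I assume each strict preference in a chain contributes one utility unit, and each connected component is then translated to have ‘centre of mass’ 0):

```python
def chain_utilities(chains):
    """Toy version of the centre-of-mass normalisation.

    Each chain lists worlds from least to most preferred.  Only the
    ordering is known, so consecutive worlds sit one utility unit
    apart; each chain is then translated so its mean utility is 0.
    """
    utilities = {}
    for chain in chains:
        mean_rank = (len(chain) - 1) / 2
        for rank, world in enumerate(chain):
            utilities[world] = rank - mean_rank
    return utilities

# Two disconnected components: w1 < w2 < w3 and w4 < w5 < w6 < w7.
before = chain_utilities([["w1", "w2", "w3"],
                          ["w4", "w5", "w6", "w7"]])

# Add two virtual copies of w7, each preferred by an extremely weak
# margin (w7 < w7a < w7b); ordinally this just lengthens the chain.
after = chain_utilities([["w1", "w2", "w3"],
                         ["w4", "w5", "w6", "w7", "w7a", "w7b"]])

print(before["w7"] - before["w3"])  # 0.5: w7 beats w3...
print(after["w7"] - after["w3"])    # -0.5: ...now w3 beats w7
```

The near-indifferent copies drag the centre of mass of the second component, flipping the cross-component comparison between w7 and w3 even though no genuine preference changed.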

• Thanks for the comment, Charlie.

> If I am indifferent to a gamble with a probability of ice cream, and a probability 0.8 of chocolate cake and 0.2 of going hungry

To check I understand correctly, you mean the agent is indifferent between the gambles (probability of ice cream) and (probability 0.8 of chocolate cake, probability 0.2 of going hungry)?

If I understand correctly, you’re describing a variant of Von Neumann–Morgenstern where, instead of giving preferences among all lotteries, you’re specifying a certain collection of pairs of lotteries of a special type between which the agent is indifferent, together with a sign to say in which ‘direction’ things become preferred? It seems likely to me then that the data you give can be used to reconstruct preferences between all lotteries...

If one is given information in the form you propose, but only for an ‘incomplete’ set of special triples (c.f. ‘weak preferences’ above), then one can again ask whether and in how many ways it can be extended to a complete set of preferences. It feels to me as if there is an extra ambiguity coming in with your description: for example, if the set of possible outcomes has 6 elements and I am given the value of the Betterness function on two disjoint triples, then to generate a utility function I have to choose not only a ‘translation’ between the two triples, but also a scaling. But maybe this is better/more realistic!

By ‘special types’, I mean indifference between pairs of gambles of the form
(probability of A) vs (probability of B and probability of C)
for some , and possible outcomes A, B, C. Then the sign says that I prefer a higher probability of B (say).
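As a sketch of how such data could pin down the preferences (hypothetical code and outcome names of my own, using Charlie’s ice-cream example): each indifference of the special type gives a linear constraint of the shape u(A) = p·u(B) + (1−p)·u(C), and with enough triples the utilities are determined once the affine freedom (the ‘translation’ and ‘scaling’ above) is fixed.

```python
import numpy as np

def utilities_from_indifferences(outcomes, triples, anchors):
    """Recover a utility function from special-type indifference data.

    Each triple (A, p, B, C) encodes the indifference
        (A for sure)  ~  (B with probability p, C with probability 1-p),
    i.e. the linear constraint u(A) - p*u(B) - (1-p)*u(C) = 0.
    `anchors` removes the affine freedom by pinning two utilities.
    """
    idx = {o: i for i, o in enumerate(outcomes)}
    rows, rhs = [], []
    for a, p, b, c in triples:
        row = np.zeros(len(outcomes))
        row[idx[a]] += 1.0
        row[idx[b]] -= p
        row[idx[c]] -= 1.0 - p
        rows.append(row)
        rhs.append(0.0)
    for o, value in anchors.items():
        row = np.zeros(len(outcomes))
        row[idx[o]] = 1.0
        rows.append(row)
        rhs.append(value)
    u, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return dict(zip(outcomes, u))

# Charlie's example: (ice cream for sure) ~ (0.8 cake, 0.2 hungry),
# with the translation/scaling fixed by u(hungry)=0, u(cake)=1.
u = utilities_from_indifferences(
    ["hungry", "ice cream", "cake"],
    [("ice cream", 0.8, "cake", "hungry")],
    {"hungry": 0.0, "cake": 1.0},
)
print(round(u["ice cream"], 6))  # 0.8
```

With fewer triples than needed to connect all outcomes, the least-squares system is underdetermined, which is exactly the translation-plus-scaling ambiguity described above.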

• Thanks for pointing me to this updated version :-). This seems a really neat trick for writing down a utility function that is compatible with the given preorder. I thought a bit more about when/to what extent such a utility function will be unique, in particular if you are given not only the data of a preorder, but also some information on the strengths of the preferences. This ended up a bit too long for a comment, so I wrote a few things in outline here:

https://www.lesswrong.com/posts/7ncFy84ReMFW7TDG6/categorial-preferences-and-utility-functions

It may be quite irrelevant to what you’re aiming for here, but I thought it was maybe worth writing down just in case.

# Categorial preferences and utility functions

9 Aug 2019 21:36 UTC
9 points
• Hi Stuart,

I’m working my way through your ‘Research Agenda v0.9’ post, and am therefore going through various older posts to understand things. I wonder if I could ask some questions about the definition you propose here?

First, that be contained in for some seems not so relevant; can I just assume X, Y and Z are some manifolds ( for some )? And we are given some partial order on X, so that we can refer to being a ‘better world’?

Then, as I understand it, your definition says the following:

Fix X, and Z. Let Y be a manifold and , . Given a local homeomorphism , we say that is partially preferred to if for all , we have .

I’m not sure which inequalities should be strict, but this seems non-essential for now. On the other hand, the dependence of this definition on the choice of Y seems somewhat subtle and interesting. I will try to illustrate this in what follows.

First, let us make a new definition. Fix X, , and Z as before. Let , a two-element set equipped with the discrete topology, and let be an immersion of -manifolds. We say that is weakly partially preferred to if for all , we have .

First, it is clear that partial preference implies weak partial preference. More formally:

Claim 1: Fix X, and Z. Suppose we have a manifold Y, points , , and a local homeomorphism such that is partially preferred to . Setting with the subspace topology from (i.e. discrete), and taking to be the restriction of from to , we have that is weakly partially preferred to .

Proof: obvious. ∎

However, the converse can fail if Z is not contractible. First, let’s prove that the concepts are equivalent for Z contractible:

Claim 2: Fix X, and Z, and assume that Z is contractible. Suppose we have a two-element set and a map making weakly partially preferred to . Then there exist a manifold Y, an injection , and a local homeomorphism whose restriction to is , making partially preferred to .

Proof: Let’s assume for simplicity of notation that X is equidimensional, say of dimension , and write for the dimension of Z. Let Y be the disjoint union of two open balls of dimension , with the inclusion of the centres of the balls. Then take an -neighbourhood of Z in X; it is diffeomorphic to since the normal bundle to Z in X is trivialisable (c.f. https://math.stackexchange.com/questions/857784/product-neighborhood-theorem-with-boundary). ∎

If we want examples where weak partial preference and partial preference don’t coincide, we should look for an example where Z is not contractible and its normal bundle in X is not trivialisable.

Example 3: Let X be the disjoint union of two Möbius bands, and let Z be a circle. Note that including Z along the centre of either band gives a submanifold whose tubular neighbourhood is not a product. Assume that is such that one component of X is preferred to the other (and is indifferent within each connected component). Then take , and to be the inclusion of the two circles along the centres of the two Möbius bands, such that ends up in the preferred band. This yields a situation where is weakly partially preferred to , but the conclusion of Claim 2 fails, i.e. this cannot be extended to a partial preference for over .

What conclusion should we draw from this? To me, it suggests that the notion of partial preference is not yet quite as one would want. In the setting of Example 3, where X consists of two Möbius strips, one of which is preferred to the other, landing in the preferred strip should surely be preferred to landing in the un-preferred strip?! And yet the ‘local homeomorphism from a product’ condition gets in the way. This example is obviously quite artificial, and maybe analogous things cannot occur in reality. But I’m not so happy with this as an answer, since our approaches to AI safety should be (so far as possible) robust against the flaws in our understanding of physics.

Apologies for the overly long comment, and for the imperfect LaTeX (I’ve not used this type of form much before).

• Thanks for the reply, Zack.

> The reason this objection doesn’t make the post completely useless...

Sorry, I hope I didn’t suggest I thought that! You make a good point about some variables being more natural in given applications. I think it’s good to keep in mind that sometimes it’s just a matter of coordinate choice, and other times the points may be separated, but not in a linear way.

• Hi Zack,

Can you clarify something? In the picture you draw, there is a codimension-1 linear subspace separating the parameter space into two halves, with all red points to one side and all blue points to the other. Projecting onto any 1-dimensional subspace orthogonal to this (there is a unique one through the origin) will thus yield a ‘variable’ which cleanly separates the points into the red and blue categories. So in the illustrated example, it looks just like a problem of bad coordinate choice.

On the other hand, one can easily have much more pathological situations; for example, the red points could all lie inside a certain sphere and the blue points outside it. Then no choice of linear coordinates will illustrate this, and one has to use more advanced analysis techniques to pick up on it (e.g. persistent homology).
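A minimal sketch of this second situation (toy data of my own, not from the post): red points on a sphere of radius 0.5, blue points on a sphere of radius 1.5. The nonlinear feature ‘distance from the origin’ separates them perfectly, while no projection onto a single direction can.

```python
import numpy as np

# Toy data: red points on a sphere of radius 0.5, blue on radius 1.5,
# so red lies 'inside' blue and no hyperplane separates the classes.
axes = np.eye(3)
red = np.concatenate([0.5 * axes, -0.5 * axes])
blue = np.concatenate([1.5 * axes, -1.5 * axes])

def linearly_separable(a, b, direction):
    """Does some threshold on the projection onto `direction` split a from b?"""
    pa, pb = a @ direction, b @ direction
    return pa.max() < pb.min() or pb.max() < pa.min()

# The nonlinear feature 'distance from the origin' separates cleanly:
print(np.linalg.norm(red, axis=1).max()
      < np.linalg.norm(blue, axis=1).min())  # True

# But no projection onto a line separates: blue surrounds red.
for d in [np.array([1.0, 0.0, 0.0]),
          np.array([1.0, 1.0, 1.0]),
          np.array([0.3, -0.7, 0.2])]:
    print(linearly_separable(red, blue, d))  # False each time
```

Detecting this kind of nested structure automatically is exactly where tools like persistent homology come in.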

So, to my vague question: do you have only the first situation in mind, or are you also considering the general case, but made the illustrated example extra-simple?

Perhaps this is clarified by your numerical example; I’m afraid I’ve not checked.