Intuitive supergoal uncertainty

There is a common intuition and feeling that our most fundamental goals may be uncertain in some sense. What causes this intuition? For this topic I need to be able to pick out one's top-level goals, roughly one's context-insensitive utility function rather than some task-specific utility function, and I do not want to imply that the top-level goals can necessarily be interpreted in the form of a utility function. Following Eliezer's CFAI paper, I thus choose the word "supergoal" (sorry Eliezer, but I am fond of that old document and its tendency to coin new vocabulary). In what follows, I will naturalistically explore the intuition of supergoal uncertainty.

To posit a model, what goal uncertainty (including supergoal uncertainty as an instance) means is that you have a weighted distribution over a set of possible goals and a mechanism by which that weight may be redistributed. If we take away the distribution of weights, how can we choose actions coherently, and how can we compare options? If we take away the weight redistribution mechanism, we end up with a single goal whose state utilities may be defined as the weighted sum of the constituent goals' utilities; the weight redistribution mechanism is thus necessary for goal uncertainty to be a distinct concept.
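To make the reduction in that last step concrete, here is a minimal Python sketch (my own construction; the two candidate goals are made up): with fixed weights and no redistribution mechanism, a distribution over goals is behaviorally identical to the single utility function given by the weighted sum of its constituents.

```python
# Minimal sketch of the reduction above (my own toy example).
# A weighted distribution over goals with *fixed* weights collapses to a
# single utility function: the weighted sum of the constituent utilities.

def mixture_utility(weights, utilities):
    """Return the single utility function equivalent to a fixed-weight
    distribution over the given candidate utility functions."""
    def u(state):
        return sum(w * f(state) for w, f in zip(weights, utilities))
    return u

# Two made-up candidate supergoals over a toy numeric "state".
paperclips = lambda s: 2.0 * s   # more state, more paperclips
staples = lambda s: 5.0 - s      # a competing, conflicting goal

u = mixture_utility([0.7, 0.3], [paperclips, staples])
print(u(1.0))  # 0.7 * 2.0 + 0.3 * 4.0 = 2.6
```

Only a weight *redistribution* mechanism, i.e. letting the 0.7/0.3 split itself change over time, distinguishes genuine goal uncertainty from this single collapsed goal.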

  • Part of the intuition of supergoal uncertainty naturally follows from goal and planning uncertainty. What plan of action is best? How should I construct a local utility function for this context? The weight redistribution mechanism is then a result of gathering more evidence, calculating further, and seeing how this goal links up to one's supergoals in the context of other plans.

  • It could be we are mistaken. The rules society sets up to coordinate behavior carry the default assumption that there is an absolute standard by which to judge a behavior good or bad. Religions, too, often dictate that there is a moral absolute, and even if we aren't religious, the cultural milieu makes us consider the possibility that the concept "should" (and so the existence of a weight redistribution mechanism) can be validly applied to supergoals.

  • It could be we are confused. Our professed supergoal does not necessarily equal our actual supergoal, but neither are they completely separate. So when we review our past behavior and introspect to determine what our supergoal is, we get conflicting evidence that is hard to reconcile with the belief that we have one simple supergoal. Nor can we necessarily endorse the observed supergoal, for social reasons. The difficulty in describing the supergoal is then represented as a weighting over possible supergoals, and the weight redistribution mechanism corresponds to updating our self-model given additional observations and introspections, along with varying the social context.

  • It could be I'm confused :) People, including some part of me, may wish to present the illusion that we do not know what our supergoals truly are, and also pretend that those supergoals are malleable to change after argument, for game-theoretic reasons.

  • It could be we are incoherent. As Allais's paradox, hyperbolic discounting, and circular preferences show, no utility function may be defined for people (at least in any simple way). How then may we approximate a person's behavior with a utility function/supergoal? Using a weight distribution and updating (along with some additional interpretation machinery) is a plausible possibility (though an admittedly ugly one). Perhaps supergoal uncertainty is a kludge to describe this incoherent behavior. Our environments, social and physical, enforce consistency constraints upon us, coming close to making us expectation maximizers in isolated contexts. Could something like a weighting based on the probability of encountering each of those contexts define our individual supergoals? Ugly, ugly, ugly.

  • It could be we predict our supergoals will change with time. Who said people have stable goals? Just look at children versus adults, or the changes people undergo when they gain status or have children. Perhaps the uncertainty has to do with what we predict our future supergoals will be in the face of future circumstances and arguments.

  • It could be we discover our supergoals and have uncertainty over what we will discover, and over what we would eventually get at the limit of our exploration. At one point I had rather limited exposure to the various types of foods, but now find I like exploring taste space. At one point I didn't know computer science, but now I enjoy its beauty. At one point I hadn't yet pursued women, but now find it quite enjoyable. Some things we apparently just have to try (or at the very least think about) to discover whether we like them.

  • It could be we cannot completely separate our anticipations from our goals. If our anticipations are slowly updating, systematically off, and coherent in their effect on reality, then it is easy to mistake the interaction of flawed anticipations plus a supergoal for having an entirely different supergoal.

  • It could be we have uncertainty over how to define our very selves. If your self-definition doesn't include irrational behavior, or selfishness, or system 1, or does include the Google overmind, then "your" goals are going to look quite different depending on what you include and exclude in your self-definition. It is also possible your utility function doesn't depend upon self-definition, or that you are "by definition" your utility function, and this question is moot.

  • It could be that environmental constraints cause some supergoals to express themselves equivalently to other supergoals. Perhaps your supergoal could be forever deferred in order to gain the capability to achieve it ever better (likely a rare situation). Perhaps big-world anthropic negotiation arguments mean you must always distort the achievement of your supergoal. Perhaps the meta-golden rule is in effect and "social" conditions force you to constrain your behavior.

  • It could be that there really is a way to decide between supergoals (unlikely but still conceivable) and we don't yet know where that decision process will take us. There could even actually be a meaning of life (i.e., a universally convincing supergoal given some intelligence preconditions) after all.

  • It could be caused by evidential and logical uncertainty. A mind is made of many parts, and there are constraints on how much each part can know about the others, or the whole about its components. To show how this implies a form of supergoal uncertainty, partition a mind along functional components. Each of these components has its own function; it may not be able to achieve it without the rest of the components, but nevertheless it is there. Now, if the optimization power embedded in a component is large enough, and the system as a whole has evidential or logical uncertainty about how that component will work, you get the possibility that this functional subcomponent will "optimize" its way towards getting greater weight in the decision process, and hierarchically this proceeds for all subcomponent optimizers. So, in essence, whenever there is evidential or logical uncertainty about the operations of an optimizing subcomponent, we get a supergoal term corresponding to that part, and the weight redistribution mechanism corresponds to that subcomponent co-opting some weight. Perhaps this concept can even be extended to define supergoals (with uncertainty) for everything from pure expectation maximizers to rocks.

  • It could be uncertainty over how to ground out the definition of the supergoal in reality. If I want to maximize paperclips and have just now learned quantum mechanics, do I count a paperclip in a superposition of states once or many times? If I could produce one infinite bunch of paperclips versus another, how do I choose? If my utility function is unbounded in both the positive and negative directions and I do Solomonoff induction, how can I make decisions at all, given that actions may have values that are undefined?

  • It could be that some mysterious underlying factor makes the formal concept of a supergoal inappropriate, and it is this mismatched fit that causes the uncertainty and weight redistribution. Unification of priors and values in updateless decision theory? Something else? The universe is still unknown enough that we could be mistaken on this level.

  • It could be something else entirely.
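On the incoherence bullet above: a strict preference cycle is exactly the condition under which no real-valued utility function can exist, since u(a) > u(b) > u(c) > u(a) is impossible. A toy cycle check (my own illustration, not anything from the decision theory literature) makes this concrete.

```python
# Toy illustration (my own) of why circular preferences rule out a utility
# function: representing "x strictly preferred to y" as an edge x -> y, a
# utility function exists only if the preference graph has no cycle.
from collections import defaultdict

def representable_by_utility(prefs):
    """prefs: list of (x, y) pairs meaning 'x is strictly preferred to y'.
    Returns False iff the preferences contain a cycle (DFS cycle detection),
    in which case no assignment of real numbers can respect them all."""
    graph = defaultdict(list)
    for x, y in prefs:
        graph[x].append(y)

    visited, on_stack = set(), set()

    def has_cycle(node):
        visited.add(node)
        on_stack.add(node)
        for nxt in graph[node]:
            if nxt in on_stack or (nxt not in visited and has_cycle(nxt)):
                return True
        on_stack.discard(node)
        return False

    return not any(has_cycle(n) for n in list(graph) if n not in visited)

print(representable_by_utility([("a", "b"), ("b", "c")]))              # True
print(representable_by_utility([("a", "b"), ("b", "c"), ("c", "a")]))  # False
```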
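The weight co-opting story in the evidential/logical uncertainty bullet can be caricatured as a very loose toy model (entirely my own framing, with made-up numbers): if each subcomponent's share of decision weight grows with its "optimization power", then from the outside the mind looks like it has a shifting weighting over supergoals.

```python
# Very loose toy model (my own framing) of subcomponents co-opting decision
# weight: each step, a component's weight grows in proportion to its
# "optimization power", then all weights are renormalized to sum to 1.

def redistribute(weights, power, rate=0.1):
    raw = [w * (1.0 + rate * p) for w, p in zip(weights, power)]
    total = sum(raw)
    return [r / total for r in raw]

weights = [0.5, 0.3, 0.2]   # initial decision weights of three components
power = [0.2, 1.0, 0.1]     # hypothetical optimization power of each

for _ in range(50):
    weights = redistribute(weights, power)

# The high-power component ends up dominating the decision process.
print([round(w, 3) for w in weights])
```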
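The undefined-values worry in the grounding bullet can be shown numerically with the classic alternating harmonic series (my example; nothing specific to Solomonoff induction): a conditionally convergent sum of unbounded positive and negative terms has no order-independent value, so an "expected utility" built from such a sum is simply not defined.

```python
# My illustration of undefined expected values under unbounded utility:
# the same countable set of positive and negative terms sums to different
# values depending on the order of summation (Riemann rearrangement).

N = 100_000

# Ordering 1: +1 - 1/2 + 1/3 - 1/4 + ...  -> approaches ln(2), ~0.693
order1 = sum((-1) ** (k + 1) / k for k in range(1, N + 1))

# Ordering 2: the same terms, taken two positives per one negative,
# approaches (3/2) * ln(2), ~1.04, instead.
pos = [1.0 / k for k in range(1, 2 * N, 2)]   # 1, 1/3, 1/5, ...
neg = [1.0 / k for k in range(2, N + 1, 2)]   # 1/2, 1/4, ...
order2 = sum(pos) - sum(neg)

print(round(order1, 3), round(order2, 3))  # roughly 0.693 vs 1.04
```

If each term were the utility contribution of one possible world, a decision procedure required to total them up would have no canonical answer.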

(ps I may soon post and explore the effects of supergoal uncertainty, in its various reifications, on making decisions. For instance, what implications, if any, does it have for bounded utility functions (and actions that depend on those bounds) and negative utilitarianism (or, symmetrically, positive utilitarianism)? Also, if anyone knows of related literature, I would be happy to check it out.)

(pps Dang, the concept of supergoal uncertainty is surprisingly beautiful and fun to explore, and I now have a vague wisp of an idea of how to integrate a subset of these with TDT/UDT)