The Human’s Hidden Utility Function (Maybe)

Suppose it turned out that humans violate the axioms of VNM rationality (and therefore don’t act like they have utility functions) because there are three valuation systems in the brain that make conflicting valuations, and all three systems contribute to choice. And suppose that upon reflection we would clearly reject the outputs of two of these systems, whereas the third system looks something more like a utility function we might be able to use in CEV.

What I just described is part of the leading theory of choice in the human brain.

Recall that human choices are made when certain populations of neurons encode expected subjective value (in their firing rates) for each option in the choice set, with the final choice being made by an argmax or reservation price mechanism.
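
To make this concrete, here is a minimal sketch of the final argmax step (my own illustration, not code from any of the sources cited); the option names and values are hypothetical:

```python
# Minimal sketch of the final choice step: each option in the choice set
# arrives with a combined subjective value, and the choice circuit simply
# picks the option with the highest value. (Illustrative names and numbers.)

def choose(option_values):
    """Return the option with the highest subjective value."""
    return max(option_values, key=option_values.get)

values = {"press_lever": 0.8, "groom": 0.2, "explore": 0.5}
print(choose(values))  # -> press_lever
```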

Today’s news is that our best current theory of human choices says that at least three different systems compute “values” that are then fed into the final choice circuit:

  • The model-based system “uses experience in the environment to learn a model of the transition distribution, outcomes and motivationally-sensitive utilities.” (See Sutton & Barto 1998 for the meanings of these terms in reinforcement learning theory.) The model-based system also “infers choices by… building and evaluating the search decision tree to work out the optimal course of action.” In short, the model-based system is responsible for goal-directed behavior. However, making all choices with a goal-directed system using something like a utility function would be computationally prohibitive (Daw et al. 2005), so many animals (including humans) first evolved much simpler methods for calculating the subjective values of options (see below).

  • The model-free system also learns a model of the transition distribution and outcomes from experience, but “it does so by caching and then recalling the results of experience rather than building and searching the tree of possibilities. Thus, the model-free controller does not even represent the outcomes… that underlie the utilities, and is therefore not in any position to change the estimate of its values if the motivational state changes. Consider, for instance, the case that after a subject has been taught to press a lever to get some cheese, the cheese is poisoned, so it is no longer worth eating. The model-free system would learn the utility of pressing the lever, but would not have the informational wherewithal to realize that this utility had changed when the cheese had been poisoned. Thus it would continue to insist upon pressing the lever. This is an example of motivational insensitivity.”

  • The Pavlovian system, in contrast, calculates values based on a set of hard-wired preparatory and consummatory “preferences.” Rather than calculate value based on what is likely to lead to rewarding and punishing outcomes, the Pavlovian system calculates values consistent with automatic approach toward appetitive stimuli, and automatic withdrawal from aversive stimuli. Thus, “animals cannot help but approach (rather than run away from) a source of food, even if the experimenter has cruelly arranged things in a looking-glass world so that the approach appears to make the food recede, whereas retreating would make the food more accessible (Hershberger 1986).” (A toy sketch contrasting how the three systems value the same action follows this list.)
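
Below is that toy sketch, purely my own construction using standard reinforcement-learning vocabulary rather than code from Dayan (2011) or Daw et al. (2005). The lever-and-cheese setup, the numbers, and the revaluation step are illustrative assumptions based on the poisoned-cheese example above:

```python
# Toy illustration (hypothetical names and numbers) of three valuation systems
# scoring the same action after the cheese has been poisoned.

# Learned world model: pressing the lever yields cheese.
transition = {"press_lever": "cheese"}

# Current motivational state: the utility of each outcome *right now*.
outcome_utility = {"cheese": 1.0}

# Model-free cache: the value of the action learned from past experience.
cached_value = {"press_lever": 1.0}

# Hard-wired Pavlovian "preference": approach anything in an appetitive class.
appetitive = {"cheese": True}

def model_based_value(action):
    # Re-evaluates the action by looking up its predicted outcome in the model
    # and consulting that outcome's *current* utility.
    return outcome_utility[transition[action]]

def model_free_value(action):
    # Recalls the cached result of past experience; blind to motivational change.
    return cached_value[action]

def pavlovian_value(action):
    # Values approach toward appetitive stimulus classes, regardless of the
    # instrumental consequences (in this toy, the class label never updates).
    return 1.0 if appetitive[transition[action]] else -1.0

# The cheese is poisoned: its current utility flips, but only the model-based
# system notices.
outcome_utility["cheese"] = -1.0

print(model_based_value("press_lever"))  # -1.0: revalued immediately
print(model_free_value("press_lever"))   #  1.0: stale cache, keeps pressing
print(pavlovian_value("press_lever"))    #  1.0: hard-wired approach
```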

Or, as Jandila put it:

  • Model-based system: Figure out what’s going on, and what actions maximize returns, and do them.

  • Model-free system: Do the thingy that worked before again!

  • Pavlovian system: Avoid the unpleasant thing and go to the pleasant thing. Repeat as necessary.

In short:

We have described three systems that are involved in making choices. Even in the case that they share a single, Platonic, utility function for outcomes, the choices they express can be quite different. The model-based controller comes closest to being Platonically appropriate… The choices of the model-free controller can depart from current utilities because it has learned or cached a set of values that may no longer be correct. Pavlovian choices, though determined over the course of evolution to be appropriate, can turn out to be instrumentally catastrophic in any given experimental domain...

[Having multiple systems that calculate value] is [one way] of addressing the complexities mentioned, but can lead to clashes between Platonic utility and choice. Further, model-free and Pavlovian choices can themselves be inconsistent with their own utilities.

We don’t yet know how choice results from the inputs of these three systems, nor how the systems might interact before they deliver their value calculations to the final choice circuit, nor whether the model-based system really uses anything like a coherent utility function. But it looks like the human might have a “hidden” utility function that would reveal itself if it weren’t also using the computationally cheaper model-free and Pavlovian systems to help determine choice.

At a glance, it seems that upon reflection I might embrace an extrapolation of the model-based system’s preferences as representing “my values,” while rejecting the outputs of the model-free and Pavlovian systems as the products of dumb systems that evolved for their computational simplicity and that merely approximate the full power of the goal-directed, model-based system.

On the other hand, as Eliezer points out, perhaps we ought to be suspicious of this, because “it sounds like the correct answer ought to be to just keep the part with the coherent utility function in CEV which would make it way easier, but then someone’s going to jump up and say: ‘Ha ha! Love and friendship were actually in the other two!’”

Unfortunately, it’s too early to tell whether these results will be useful for CEV. But it’s a little promising. This is the kind of thing that sometimes happens when you hack away at the edges of hard problems. This is also a repeat of the lesson that “you can often out-pace most philosophers simply by reading what today’s leading scientists have to say about a given topic instead of reading what philosophers say about it.”

(For pointers to the relevant experimental data, and for an explanation of the mathematical role of each valuation system in the brain’s reinforcement learning system, see Dayan (2011). All quotes in this post are from that chapter, except for the last one.)