Predictive coding = RL + SL + Bayes + MPC

I was confused and skeptical for quite a while about some aspects of predictive coding—and it’s possible I’m still confused—but after reading a number of different perspectives on brain algorithms, the following picture popped into my head and I felt much better:

This is supposed to be a high-level perspective on how the neocortex[1] builds a predictive world-model and uses it to choose appropriate actions. (a) We generate a bunch of generative models in parallel, which make predictions about what’s going on and what’s going to happen next, including what I am doing and will do next (i.e., my plans). The models gain “prominence” by (b) correctly predicting upcoming sensory inputs; (c) correctly predicting other types of input information coming into the neocortex like tiredness, hormonal signals, hunger, warmth, pain, pleasure, reward, and so on; (d) being compatible with other already-prominent models; (e) predicting that a large reward signal is coming, which, as discussed in my later article Inner Alignment in the Brain, includes things like predicting that my goals will be fulfilled with minimal effort, I’ll be eating soon if I’m hungry, I’ll be sleeping soon if I’m tired, I’ll avoid pain, and so on. Whatever candidate generative model winds up the most “prominent” wins, and determines my beliefs and actions going forward.
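
To make this functional picture concrete, here is a deliberately toy sketch in Python. Every name and number is invented for illustration; it is not a claim about neural implementation, and the simple additive scoring is exactly the kind of oversimplification I apologize for just below:

```python
# Toy illustration only: candidate generative models competing via (b)-(e).
from dataclasses import dataclass

@dataclass
class GenerativeModel:
    name: str
    sensory_fit: float       # (b) how well it predicted upcoming sensory inputs
    visceral_fit: float      # (c) how well it predicted hunger/pain/reward-type inputs
    prior_compat: float      # (d) compatibility with already-prominent models
    predicted_reward: float  # (e) how much reward it predicts is coming
    prominence: float = 0.0

def update_prominence(models):
    # Naive additive scoring; see the caveats below: the real combination is
    # surely not linear, and lower-ranked models are not discarded afterwards.
    for m in models:
        m.prominence += (m.sensory_fit + m.visceral_fit
                         + m.prior_compat + m.predicted_reward)
    return max(models, key=lambda m: m.prominence)

candidates = [
    GenerativeModel("I'm about to get up and eat a snack", 0.6, 0.7, 0.8, 0.9),
    GenerativeModel("I'm about to keep sitting here", 0.6, 0.5, 0.8, 0.2),
]
winner = update_prominence(candidates)
print(winner.name)  # the most prominent model guides beliefs and actions
```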

(Note on terminology: I’m calling these things “generative models” in all cases. Kurzweil calls them “patterns”. They’re also sometimes called “hypotheses”, especially in the context of passive observation (e.g. “that thing I see is a bouncy ball”). Or they’re called “subagents”[2], especially in the context of self-prediction (e.g. “I am about to eat”).)

Before we get to details, I need to apologize for the picture being misleading:

  • First, I drew (b,c,d,e) as happening after (a), but really some of these (especially (d) I think) work by affecting which models get considered in the first place. (More generally, I do not want to imply that a,b,c,d,e correspond to exactly five distinct neural mechanisms, or anything like that. I’m just going for a functional perspective in this post.)

  • Second (and relatedly), I depicted it as if we simply add up points for (b-e), but it’s certainly not linear like that. I think at least some of the considerations effectively get vetoes (see the toy sketch just after this list). For example, we don’t generally see a situation where (e) is so positive that it simply outvotes (b-d), and thus we spend all day checking our wallet expecting to find it magically filled with crisp $1000 bills. (Much more about wishful thinking below.)

  • Third, at the bottom I drew one generative model being the “winner”. Things like action plans and conscious attention do in fact have a winner-take-all dynamic because, for example, we don’t want to be sending out muscle commands for both walking and sitting simultaneously.[3] But in general, lower-ranked models are not thrown out; they linger, with their prominence growing or shrinking as more evidence comes in.
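
Picking up the second caveat, using the toy GenerativeModel from the sketch above: instead of summing everything, imagine (b-d) acting more like gates, so that no amount of predicted reward can rescue an implausible model. Again, purely illustrative:

```python
def combine_with_vetoes(m, plausibility_floor=0.5):
    # Toy alternative to the additive scoring above: if any of the (b)-(d)
    # scores is too low, the model is effectively vetoed, no matter how much
    # reward it predicts (no wallets magically full of crisp $1000 bills).
    if min(m.sensory_fit, m.visceral_fit, m.prior_compat) < plausibility_floor:
        return float("-inf")
    return m.sensory_fit + m.visceral_fit + m.prior_compat + m.predicted_reward
```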

Anyway, the picture above tells a nice story:

(b) is self-supervised learning[4], i.e. learning from prediction. Process (b) simply votes against generative models when they make incorrect predictions. This process is where we get the vast majority of the information content we need to build a good predictive world-model. Note that there doesn’t seem to be any strong difference in the brain between (i) actual experiences, (ii) memory recall, and (iii) imagination—process (b) will vote for or against models when presented with any of those three types of “evidence”. (I think the votes are much stronger in case (i), though.)
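
Here is a minimal sketch of that voting process; the error metric and the evidence_weight knob are my own illustrative inventions:

```python
def self_supervised_vote(prominence, predicted_input, actual_input,
                         evidence_weight=1.0):
    # Process (b): vote against a model in proportion to its prediction error.
    # An evidence_weight below 1.0 would correspond to memory recall or
    # imagination, which (I think) vote more weakly than actual experience.
    return prominence - evidence_weight * abs(predicted_input - actual_input)
```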

(c) is credit assignment, i.e. learning what aspects of the world cause good or bad things to happen to us, so that we can make good decisions. Each generative model makes claims about what is the cause of subcortex-provided informational signals (analogous to “reward” in RL)—information signals that say we’re in pain, or eating yummy food, or exhausted, or scared, etc. These claims cash out as predictions that can prove right or wrong, thus either supporting or casting doubt on that model. Thus our internal models say that “cookies are yummy”, corresponding to a prediction that, if we eat one, we’ll get a “yummy” signal from some ancient reptilian part of our brain.
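
In the same toy spirit, “cookies are yummy” amounts to a little table of predicted subcortical signals that gets checked when the signals actually arrive (all names here are made up for illustration):

```python
def credit_assignment_vote(prominence, predicted_signals, actual_signals):
    # Process (c): each claim about what causes a subcortex-provided signal
    # cashes out as a checkable prediction; the model gains prominence for
    # each prediction borne out and loses it otherwise.
    for signal, predicted in predicted_signals.items():
        prominence += 1.0 if actual_signals.get(signal) == predicted else -1.0
    return prominence

# e.g. the "cookies are yummy" model predicts a "yummy" signal upon eating one:
print(credit_assignment_vote(0.0, {"yummy": True}, {"yummy": True}))  # 1.0
```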

(d) is Bayesian priors. I doubt we do Bayesian updating in a literal mathematical sense, but we certainly do incorporate prior beliefs into our interpretation of new evidence. I’m claiming that the mechanism for this is “models gain prominence by being compatible with already-prominent models”. What is an “already-prominent model”? One that has previously been successful in this same process I’m describing here, especially if in similar contexts, and super-especially if in the immediate past. Such models function as our priors. And what does it mean for a new model to be “compatible” with these prior models? Well, a critical fact about these models is that they snap together like Legos, allowing hierarchies, recursion, composition, analogies, causal relationships, and so on. (Thus, I’ve never seen a rubber wine glass, but I can easily create a mental model of one by gluing together some of my rubber-related generative models with some of my wine-glass-related generative models.) Over time we build up these super-complicated and intricate Rube Goldberg models, approximately describing our even-more-complicated world. I think a new model is “compatible” with a prior one when (1) the new model is almost the same as the prior model apart from just one or two simple edits, like adding a new bridging connection to a different already-known model; and/or (2) when the new model doesn’t make predictions counter to the prior one, at least not in areas where the prior one is very precise and confident.[5] Something like that anyway, I think...
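
For those who like equations, the loose correspondence I have in mind is just Bayes’ rule in log form, with (b,c) playing the role of the likelihood and (d) the role of the prior; the last term is the same for every model, so it doesn’t affect the competition. I doubt the brain computes anything this literal:

```latex
\log P(m \mid \text{evidence}) =
  \underbrace{\log P(\text{evidence} \mid m)}_{\text{(b),(c): prediction accuracy}}
  + \underbrace{\log P(m)}_{\text{(d): compatibility with prior models}}
  - \log P(\text{evidence})
```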

(e) is Model-Predictive Control. If we’re hungry, we give extra points to a generative model that says we’re about to get up and eat a snack, and so on. This works in tandem with credit assignment (process (c)), so if we have a prominent model that giving speeches will lead to embarrassment, then we will subtract points from a new model that we will give a speech tomorrow, and we don’t need to run the model all the way through to the part where we get embarrassed. I like Kaj Sotala’s description here: “mental representations...[are] imbued with a context-sensitive affective gloss”—in this case, the mental representation of “I will give a speech” is infused with a negative “will lead to embarrassment” vibe, and models lose points for containing that vibe. It’s context-sensitive because, for example, the “will lead to feeling cold” vibe could be either favorable or unfavorable depending on our current body temperature. Anyway, this framing makes a lot of sense for choosing actions, and amounts to using control theory to satisfy our innate drives. But if we’re just passively observing the world, this framework is kinda problematic...
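
Here is roughly what I mean by the model-predictive-control flavor of (e), as a toy sketch; the candidate plans, predicted outcomes, and drive weights are all invented for illustration:

```python
def mpc_choose_plan(candidate_plans, drives):
    # Toy model-predictive control: roll each candidate action-model forward
    # to its predicted outcome, score that outcome against current innate
    # drives (the context-sensitive "affective gloss"), and pick the best.
    def score(plan):
        return sum(drives.get(tag, 0.0) for tag in plan["predicted_outcome"])
    return max(candidate_plans, key=score)

plans = [
    {"name": "get up and eat a snack", "predicted_outcome": ["eating", "effort"]},
    {"name": "keep sitting here", "predicted_outcome": ["rest"]},
]
drives = {"eating": +2.0, "rest": +0.5, "effort": -0.3}  # currently hungry
print(mpc_choose_plan(plans, drives)["name"])  # -> "get up and eat a snack"
```

The same outcome tag (say, “feeling cold”) would flip sign in the drive weights depending on current body temperature, which is what makes the gloss context-sensitive.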

(e) is also wishful thinking. Let’s say someone gives us an unmarked box with a surprise gift inside. According to the role of (e) in the picture I drew, if we receive the box when we’re hungry, we should expect to find food in the box, and if we receive the box when we’re in a loud room, we should expect to find earplugs in the box, etc. Well, that’s not right. Wishful thinking does exist, but it doesn’t seem so inevitable and ubiquitous as to deserve a seat right near the heart of human cognition. Well, one option is to declare that one of the core ideas of Predictive Coding theory—unifying world-modeling and action-selection within the same computational architecture—is baloney. But I don’t think that’s the right answer. I think a better approach is to posit that (b-d) are actually pretty restrictive in practice, leaving (e) mainly as a comparatively weak force that can be a tiebreaker between equally plausible models. In other words, passive observers rarely if ever come across multiple equally plausible models for what’s going on and what will happen next; it would require a big coincidence to balance the scales so precisely. But when we make predictions about what we ourselves will do, that aspect of the prediction is a self-fulfilling prophecy, so we routinely have equally plausible models...and then (e) can step in and break the tie.
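
In code terms, the claim is that (e) behaves more like a lexicographic tiebreaker than like a term that trades off freely against plausibility (again, just a cartoon; the function arguments are placeholders):

```python
def pick_model(candidates, plausibility, predicted_reward, tolerance=1e-6):
    # (b)-(d) first: keep only the maximally plausible candidate models.
    best = max(plausibility(m) for m in candidates)
    tied = [m for m in candidates if plausibility(m) >= best - tolerance]
    # (e) only breaks ties. In passive observation there is usually a unique
    # most-plausible model, so (e) rarely matters; in predictions about our
    # own actions, many models tie (they are self-fulfilling) and (e) decides.
    return max(tied, key=predicted_reward)
```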

More general statement of situations where (e) plays a big role: Maybe “self-fulfilling” is not quite the right terminology for when (e) is important; it’s more like “(e) is most important in situations where lots of models are all incompatible, yet where processes (b,c,d) never get evidence to support one model over the others.” So (e) is central in choosing action-selection models, since these are self-fulfilling, but (e) plays a relatively minor role in passive observation of the world, since there we have (b,c) keeping us anchored to reality (but (e) does play an occasional role on the margins, and we call it “wishful thinking”). (e) is also important because (b,c,d) by themselves leave this whole process highly under-determined: walking in a forest, your brain can build a better predictive model of trees, of clouds, of rocks, or of nothing at all; (e) is a guiding force that, over time, keeps us on track building useful models for our ecological niche.

One more place where (e) is important: confabulation, rationalization, etc. Here’s an example: I reach out to grab Emma’s unattended lollipop because I’m hungry and callous, but then I immediately think of an alternate model, in which I am taking the lollipop because she probably wants me to have it. The second model gets extra points from the (e) process, because I have an innate drive to conform to social norms, be well-regarded and well-liked, etc. Thus the second model beats the truthful model (that I grabbed the lollipop because I was hungry and callous). Why can’t the (b) process detect and destroy this lie? Because all that (b) has to go on is my own memory, and perniciously, the second model has some influence over how I form the memory of grabbing the lollipop. It has covered its tracks! Sneaky! So I can keep doing this kind of thing for years, and the (b) process will never be able to detect and kill this habit of thought. Thus, rationalization winds up more like action selection, and less like wishful thinking, in that it is pretty much ubiquitous and central to cognition.[6]

Side note: Should we lump (d-e) together? When people describe Predictive Coding theory, they tend to lump (d-e) together and say things like “We have a prior that, when we’re hungry, we’re going to eat soon.” I am proposing that this lumping is not merely bad pedagogy, but is actually conflating two different things: (d) and (e) are not inextricably unified into a single computational mechanism. (I don’t think the previous sentence is obvious, and I’m not super-confident about it.) By the same token, I’m uncomfortable saying that minimizing prediction error is a fundamental operating principle of the brain; I want to say that processes (a-e) are fundamental, and minimizing prediction error is something that arguably happens as an incidental side-effect.

Well, that’s my story; it seems to basically make sense, but that could just be my (e) wishful thinking and (e) rationalization talking. :-)

(Update May 2020: The traditional RL view would be that there’s a 1-dimensional signal called “reward” that drives process (e). When I first wrote this, I was still confused about whether that was the right way to think about the brain, and thus I largely avoided the term “reward” in favor of less specific things. After thinking about it more—see Inner Alignment in the Brain—I am now fully on board with the traditional RL view; process (e) is just “We give extra points to models that predict that a large reward signal is coming”. Also, I replaced “hypotheses” with “generative models” throughout; I think it’s better terminology.)


  1. The neocortex is 75% of the human brain by weight, and centrally involved in pretty much every aspect of human intelligence (in partnership with the thalamus and hippocampus). More about the neocortex in my previous post.

  2. See Jan Kulveit’s Multi-agent predictive minds and AI alignment, Kaj Sotala’s Multiagent models of mind sequence, or of course Marvin Minsky and many others.

  3. I described conscious attention and action plans as “winner-take-all” in the competition among models, but I think it’s somewhat more complicated and subtle than that. I also think that picking a winner is not a separate mechanism from (b,c,d,e), or at least not entirely separate. This is a long story that’s outside the scope of this post.

  4. I have a brief intro to self-supervised learning at the beginning of Self-Supervised Learning and AGI Safety.

  5. Note that my picture at the top shows parallel processing of models, but that’s not quite right; in order to see whether two prominent models are making contradictory predictions, we need to exchange information between them.

  6. See The Elephant in the Brain etc.