Sequence introduction: non-agent and multiagent models of mind

A typical paradigm by which people tend to think of themselves and others is as consequentialist agents: entities who can be usefully modeled as having beliefs and goals, and who then act according to their beliefs to achieve their goals.

This is often a useful model, but it doesn’t quite capture reality. It’s a bit of a fake framework. Or in computer science terms, you might call it a leaky abstraction.

An abstraction in the computer science sense is a simplification which tries to hide the underlying details of a thing, letting you think in terms of the simplification rather than the details. To the extent that the abstraction actually succeeds in hiding the details, this makes things a lot simpler. But sometimes the abstraction inevitably leaks, as the simplification fails to predict some of the actual behavior that emerges from the details; in that situation you need to actually know the underlying details, and be able to think in terms of them.
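To make the computer-science sense concrete, here is a minimal illustration (in Python; the example is mine, not from any of the posts discussed here). Floating-point numbers are an abstraction over binary arithmetic that usually lets you think in terms of ordinary real numbers, until the underlying details leak through:

```python
# Floats abstract over binary floating-point arithmetic. The abstraction
# usually holds, but 0.1 and 0.2 have no exact binary representation,
# so the details sometimes leak into visible behavior.
a = 0.1 + 0.2

# The abstraction holds for rough arithmetic...
print(round(a, 2))  # 0.3

# ...but leaks when we rely on exact equality.
print(a == 0.3)     # False
print(a)            # 0.30000000000000004
```

As long as you only ever round your results, you can keep thinking in terms of the simplification; once you compare for exact equality, you are forced to think about the underlying binary representation.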

Agent-ness being a leaky abstraction is not exactly a novel concept for Less Wrong; it has been touched upon several times, such as in Scott Alexander’s Blue-Minimizing Robot Sequence. At the same time, I do not think that it has been quite fully internalized yet, and many foundational posts on LW go wrong due to being premised on the assumption of humans being agents. In fact, I would go as far as to claim that this is the biggest flaw of the original Sequences: they attempted to explain many failures of rationality as being due to cognitive biases, when in retrospect it looks like understanding cognitive biases doesn’t actually make you substantially more effective. But if you are implicitly modeling humans as goal-directed agents, then cognitive biases are the most natural place for irrationality to emerge from, so it made sense to focus on them the most.

Just knowing that an abstraction leaks isn’t enough to improve your thinking, however. To do better, you need to know about the actual underlying details to get a better model. In this sequence, I will aim to elaborate on various tools for thinking about minds which look at humans in more granular detail than the classical agent model does. Hopefully, this will help us better get past the old paradigm.

One particular family of models that I will be discussing is that of multi-agent theories of mind. Here the claim is not that we would literally have multiple personalities. Rather, my approach will be similar in spirit to the one in Subagents Are Not A Metaphor:

Here are the parts composing my technical definition of an agent:
1. Values
This could be anything from literally a utility function to highly framing-dependent. Degenerate case: embedded in lookup table from world model to actions.
2. World-Model
Degenerate case: stateless world model consisting of just sense inputs.
3. Search Process
Causal decision theory is a search process. “From a fixed list of actions, pick the most positively reinforced” is another. Degenerate case: lookup table from world model to actions.
Note: this says a thermostat is an agent. Not figuratively an agent. Literally technically an agent. Feature not bug.

This is a model that can be applied naturally to a wide range of entities, as seen from the fact that thermostats qualify. And the reason why we tend to automatically think of people—or thermostats—as agents is that our brains have evolved to naturally model things in terms of this kind of intentional stance; it’s a way of thought that comes natively to us.
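To show how little machinery the three-part definition actually requires, here is my own toy sketch of a thermostat as an agent (the class and names are illustrative inventions, not from the quoted post), with each of the three parts in its degenerate form:

```python
# Toy sketch: a thermostat satisfying the quoted three-part definition
# of an agent. All names here are illustrative, not from the quoted post.

class Thermostat:
    def __init__(self, setpoint):
        # 1. Values: a single preferred temperature.
        self.setpoint = setpoint
        # 3. Search process (degenerate case): a lookup table from
        #    world-model states to actions.
        self.policy = {"too_cold": "heat_on", "ok": "heat_off"}

    def world_model(self, sensed_temp):
        # 2. World-model (degenerate case): stateless, consisting of
        #    nothing but the current sense input, summarized as a state.
        return "too_cold" if sensed_temp < self.setpoint else "ok"

    def act(self, sensed_temp):
        return self.policy[self.world_model(sensed_temp)]

t = Thermostat(setpoint=21)
print(t.act(18))  # heat_on
print(t.act(23))  # heat_off
```

Nothing here is figurative: there are values, a world-model, and a search process, just each in its most degenerate possible form.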

Given that we want to learn to think about humans in a new way, we should look for ways to map the new way of thinking into a native mode of thought. One of my tactics will be to look for parts of the mind that look like they could literally be agents (as in the above technical definition of an agent), so that we can replace our intuitive one-agent model with intuitive multi-agent models without needing to make trade-offs between intuitiveness and truth. This will still be a leaky simplification, but hopefully it will be a more fine-grained leaky simplification, so that overall we’ll be more accurate.

My model of what our subagents look like draws upon a number of different sources, including neuroscience, psychotherapy and meditation, so in the process of sketching out my model I will be covering a number of them in turn. To give you a rough idea of what I’m trying to do, here’s a summary of some upcoming content.

Published posts:

(Note: this list may not always be fully up to date; see the sequence index for the actively maintained version)

Book summary: Consciousness and the Brain. One of the fundamental building blocks of much of consciousness research is that of Global Workspace Theory (GWT). This could be described as a component of a multiagent model, focusing on the way in which different agents exchange information with one another. One elaboration of GWT, which focuses on how it might be implemented in the brain, is the Global Neuronal Workspace (GNW) model in neuroscience. Consciousness and the Brain is a 2014 book that summarizes some of the research and basic ideas behind GNW, so summarizing the main content of that book looks like a good place to start our discussion and to get a neuroscientific grounding before we get more speculative.

Building up to an IFS model. One theoretical approach for modeling humans as being composed of interacting parts is that of Internal Family Systems. In my experience and that of several other people in the rationalist community, it’s very effective for this purpose. However, having its origins in therapy, its theoretical model may seem rather unscientific and woo-y. This personally put me off the theory for a long time, as I thought that it sounded fake, and gave me a strong sense of “my mind isn’t split into parts like that”.

In this post, I construct a mechanistic sketch of how a mind might work, drawing on the kinds of mechanisms that have already been demonstrated in contemporary machine learning, and end up with a model that pretty closely resembles the IFS one.

Subagents, introspective awareness, and blending. In this post, I extend the model of mind that I’ve been building up in previous posts to explain some things about change blindness, not knowing whether you are conscious, forgetting most of your thoughts, and mistaking your thoughts and emotions for objective facts, while also connecting it with the theory in the meditation book The Mind Illuminated.

Subagents, akrasia, and coherence in humans. We can roughly describe coherence as the property that, if you become aware that there exists a more optimal strategy for achieving your goals than the one that you are currently executing, then you will switch to that better strategy. For a subagent theory of mind, we would like to have some explanation of when exactly the subagents manage to be collectively coherent (that is, change their behavior to some better one), and of the situations in which they fail to do so.

My conclusion is that we are capable of changing our behaviors on occasions when the mind-system as a whole puts sufficiently high probability on the new behavior being better, when the new behavior is not being blocked by a particular highly weighted subagent (such as an IFS-style protector) that puts high probability on it being bad, and when we have enough slack in our lives for any new behaviors to be evaluated in the first place. Akrasia is subagent disagreement about what to do.

Integrating disagreeing subagents. In the previous post, I suggested that akrasia involves subagent disagreement—or in other words, different parts of the brain having differing ideas on what the best course of action is. The existence of such conflicts raises the question: how does one resolve them?

In this post I discuss various techniques which could be interpreted as ways of resolving subagent disagreements, as well as some of the reasons why this doesn’t always happen.

Subagents, neural Turing machines, thought selection, and blindspots. In my summary of Consciousness and the Brain, I briefly mentioned that one of the functions of consciousness is to carry out artificial serial operations; or in other words, implement a production system (equivalent to a Turing machine) in the brain.

While I did not go into very much detail about this model in the post, I’ve used it in later articles. For instance, in Building up to an Internal Family Systems model, I used a toy model where different subagents cast votes to modify the contents of consciousness. One may conceptualize this as equivalent to the production system model, where different subagents implement different production rules which compete to modify the contents of consciousness.
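As a hypothetical sketch of that voting picture (my own illustration, not code from any of the posts; the subagents, weights, and workspace contents are all invented), subagents can be modeled as production rules that bid to rewrite a shared workspace, with the most strongly weighted matching rule winning each cycle:

```python
# Hypothetical toy model: subagents as production rules competing to
# modify a shared "workspace" (the contents of consciousness).
# Each subagent is a (condition, weight, rewrite) triple; when its
# condition matches the workspace, it bids its weight, and the highest
# bidder gets to rewrite the workspace that cycle.

def run_workspace(workspace, subagents, steps):
    for _ in range(steps):
        bids = [(weight, rewrite)
                for condition, weight, rewrite in subagents
                if condition(workspace)]
        if not bids:
            break  # no production rule fires; the workspace is stable
        _, rewrite = max(bids, key=lambda b: b[0])
        workspace = rewrite(workspace)
    return workspace

subagents = [
    # "Hunger" subagent: if hungry and no plan is active, propose eating.
    (lambda w: "hungry" in w and not any(x.startswith("plan:") for x in w),
     1.0,
     lambda w: w | {"plan:eat"}),
    # "Work" subagent: more heavily weighted; overrides an eating plan
    # while work is unfinished.
    (lambda w: "working" in w and "plan:eat" in w,
     2.0,
     lambda w: (w - {"plan:eat"}) | {"plan:finish-work"}),
]

print(sorted(run_workspace({"hungry", "working"}, subagents, steps=5)))
# ['hungry', 'plan:finish-work', 'working']
```

The hunger subagent’s proposal briefly enters the workspace, but the more heavily weighted work subagent overrides it; the system settles on finishing work while the hunger remains, which is exactly the kind of unresolved disagreement the later posts discuss.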

In this post, I flesh out the model a bit more, and apply it to a few other examples, such as emotion suppression, internal conflict, and blind spots.

Subagents, trauma, and rationality. This post interprets the appearance of subagents as emerging from unintegrated memory networks, and argues that the presence of these is a matter of degree. There’s a continuous progression of fragmented (dissociated) memory networks giving rise to increasingly severe symptoms as the degree of fragmentation grows. The continuum goes from everyday procrastination and akrasia on the “normal” end, to disrupted and dysfunctional beliefs in the middle, and conditions like clinical PTSD, borderline personality disorder, and dissociative identity disorder on the severely traumatized end.

I also argue that emotional work and exploring one’s past traumas in order to heal them is necessary for effective instrumental and epistemic rationality.

Against “System 1” and “System 2”. The terms System 1 and System 2 were originally coined by the psychologist Keith Stanovich and then popularized by Daniel Kahneman in his book Thinking, Fast and Slow. Stanovich noted that a number of fields within psychology had been developing various kinds of theories distinguishing between fast/intuitive thinking on the one hand and slow/deliberative thinking on the other. Often these fields were not aware of each other. The S1/S2 model was offered as a general version of these specific theories, highlighting features of the two modes of thought that tended to appear in all the theories.

Since then, academics have continued to discuss the models. Among other developments, Stanovich and other authors have discontinued the use of the System 1/System 2 terminology as misleading, choosing to instead talk about Type 1 and Type 2 processing. In this post, I will build on some of that discussion to argue that Type 2 processing is a particular way of chaining together the outputs of various subagents using working memory. Some of the processes involved in this chaining are themselves implemented by particular kinds of subagents.

Book summary: Unlocking the Emotional Brain. Written by the psychotherapists Bruce Ecker, Robin Ticic and Laurel Hulley, Unlocking the Emotional Brain claims to offer a neuroscience-grounded, comprehensive model of how effective therapy works. In so doing, it also happens to formulate its theory in terms of belief updating, helping explain how the brain models the world and what kinds of techniques allow us to actually change our minds. Its discussion and models are closely connected to the models about internal conflict and belief revision that are discussed in previous posts, particularly “Integrating disagreeing subagents”.

A mechanistic model of meditation. Meditation has been claimed to have all kinds of transformative effects on the psyche, such as improving concentration ability, healing trauma, cleaning up delusions, allowing one to track their subconscious strategies, and making one’s nervous system more efficient. However, an explanation for why and how exactly this would happen has typically been lacking. This makes people reasonably skeptical of such claims.

In this post, I want to offer an explanation for one kind of mechanism: meditation increasing the degree of a person’s introspective awareness, and thus leading to increasing psychological unity as internal conflicts are detected and resolved.

A non-mystical explanation of insight meditation and the three characteristics of existence: introduction and preamble. Insight meditation, enlightenment, what’s that all about?

The sequence of posts starting from this one is my personal attempt at answering that question. It seeks to:

  • Explain what kinds of implicit assumptions build up our default understanding of reality and how those assumptions are subtly flawed.

  • Point out aspects of our experience whose repeated observation will update those assumptions, and explain how this may cause psychological change in someone who meditates.

  • Explain how the so-called “three characteristics of existence” of Buddhism—impermanence, no-self and unsatisfactoriness—are all interrelated, in a way that connects to the previously discussed topics in the sequence.

Farther out (sketched out but not as extensively planned/written yet)

The game theory of rationality and cooperation in a multiagent world. Multi-agent models have a natural connection to Elephant in the Brain-style dynamics: our brains doing things for purposes of which we are unaware. Furthermore, there can be strong incentives to continue systematic self-deception and not integrate conflicting beliefs. For instance, if a mind has subagents which think that specific beliefs are dangerous to hold or express, then they will work to keep subagents holding those beliefs from coming into conscious awareness.

“Dangerous beliefs” might be ones that touch upon political topics, but they might also be ones of a more personal nature. For instance, someone may have an identity as being “good at X”, and then want to rationalize away any contradictory evidence—including evidence suggesting that they were wrong on a topic related to X. Or it might be something even more subtle.

These are a few examples of how rationality work has to happen on two levels at once: to debug some beliefs (individual level), people need to be in a community where holding various kinds of beliefs is actually safe (social level). But in order for the community to be safe for holding those beliefs (social level), people within the community also need to work on themselves so as to deal with their own subagents that would cause them to attack people with the “wrong” beliefs (individual level). This kind of work also seems to be necessary for fixing “politics being the mind-killer” and collaborating on issues such as existential risk across sharp value differences; but the need to carry out the work on many levels at once makes it challenging, especially since the current environment incentivizes many (sub)agents to sabotage any attempt at this.

(This topic area is also related to that stuff Valentine has been saying about Omega.)

This sequence is part of research done for, and supported by, the Foundational Research Institute.