Re­lated to: Cached Selves, Cached Thoughts, A Ra­tional Identity

Thesis: Harnessing the large pressures on thoughts and actions generated by innate drives for signaling, identity negotiation, and identity formation by consciously aiming for instrumentally optimal attractors in identityspace should allow for structured self-improvement and a natural defense against goal distortion. First, though, we must see the true scope of the problem, and then find methods of attack.

Part One: Con­cep­tu­al­iz­ing the Prob­lem Domain

I: It Begins

Life is an odd, long jour­ney. The differ­ence be­tween a young girl and the wiz­ened pro­fes­sor she be­comes is quite profound. One of those crazy things in life that seems so nat­u­ral. The child surely has a very differ­ent set of as­pira­tions than the pro­fes­sor. They look differ­ent. They have en­tirely differ­ent thoughts. They share the same genes, but the two are still way more differ­ent than the av­er­age pair of iden­ti­cal twins. The pro­fes­sor has very dull mem­o­ries of the child, but that’s a pretty weak link, and it only goes one way. They have differ­ent senses of self, differ­ent be­liefs, differ­ent af­fili­a­tions, differ­ent re­la­tion­ships, differ­ent goals. Differ­ent iden­tities. Is the pro­fes­sor what the child would have wished to be­come, if the child had known more and thought faster, had be­come wiser and more en­light­ened by some mag­i­cal pro­cess? It’s hard to say; it’s hard to say how well the child’s the­o­ret­i­cally re­flec­tively en­dorsed goals were fulfilled, how many utilons were at­tained by that stan­dard.

There is a long, long journey between the child and the professor, and the journey is fraught with potential for both growth and goal distortion. Talk of genetic predispositions all you want, but Orwell wasn’t that far off in claiming that man is infinitely malleable. That list of human universals you might have seen pertains to individuals much less than to cultures, and even less so to the individual splices of your average person’s life. We may have a lot in common on average, but there’s a heck of a lot of individual variance, and the memeplex of thoughtspace can do a lot of surprising things with that variance [homosexuality adaptation towards something else evo psych book]. Amplification, subjugation, deflection, attraction: mindspace is an interactive and powerful playing field, chaotic and multidimensional. If we are to think about it clearly, we should probably try to nail down some terminology.

II: Defi­ni­tions: Mindspace, At­trac­tors, Rationality

Mindspace is the con­figu­ra­tion space of a mind. For hu­mans at least, mindspace can be loosely bro­ken down into thoughtspace and iden­ti­tys­pace. Gen­er­ally speak­ing, thoughtspace is ephemeral where iden­ti­tys­pace is more con­stant, though chang­ing. At­trac­tors in the realm of thoughtspace tend to work on timescales like sec­onds or hours, where iden­ti­tys­pace is what hap­pens when pat­terns of thoughts, emo­tions, habits, et cetera per­sist for days or life­times. Hu­mans are con­stantly mov­ing through mindspace. Some are drift­ing lazily, some are pro­pel­led pur­pose­fully, some are tossed about chaot­i­cally. Ideally, we wish to move strongly and quickly to­wards the ar­eas of mindspace that are most con­ducive to achiev­ing our goals. We do so by think­ing op­ti­mal thoughts and hav­ing op­ti­mal dis­po­si­tions. Of course, it’s never that easy. (Refer­ence.)

Thoughtspace is a gi­ant mul­ti­di­men­sional web of knowl­edge and pos­si­ble thoughts and in­sights. Most of it is handed to you in some form or an­other, from sen­sory ex­pe­rience or a book or a friend of a friend, prob­a­bly mu­tated along the way, of­ten for the worse. Thoughtspace is made up of thoughts, not dis­po­si­tions or habits: those are in iden­ti­tys­pace. But habits of think­ing de­ter­mine your move­ment through thoughtspace, and thoughts when thought ha­bit­u­ally may quickly be­come part of your iden­tity. The two spaces are con­nected at many joints, some­times non-ob­vi­ously.

Your web of beliefs makes up your starting point for exploration of thoughtspace. There are many destinations to aim for. Sometimes, we wish to gain new knowledge to update an old belief, or sometimes we spend time thinking to integrate new evidence. Sometimes we synthesize old concepts and come up with something new. Sometimes we formulate plans to reach goals provided by our identities. The part of thoughtspace in your grasp is your map; it reflects reality, sorta, and allows you to navigate via your identity towards thoughts and beliefs that are likely to fulfill your goals, insofar as you know what your goals are.

Identityspace is a thousand-dimensional configuration space of thought patterns, emotion patterns, habits, dispositions, group affiliations, biases, goals, values, structured clusters of pre-conceived ideas, specific knowledge structures or cognitive representations of the self, mental frameworks centering on a specific theme that helps to organize social information, et cetera. The stability of identity varies from person to person. Most people generally don’t change all that much past a certain age. Some, like those with bipolar disorder, cycle between two configurations in identityspace. Some purposefully reinvent themselves every few months.

But at any rate, identity is obviously very malleable, even over short spans of time. Chaotic fluctuations are not uncommonly triggered by innocuous stimuli. Most often, though, significant shifts in identityspace follow noticeably large events. Long-held grudges are dropped, religions are broken with, et cetera. But more regularly, people slowly and suboptimally drift through identityspace, not taking care to monitor their progress or their desires, and letting goal distortion creep into the poorly kept garden of rationality.

Goals are a sig­nifi­cant part of iden­ti­tys­pace: for most peo­ple, they provide sta­ble tar­gets to work for, in­stead of just drift­ing about in the memetic sea. Thus, a great though in­cred­ibly difficult abil­ity is be­ing able to de­ter­minedly, me­thod­i­cally, swiftly, and care­fully progress through iden­ti­tys­pace to­wards strictly defined and sig­nifi­cant goals. This ar­ti­cle at­tempts to ex­am­ine the prob­lems and op­por­tu­ni­ties en­coun­tered while do­ing so.

At­trac­tors are a class of thoughts, re­ac­tions, dis­po­si­tions, emo­tions, so­cial pres­sures, bi­ases, and any­thing else that will push or pull or deflect or al­ter your grand voy­age through mindspace, for bet­ter or worse. It might help to think in terms of ‘pos­i­tive’ and ‘nega­tive’ at­trac­tors, cod­ified by their ex­pected value in terms of suc­cess­ful nav­i­ga­tion of mindspace. You can ad­just for at­trac­tors, but it’s difficult, and it takes a whole bunch of ra­tio­nal­ity be­fore you start do­ing it right, in­stead of over­cor­rect­ing or try­ing to be too clever in us­ing the gusts to your ad­van­tage.

Un­pre­dictable storms of mind pro­jec­tion fal­lacy on the hori­zon will force you off course, and whirlpools of con­fir­ma­tion bias will try to suck you into the abyss. Some­times peo­ple will try to mess around with your map just to be dicks, as if Po­sei­don and Zeus de­cided to get to­gether to zap you with thun­der­bolts for fun and profit. I think I may have just con­fused the map with the ter­ri­tory. You see? It’s difficult. (In­spired by these.)

Ra­tion­al­ity is the method of nav­i­ga­tion. It helps you figure out what the ter­ri­tory ac­tu­ally looks like, so you can plot a course of thoughts that leads roughly in the right di­rec­tion. Nav­i­gat­ing these in­sane mon­ster-rid­den hy­per­di­men­sional wa­ters isn’t ex­actly easy. I won’t go into de­tail about ra­tio­nal­ity, as that is what the rest of Less Wrong is for.

III: The Evolu­tion of Navigation

Ra­tion­al­ity, though similar to past at­tempts at care­ful philos­o­phy and so­cial episte­mol­ogy, is a new and differ­ent art form. Tra­di­tional ra­tio­nal­ity, the pre­cur­sor to true ra­tio­nal­ity, was the product of the so­cial pro­cess of sci­ence. Science cares not about what goes on in­side your head when you nav­i­gate thoughtspace. It ex­ists only to ver­ify that what you find on your jour­ney is of value in re­al­ity. Tra­di­tional ra­tio­nal­ity is what hap­pens when you take care to hold your method­ol­ogy for think­ing to the same stan­dard that would be de­manded by peer re­view. The tra­di­tional ra­tio­nal­ist dreams up a bunch of sci­en­tists sorta like him­self, but much more crit­i­cal, and puts his ideas un­der their imag­ined ex­pected scrutiny. Some­times the fi­nal judge is re­al­ity, but ideally you can elimi­nate most silly ideas be­fore they get that far; it’s best to not ac­tu­ally test your garage for in­visi­ble drag­ons.

The ra­tio­nal­ity that makes up Over­com­ing Bias and Less Wrong canon is what hap­pens when the peer re­view coun­cil is in­stead made up of su­per­in­tel­li­gences that tol­er­ate noth­ing less than ex­act pre­ci­sion. There is no such thing as an ‘ac­cepted’ hy­poth­e­sis: su­per­in­tel­li­gences don’t speak bi­nary. The num­ber of 9′s you trail your prob­a­bil­ity es­ti­mates with had bet­ter be care­fully cal­ibrated, or your Bayes score is go­ing doooown. But such judges do not ex­ist in re­al­ity, and so we have the Great Prob­lem: we can­not eas­ily ver­ify ra­tio­nal­ity.

Bayesianism is the ideal of the perfected art, though it is incredibly difficult for humans to approximate, and the field, though growing, has few practitioners. Bayesian decision theory is the ideal method of achieving our goals, but we haven’t fully worked it out yet, and humans are even further away from that ideal than just Bayesianism. It’s hard to know what we’re even striving for, or what methods could help us do so. We rationalists do not yet have tools as refined for becoming maximally effective people as we do for finding truth. It is probable that the lore to help us do so is already hidden somewhere for us to find.

Bud­dhist med­i­ta­tion may be ra­tio­nal­ity’s true spiritual pre­de­ces­sor. Both arts give the lay­man tools for nav­i­gat­ing thoughtspace and, to a lesser ex­tent, iden­ti­tys­pace. Like med­i­ta­tion, it’s hard to watch some­one be­ing ra­tio­nal. The vast ma­jor­ity of cor­rect think­ing is un­re­mark­able stuff that goes on in­side your head, and fur­ther­more, again like med­i­ta­tion, most of it isn’t ver­bal. It’s in­tu­itive in­stinc­tual re­ac­tion to pat­terns of thought that are ei­ther in har­mony with the Bayes, or not. Dis­tract­ing or bi­as­ing thoughts are pushed aside as the med­i­ta­tor or ra­tio­nal­ist tries to see the world as it re­ally is. Where med­i­ta­tion is con­cerned pri­mar­ily with pre­cise thought fo­cused on the self, the ra­tio­nal­ity glo­rified by Less Wrong is pre­cise thought fo­cused on just about ev­ery­thing else. Per­haps this should change, and the two kin arts should be­come one, more pre­cise and more effec­tive.

I cannot speak with any authority on meditation (I hope others will do so), but it does not seem to me that rationality provides one with many tools for navigating identityspace. You are told things like: policy debates should not appear one-sided. Be wary when arguing your preferred position, for arguments are like soldiers, and you will feel uneasy about betraying your allies. The Greens and the Blues spilt blood over the most trivial and unimportant things: tribal affiliations revolving around sports matches within a single unified civilization. Not even Raiders fans are that crazy, generally. You are told the story of the cult of Ayn Rand, naively called ‘the most unlikely cult in history.’ You are told of the fearsomely powerful confirmation bias, positive bias, consistency effects, commitment effects, cached selves, priming, anchoring, and a legion of other ways for your identity to screw you over if you’re not incredibly attentive. And these things you are told present a gestalt of bad habits of thought to avoid around this whole identity thing. But the tools are not precise, and they don’t really allow you to do it right. Reversed stupidity continually fails to equal intelligence. Identity can be good. Perhaps we could harness the power of identity to correctly navigate identityspace, and thus thoughtspace?

IV: Com­mon At­trac­tors in Mindspace

There are way too many types of attractors, deflectors, warpers, and all sorts of weird creatures in mindspace to list exhaustively. Still, these are some of the ones that come most quickly to mind:

Group:


  • So­cial sta­tus (The mother of group at­trac­tors, for bet­ter or worse.)

  • Poli­ti­cal (In­cludes those who define them­selves by their dis­in­ter­est in poli­tics.)[pre­tend­ing to be wise]

  • Reli­gious (Athe­ism, though not a re­li­gion, some­times gets treated like one by athe­ists.) [Note: this might be why ag­nos­tics are gen­er­ally saner]

  • Socioeconomic

  • Clique (Col­leges, golf clubs, dance clubs, and or­ga­ni­za­tions.)

  • Occupational

  • Age

  • Ethnic

  • Philo­soph­i­cal (In­clud­ing things like Bayesi­anism and ra­tio­nal­ity.)

  • Sexual

Nongroup:


  • Emo­tional, in­stinc­tive (Fast per­cep­tual judg­ment, of­ten too fast to de­bias on the fly.)

  • Prim­ing and an­chor­ing (Hav­ing spe­cific thought pat­terns trig­gered by re­cent stim­uli.)

  • Cached selves (Be­ing like one’s past self due to con­sis­tency effects, etc.)

  • Belief coherence (Having an identity that avoids cognitive dissonance.)

  • Heroic as­piring (Be­ing like some­one else, real, fic­tional, or imag­ined.)

  • Nega­tive heroic as­piring (Be­ing un­like some­one else.)

  • Nar­cis­sis­tic (Hav­ing a sense of iden­tity that is most self-glo­rify­ing.)

An exhaustive list of biases could also be made, but the above is a short attempt at listing the ones that trip people up most often, as well as the ones that people tend to use most effectively to motivate themselves to accomplish their nominal goals. The sword is double-edged, though the negative edge does tend to be significantly sharper. It is important to note that a sufficiently advanced rationalist could use any of the above to more effectively achieve their goals. Whether or not that is possible in practice is the main question of this post.

V: Ra­tional Ac­tors and Ra­tion­al­ist Attractors

From the per­spec­tive of an ‘as­piring ra­tio­nal­ist’ (group iden­tity), pos­i­tive at­trac­tors in iden­ti­tys­pace do not tend to be strongly group-based. The sorts of dis­po­si­tions we would like to cul­ti­vate are not par­tic­u­larly tied up in any of the at­trac­tors listed in the ‘group’ sec­tion, and per­haps not re­ally in the ‘non­group’ sec­tion ei­ther. Dis­po­si­tions like ‘I stay fo­cused and work care­fully but effi­ciently on the task at hand’ aren’t re­ally well-con­nected to the most com­mon nat­u­ral at­trac­tors found in iden­ti­tys­pace: maybe if we were in House Huffer­puffer it’d be differ­ent, but as it is, there’s only a vague con­nec­tion with in­stru­men­tal ra­tio­nal­ity, not nearly enough to draw on the power of sig­nal­ing and con­sis­tency effects. Other dis­po­si­tions, like ‘I no­tice con­fu­sion and im­me­di­ately seek to find the faults in my model’, come more nat­u­rally to an ‘as­piring ra­tio­nal­ist’, as such cog­ni­tive tricks are the fo­cus of Less Wrong canon.

There are, of course, se­lec­tion effects: Less Wrong is an at­trac­tor in mindspace that pulls strongest on those that already have similar dis­po­si­tions and thoughts. Cas­cades are pow­er­ful. So it may not be sur­pris­ing that Huffer­puffer dis­po­si­tions do not come nat­u­rally to the as­piring ra­tio­nal­ist.

The down­sides of ra­tio­nal­ist at­trac­tors are fewer than in other groups, but they do ex­ist. The most com­mon one I see is over­con­fi­dence in philo­soph­i­cal in­tu­ition. Be­ing handed a gi­ant tome of care­ful philos­o­phy from (mostly) Eliezer and Robin, we then think that our ad­di­tional philo­soph­i­cal views are similarly bul­let­proof. I had this prob­lem bad dur­ing the first 6 months af­ter read­ing Less Wrong; I didn’t no­tice that in reach­ing the level of ra­tio­nal­ity I’d reached, I was only ver­ify­ing the rea­son­ing of oth­ers, not do­ing new and thor­ough anal­y­sis my­self. This gave me an in­flated con­fi­dence in the cor­rect­ness of my in­tu­ition, even when it clashed with the in­tu­ition of my ra­tio­nal­ist su­pe­ri­ors. Hav­ing the iden­tity of ‘care­ful an­a­lyt­i­cal thinker’ can lead you to think you’re be­ing care­ful when you’re not.

Also, typical groupthink. We’re a pretty homogeneous bunch, and sometimes Less Wrong acts as an echo chamber for not-obviously-bad but non-optimal beliefs, habits, and ideas. Even so, this part of the rationalist identity is countered by counter-cultishness, which itself is countered by awareness of cultish counter-cultishness and the related countersignaling. Oh yeah, that’s another double-edged attractor in rationalist mindspace: we’re very quick to go meta. Different people have different thoughts as to the overall usefulness of this disposition. I personally am rather pro-meta, whereas some of our focused instrumental rationalists think of der wille zur meta as rationalist flypaper.

Less Wrong, though, is an un­char­ac­ter­is­tic is­land at­trac­tor of rel­a­tive san­ity amongst a con­stel­la­tion of crazy memes. Be­fore we get too ex­cited about our strengths, we should ex­plore some ways that at­trac­tors can lead to our tra­jec­tory through mindspace go­ing dis­as­trously wrong.

VI: So­cial Psy­chol­ogy Meets the Mindspace

So­cial psy­chol­ogy is an in­ter­est­ing sci­ence. It is in­con­sis­tent in its ac­cu­racy, and the the­o­ries don’t always carve re­al­ity at its joints. That said, there’s a lot of in­ter­est­ing thought that’s been put into it, and it is a real sci­ence. The field is very Han­so­nian; in­deed, con­strual level the­ory be­longs to this realm. By cher­ryp­ick­ing what seem to me to be the most in­ter­est­ing con­cepts, I think it may be pos­si­ble to es­tab­lish a frame­work for new ways to ap­proach ra­tio­nal­ist prob­lems in real life by rea­son­ing about so­cial and psy­cholog­i­cal phe­nom­ena in terms of at­trac­tors and tra­jec­to­ries through mindspace.

Cog­ni­tive dis­so­nance is

an un­com­fortable feel­ing caused by hold­ing con­flict­ing ideas si­mul­ta­neously. The the­ory of cog­ni­tive dis­so­nance pro­poses that peo­ple have a mo­ti­va­tional drive to re­duce dis­so­nance. They do this by chang­ing their at­ti­tudes, be­liefs, and ac­tions. Dis­so­nance is also re­duced by jus­tify­ing, blam­ing, and deny­ing. It is one of the most in­fluen­tial and ex­ten­sively stud­ied the­o­ries in so­cial psy­chol­ogy.

Experience can clash with expectations, as, for example, with buyer’s remorse following the purchase of a new car. In a state of dissonance, people may feel surprise, dread, guilt, anger, or embarrassment. People are biased to think of their choices as correct, despite any contrary evidence.

A pow­er­ful cause of dis­so­nance is an idea in con­flict with a fun­da­men­tal el­e­ment of the self-con­cept, such as “I am a good per­son” or “I made the right de­ci­sion.” The anx­iety that comes with the pos­si­bil­ity of hav­ing made a bad de­ci­sion can lead to ra­tio­nal­iza­tion, the ten­dency to cre­ate ad­di­tional rea­sons or jus­tifi­ca­tions to sup­port one’s choices. A per­son who just spent too much money on a new car might de­cide that the new ve­hi­cle is much less likely to break down than his or her old car. This be­lief may or may not be true, but it would re­duce dis­so­nance and make the per­son feel bet­ter. Dis­so­nance can also lead to con­fir­ma­tion bias, the de­nial of dis­con­firm­ing ev­i­dence, and other ego defense mechanisms.

Self-ver­ifi­ca­tion is

a so­cial psy­cholog­i­cal the­ory that as­serts peo­ple want to be known and un­der­stood by oth­ers ac­cord­ing to their firmly held be­liefs and feel­ings about them­selves, that is self-views (in­clud­ing self-con­cepts and self-es­teem). A com­pet­ing the­ory to self-ver­ifi­ca­tion is self-en­hance­ment or the drive for pos­i­tive eval­u­a­tions.

Be­cause chronic self-con­cepts and self-es­teem play an im­por­tant role in un­der­stand­ing the world, pro­vid­ing a sense of co­her­ence, and guid­ing ac­tion, peo­ple be­come mo­ti­vated to main­tain them through self-ver­ifi­ca­tion. Such striv­ings provide sta­bil­ity to peo­ple’s lives, mak­ing their ex­pe­riences more co­her­ent, or­derly, and com­pre­hen­si­ble than they would be oth­er­wise. Self-ver­ifi­ca­tion pro­cesses are also adap­tive for groups, groups of di­verse back­grounds and the larger so­ciety, in that they make peo­ple pre­dictable to one an­other thus serve to fa­cil­i­tate so­cial in­ter­ac­tion. To this end, peo­ple en­gage in a va­ri­ety of ac­tivi­ties that are de­signed to ob­tain self-ver­ify­ing in­for­ma­tion.

Construal level theory is the study of near versus far modes of cognition. Robin Hanson’s pet topic. It attempts to describe

the re­la­tion be­tween psy­cholog­i­cal dis­tance and how ab­stract an ob­ject is rep­re­sented in some­one’s mind. The gen­eral idea is that the more dis­tant an ob­ject is from the in­di­vi­d­ual the more ab­stract it will be thought of, while the op­po­site re­la­tion be­tween close­ness and con­crete­ness is true as well. In CLT psy­cholog­i­cal dis­tance is defined on sev­eral di­men­sions—tem­po­ral, spa­tial, so­cial and hy­po­thet­i­cal dis­tance be­ing con­sid­ered most im­por­tant, though there is some de­bate among so­cial psy­chol­o­gists about fur­ther di­men­sions like in­for­ma­tional, ex­pe­ri­en­tial or af­fec­tive dis­tance.

Per­haps the most in­ter­est­ing im­pli­ca­tion of con­strual level the­ory is on be­hav­ior and sig­nal­ing pat­terns: ac­tions are near, goals are far. Thus we are sus­cep­ti­ble to spend­ing time think­ing, talk­ing, and plan­ning in far mode about the peo­ple we will be and the goals we will at­tain, but when the op­por­tu­ni­ties ac­tu­ally show up, near mode prag­ma­tism and hypocrisy are en­gaged, and ra­tio­nal­iza­tions are sud­denly very easy to come by. The effects are profound. Sit­ting at home I think it’d be ridicu­lous not to ask for that one girl’s num­ber. I’m a suave guy, I’m com­pe­tent, and I’m not afraid of re­jec­tion. It’s a re­ally high ex­pected value sce­nario. But when I see her on the street, it turns out my an­ti­ci­pa­tions were wildly off.

Same with grand pro­ject ideas, like re­ally jump­start­ing a ra­tio­nal­ist move­ment. That sounds great. I have a hun­dred ideas to try, and to email to other peo­ple to im­ple­ment. But for some rea­son I never get around to do­ing them my­self, even though I know the only thing stop­ping me is this weird sense of pre­med­i­tated frus­tra­tion at my own in­com­pe­tence. It’s hard to sab­o­tage your­self more suc­cess­fully than that.

One of the most amaz­ing su­per­pow­ers a ra­tio­nal­ist could pick up is the abil­ity to act in near mode to op­ti­mize for ra­tio­nal far mode prefer­ences. A hack like that would lead to a ra­tio­nal­ist win, hands down. More on this in Part Two.

Schemata the­ory should ring your pat­tern match­ing bell. I’ll quote from the Wikipe­dia ar­ti­cle:

A schema (pl. schemata), in psy­chol­ogy and cog­ni­tive sci­ence, de­scribes any of sev­eral con­cepts in­clud­ing:

  • An or­ga­nized pat­tern of thought or be­hav­ior.

  • A struc­tured cluster of pre-con­ceived ideas.

  • A mental structure that represents some aspect of the world.

  • A spe­cific knowl­edge struc­ture or cog­ni­tive rep­re­sen­ta­tion of the self.

  • A men­tal frame­work cen­ter­ing on a spe­cific theme, that helps us to or­ga­nize so­cial in­for­ma­tion.

  • Struc­tures that or­ga­nize our knowl­edge and as­sump­tions about some­thing and are used for in­ter­pret­ing and pro­cess­ing in­for­ma­tion.

A schema for one­self is called a “self schema”. Schemata for other peo­ple are called “per­son schemata”. Schemata for roles or oc­cu­pa­tions are called “role schemata”, and schemata for events or situ­a­tions are called “event schemata” (or scripts).

Schemata in­fluence our at­ten­tion, as we are more likely to no­tice things that fit into our schema. If some­thing con­tra­dicts our schema, it may be en­coded or in­ter­preted as an ex­cep­tion or as unique. Thus, schemata are prone to dis­tor­tion. They in­fluence what we look for in a situ­a­tion. They have a ten­dency to re­main un­changed, even in the face of con­tra­dic­tory in­for­ma­tion. We are in­clined to place peo­ple who do not fit our schema in a “spe­cial” or “differ­ent” cat­e­gory, rather than to con­sider the pos­si­bil­ity that our schema may be faulty. As a re­sult of schemata, we might act in such a way that ac­tu­ally causes our ex­pec­ta­tions to come true.

Or in our ter­minol­ogy, schemata are at­trac­tors in mindspace. I highly recom­mend read­ing the whole Wikipe­dia ar­ti­cle about schemata the­ory for a more con­cise and or­ga­nized ver­sion of this post. It’s a gem.

VII: In­tro­duc­ing Goal Distortion

It is difficult to rea­son about what counts as goal dis­tor­tion. Hu­mans are usu­ally con­sid­ered pretty bad at know­ing what sorts of things they want. There are many who lead happy lives as as­cetics with­out think­ing that it’d have been nice to give more than wis­dom to the starv­ing and poor they left along their paths. And there are many more who chase af­ter money and sta­tus with­out re­ally re­al­iz­ing why, nor do they be­come hap­pier in do­ing so. It is im­por­tant, then, to iden­tify which goals we choose to define as be­ing dis­torted, and what goal sys­tem changes count as dis­tor­tion and not sim­ply en­light­ened mat­u­ra­tion. Thus we come to the con­cepts of re­flec­tive en­dorse­ment and ex­trap­o­lated vo­li­tion.

VIII: Thoughts and Signals

You think what you sig­nal, you sig­nal what you think. Feed­back pro­cesses like that don’t of­ten go su­per­crit­i­cal, but as Eliezer points out many times in this sub­se­quence, it is im­por­tant to watch out for such cas­cades. That you end up think­ing a lot about the things you wish to sig­nal is one of those in­sights that Michael Vas­sar tosses around like it’s no big deal, but if taken se­ri­ously, can be quite alarm­ing and profound. I must con­fess, ei­ther I am sig­nifi­cantly more aware of this fact than most, or, more likely, I have a rather strong form of this ten­dency. Be­ing nat­u­rally some­what nar­cis­sis­tic, I tend to have a bad habit of think­ing in di­alogue that is flat­ter­ing to my­self and my virtues. That is, flat­ter­ing to the things I wish to sig­nal. Like most peo­ple, I tend to want to sig­nal good things, and thus by think­ing about those good things I make them part of my iden­tity and be­come more likely to ac­tu­ally do them, no?

Well, some­times it works that way. But we like to sig­nal things in far mode, and it’s pretty easy to ra­tio­nal­ize hypocrisy when near mode work comes up. Most of the time, think­ing about the things I wish to sig­nal is a form of wire­head­ing. Hav­ing a di­alogue play­ing in my head about how I’m such a care­ful ra­tio­nal­ist is ar­guably more pleas­ant and un­doubt­edly a lot eas­ier than ac­tu­ally search­ing for and think­ing about my be­liefs’ or my episte­mol­ogy’s real weak points. And when I’m think­ing these self-glo­rify­ing thoughts all the time, you can bet it comes out in my in­ter­ac­tions with peo­ple.

You signal what you think; it’s not easy to hide. And here enters yet another distortion to mislead you: the feeling and the belief of self-justification is a very strong attractor in mindspace, and like many such attractors, its danger is amplified by confirmation bias. Imagine that you signal your cherished disposition to a friend. Say, that you work hard on important problems. If your friend agrees and says so, you get a warm fuzzy feeling of self-verification. The feedback loop gets more fuel. Which is great if you’re able to use that fuel to actually do the important work you want to signal that you do, but not so much if the fuel is instead used to light distracting fires for your mind to worship all day. If your friend disagrees and calls you out on it, roughly the same things happen and the same rules apply. The responses are varied, but often people get offended, and dwell on that offense and how untrue it was, or dwell on ways to prove their attacker wrong with far mode thoughts of personal glory, or dwell on past examples of hard and diligent work done that prove their signals are credible. Such dwelling also provides fuel, though it’s usually cruder, and even harder for the mind to use effectively.


Your thoughts are bent by what you wish to sig­nal. Choose your iden­tity care­fully.

IX: Sig­nals and Identity

You sig­nal what you wish to iden­tify with. You iden­tify with what you sig­nal. As if one po­ten­tially catas­trophic feed­back cy­cle was all your brain would provide you with. Always re­mem­ber, it’s never too difficult to shoot your own foot off.

X: Cas­cades: Thoughts, Sig­nals, Identity

It is inevitable that cascades will cause gravitation towards suboptimal attractors in mindspace. The strength of the currents theoretically tells you how hard you must steer in the other direction to hold a true course, but realistically, humans just aren’t that good at updating against known biases. You can see an iceberg and see its danger, but the whirlpools of confirmation bias have an annoying tendency to look like safe harbors, even when you’re stuck in them going ’round and ’round… and ’round.

You think what you sig­nal, you sig­nal what you think, you sig­nal your iden­tity, you iden­tify with your sig­nals, you think about your iden­tity, you iden­tify with your thoughts. This… is scary. The mind is leaky, and these in­ter­ac­tions are go­ing on con­stantly. Prim­ing, an­chor­ing, com­mit­ment effects, con­sis­tency effects, and of course the dreaded con­fir­ma­tion bias and pos­i­tive bias are all po­ten­tial dan­gers. In one way, it is no won­der that peo­ple don’t seem to change much. With such con­stant con­fir­ma­tion of iden­tity, it’s hard to see how one could change at all. But the un­trained mind is chaotic and of po­ten­tially in­finite malle­abil­ity, and cas­cades can be very pow­er­ful. Drift hap­pens, im­plicit nav­i­ga­tion is un­der­taken. One twin joins a cult, the other joins Less Wrong, which is pretty much the same thing I guess but bear with me.

I will boast that I be­lieve I have found a de­cent set of at­trac­tors in iden­ti­tys­pace to aim for. Thus, though I spend a lot of time wire­head­ing and not ac­tu­ally nav­i­gat­ing to­wards my nom­i­nal ideal dis­po­si­tions nor goals, I’m at least kinda aiming in what seems to be gen­er­ally the right di­rec­tion, as far as my limited ra­tio­nal­ity can see. I’m lucky in that re­gard.

But my map is not the best one to navigate by, and the vast majority of it is blank, including the most important parts: the parts where my goals and my ideal identity lie. I have but a vague sense of direction. Furthermore, nearly all of my map is the result of slovenly lines copied secondhand from the thirdhand notes of others, and a whole bunch of those scribbles seem to be legible only as “Here there be dragons.” I can only imagine how much harder it’d be if I were less aware of the limitations of my map, or if I had not chosen a set of destinations to navigate towards, or if I’d accidentally got sucked into a whirlpool only to be eaten by Charybdis. Or whatever the cognitive bias equivalent of Charybdis is; probably faith.

XI: Har­ness­ing the Winds of Change

Navigating identityspace is tricky business. Most people don’t try to. You ask them what kind of person they wish to be or what goals they wish to accomplish, and they either admit to not really thinking about it or quickly query their far mode module for something that sounds really sweet and inspiring. Those who do try tend to do so implicitly, by either carefully monitoring themselves and the way they change, or carefully monitoring their goals and in what order they are achieved. Often these are the kind of people who purposefully go out on Friday night with the intent of coming back home with a story they can tell for years to come. It’s difficult to track your life and your progress without having stories and milestones to go by. But even these people do not regularly try to directly control their course through identityspace. It’s not obvious how one would even attempt to do so. As mentioned earlier, the tools we rationalists do have are more naturally suited to navigating thoughtspace.

Attractors pull. I’ve generally dealt with that fact in a negative light, because I tend to think mindspace has at least 7,497 dimensions, and the coordinates of the set of optimal thoughts and therefore optimal actions are in a tiny corner of that vast space. Your thoughts are being deflected and ricocheted and pulled and pushed by a swarm of memes and biases and cached selves and anchors and all sorts of things that we just can’t keep track of, on timescales from seconds to decades. Some forces pulsate, others are erratic. You think you’re sailing along fine when some stupid thing like the giant cheesecake fallacy blows you oh so slightly off course and causes your entire AI career to go along a totally hopeless trajectory without your realizing it. Who wants their epic journey tripped up by something as stupidly named as the giant cheesecake fallacy? It’d be less pitiable to be eaten by Charybdis; at least that’s pretty epic.

Strong meta­ra­tional­ity would have kept that from hap­pen­ing: meta­ra­tional­ity keeps you from failing pre­dictably in spe­cial do­mains. But strong meta­ra­tional­ity can be aided. Hope­fully, the things we hap­pen to be aiming for in iden­ti­tys­pace are sta­ble at­trac­tors that don’t ran­domly push you away or shift around. This is not always the case: some goals are ephemeral, some are cycli­cal. My friend wants a girlfriend one month out of ev­ery two. That sure strains the re­la­tion­ship af­ter a month or three. With such a nat­u­rally chaotic mindspace, it’s difficult to be sure that what you’re aiming for is some­thing that will be there when you get to where you thought it was. You want to be the cool girl at the party, think­ing this is a ter­mi­nal value, but then you suc­ceed and be­come the cool girl at the party and it’s just not all that fulfilling. It’d have been bet­ter to nav­i­gate to­wards a differ­ent des­ti­na­tion.

Not that you can’t set sail for mul­ti­ple places at once: some­times you just want to get to the New World, not a par­tic­u­lar reef of the Ba­hamas. And some­times you may wish to travel to two en­tirely differ­ent places. Do you con­tra­dict your­self? Very well, then you con­tra­dict your­self, you are large, you con­tain mul­ti­tudes. But I think you’ll find it difficult to have two very differ­ent des­ti­na­tions in iden­ti­tys­pace to aim for all at once.

Alright then, enough, you’ve heard all the warn­ings, seen the scrib­bles that say ‘non-neg­ligible po­ten­tial for drag­ons’, and now want to do some pos­i­tive think­ing. How can we har­ness these vari­able winds of change on our jour­ney through iden­ti­tys­pace?

Part Two: Brain­storm­ing Meth­ods for Iden­tity Optimization

