Hero Licensing

I expect most readers to know me either as MIRI’s co-founder and the originator of a number of the early research problems in AI alignment, or as the author of Harry Potter and the Methods of Rationality, a popular work of Harry Potter fanfiction. I’ve described how I apply concepts in Inadequate Equilibria to various decisions in my personal life, and some readers may be wondering how I see these tying in to my AI work and my fiction-writing. And I do think these serve as useful case studies in inadequacy, exploitability, and modesty.

As a supplement to Inadequate Equilibria, then, the following is a dialogue that never took place—largely written in 2014, and revised and posted online in 2017.

i. Outperforming and the outside view

(The year is 2010. eliezer-2010 is sitting in a nonexistent park in Redwood City, California, working on his laptop. A person walks up to him.)

person: Pardon me, but are you Eliezer Yudkowsky?

eliezer-2010: I have that dubious honor.

person: My name is Pat; Pat Modesto. We haven’t met, but I know you from your writing online. What are you doing with your life these days?

eliezer-2010: I’m trying to write a nonfiction book on rationality. The blog posts I wrote on Overcoming Bias—I mean Less Wrong—aren’t very compact or edited, and while they had some impact, it seems like a book on rationality could reach a wider audience and have a greater impact.

pat: Sounds like an interesting project! Do you mind if I peek in on your screen and—

eliezer: (shielding the screen) —Yes, I mind.

pat: Sorry. Um… I did catch a glimpse and that didn’t look like a nonfiction book on rationality to me.

eliezer: Yes, well, work on that book was going very slowly, so I decided to try to write something else in my off hours, just to see if my general writing speed was slowing down to molasses or if it was this particular book that was the problem.

pat: It looked, in fact, like Harry Potter fanfiction. Like, I’m pretty sure I saw the words “Harry” and “Hermione” in configurations not originally written by J. K. Rowling.

eliezer: Yes, and I currently seem to be writing it very quickly. And it doesn’t seem to use up mental energy the way my regular writing does, either.

(A mysterious masked stranger, watching this exchange, sighs wistfully.)

eliezer: Now I’ve just got to figure out why my main book-writing project is going so much slower and taking vastly more energy… There are so many books I could write, if I could just write everything as fast as I’m writing this…

pat: Excuse me if this is a silly question. I don’t mean to say that Harry Potter fanfiction is bad—in fact I’ve read quite a bit of it myself—but as I understand it, according to your basic philosophy the world is currently on fire and needs to be put out. Now given that this is true, why are you writing Harry Potter fanfiction, rather than doing something else?

eliezer: I am doing something else. I’m writing a nonfiction rationality book. This is just in my off hours.

pat: Okay, but I’m asking why you are doing this particular thing in your off hours.

eliezer: Because my life is limited by mental energy far more than by time. I can currently produce this work very cheaply, so I’m producing more of it.

pat: What I’m trying to ask is why, even given that you can write Harry Potter fanfiction very cheaply, you are writing Harry Potter fanfiction. Unless it really is true that the only reason is that you need to observe yourself writing quickly in order to understand the way of quick writing, in which case I’d ask what probability you assign to learning that successfully. I’m skeptical that this is really the best way of using your off hours.

eliezer: I’m skeptical that you have correctly understood the concept of “off hours.” There’s a reason they exist, and the reason isn’t just that humans are lazy. I admit that Anna Salamon and Luke Muehlhauser don’t require off hours, but I don’t think they are, technically speaking, “humans.”

(The Mysterious Masked Stranger speaks for the first time.)

stranger: Excuse me.

eliezer: Who are you?

stranger: No one of consequence.

pat: And why are you wearing a mask?

stranger: Well, I’m definitely not a version of Eliezer from 2014 who’s secretly visiting the past, if that’s what you’re thinking.

pat: It’s fair to say that’s not what I’m thinking.

stranger: Pat and Eliezer-2010, I think the two of you are having some trouble communicating. The two of you actually disagree much more than you think.

pat & eliezer: Go on.

stranger: If you ask Eliezer of February 2010 why he’s writing Harry Potter and the Methods of Rationality, he will, indeed, respond in terms of how he expects writing Methods to positively impact his attempt to write The Art of Rationality, his attempt at a nonfiction how-to book. This is because we have—I mean, Eliezer has—a heuristic of planning on the mainline, which means that his primary justification for anything will be phrased in terms of how it positively contributes to a “normal” future timeline, not low-probability side-scenarios.

eliezer: Sure.

pat: Wait, isn’t your whole life—

eliezer: No.

stranger: Eliezer-2010 also has a heuristic that might be described as “never try to do anything unless you have a chance of advancing the Pareto frontier of the category.” In other words, if he’s expecting that some other work will be strictly better than his along all dimensions, it won’t occur to Eliezer-2010 that this is something he should spend time on. Eliezer-2010 thinks he has the potential to do things that advance Pareto frontiers, so why would he consider a project that wasn’t trying? So, off-hours or not, Eliezer wouldn’t be working on this story if he thought it would be strictly dominated along every dimension by any other work of fanfiction, or indeed, any other book.

pat: Um—

eliezer: I wouldn’t put it in exactly those terms.

stranger: Yes, because when you say things like that out loud, people start saying the word “arrogance” a lot, and you don’t fully understand the reasons. So you’ll cleverly dance around the words and try to avoid that branch of possible conversation.

pat: Is that true?

eliezer: It sounds to me like the Masked Stranger is trying to use the Barnum effect—like, most people would acknowledge that as a secret description of themselves if you asked them.

pat: … I really, really don’t think so.

eliezer: I’d be surprised if it were less than 10% of the population, seriously.

stranger: Eliezer, you’ll have a somewhat better understanding of human status emotions in 4 years. Though you’ll still only go there when you have a point to make that can’t be made any other way, which in turn will be unfortunately often as modest epistemology norms propagate through your community. But anyway, Pat, the fact that Eliezer-2010 has spent any significant amount of time on Harry Potter and the Methods of Rationality indeed lets you infer that Eliezer-2010 thinks Methods has a chance of being outstanding along some key dimension that interests him—of advancing the frontiers of what has ever been done—although he might hesitate to tell you that before he’s actually done it.

eliezer: Okay, yes, that’s true. I’m unhappy with the treatment of supposedly “intelligent” and/or “rational” characters in fiction and I want to see it done right just once, even if I have to write the story myself. I have an explicit thesis about what’s being done wrong and how to do it better, and if this were not the case then the prospect of writing Methods would not interest me as much.

stranger: (aside) There’s so much civilizational inadequacy in our worldview that we hardly even notice when we invoke it. Not that this is an alarming sign, since, as it happens, we do live in an inadequate civilization.

eliezer: (continuing to Pat) However, the reason I hold back from saying in advance what Methods might accomplish isn’t just modesty. I’m genuinely unsure that I can make Methods be what I think it can be. I don’t want to promise more than I can deliver. And since one should first plan along the mainline, if investigating the conditions under which I can write quickly weren’t a sufficiently important reason, I wouldn’t be doing this.

stranger: (aside) I have some doubts about that alleged justification in retrospect, though it wasn’t stupid.

pat: Can you say more about how you think your Harry Potter story will have outstandingly “intelligent” characters?

eliezer: I’d rather not? As a matter of literature, I should show, not tell, my thesis. Obviously it’s not that I think that my characters are going to learn fifty-seven languages because they’re super-smart. I think most attempts to create “intelligent characters” focus on surface qualities, like how many languages someone has learned, or they focus on stereotypical surface features the author has seen in other “genius” characters, like a feeling of alienation. If it’s a movie, the character talks with a British accent. It doesn’t seem like most such authors are aware of Vinge’s reasoning for why it should be hard to write a character that is smarter than the author. Like, if you know exactly where an excellent chessplayer would move on a chessboard, you must be at least that good at playing chess yourself, because you could always just make that move. For exactly the same reason, it’s hard to write a character that’s more rational than the author.

I don’t think the concept of “intelligence” or “rationality” that’s being used in typical literature has anything to do with discerning good choices or making good predictions. I don’t think there is a standard literary concept for characters who excel at cognitive optimization, distinct from characters who just win because they have a magic sword in their brains. And I don’t think most authors of “genius” characters respect their supposed geniuses enough to really put themselves in their shoes—to really feel what their inner lives would be like, and think beyond the first cliche that comes to mind. The author still sets themselves above the “genius,” gives the genius some kind of obvious stupidity that lets the author maintain emotional distance…

stranger: (aside) Most writers have a hard time conceptualizing a character who’s genuinely smarter than the author; most futurists have a hard time conceptualizing genuinely smarter-than-human AI; and indeed, people often neglect the hypothesis that particularly smart human beings will have already taken into account all the factors that they consider obvious. But with respect to sufficiently competent individuals making decisions that they can make on their own cognizance—as opposed to any larger bureaucracy or committee, or the collective behavior of a field—it is often appropriate to ask if they might be smarter than you think, or have better justifications than are obvious to you.

pat: Okay, but supposing you can write a book with intelligent characters, how does that help save the world, exactly?

eliezer: Why are you focusing on the word “intelligence” instead of “rationality”? But to answer your question, nonfiction writing conveys facts; fiction writing conveys experiences. I’m worried that my previous two years of nonfiction blogging haven’t produced nearly enough transfer of real cognitive skills. The hope is that writing about the inner experience of someone trying to be rational will convey things that I can’t easily convey with nonfiction blog posts.

stranger: (laughs)

eliezer: What is it, Masked Stranger?

stranger: Just… you’re so very modest.

eliezer: You’re saying this to me?

stranger: It’s sort of obvious from where I live now. So very careful not to say what you really hope Harry Potter and the Methods of Rationality will do, because you know people like Pat won’t believe it and can’t be persuaded to believe it.

pat: This guy is weird.

eliezer: (shrugging) A lot of people are.

pat: Let’s ignore him. So you’re presently investing a lot of hours—

eliezer: But surprisingly little mental energy.

stranger: Where I come from, we would say that you’re investing surprisingly few spoons.

pat: —but still a lot of hours, into crafting a Harry Potter story with, you hope, exceptionally rational characters. Which will cause some of your readers to absorb the experience of being rational. Which you think eventually ends up important to saving the world.

eliezer: Mm, more or less.

pat: What do you think the outside view would say about—

eliezer: Actually, I think I’m about out of time for today. (Starts to close his laptop.)

stranger: Wait. Please stick around. Can you take my word that it’s important?

eliezer: …all right. I suppose I don’t have very much experience with listening to Masked Strangers, so I’ll try that and see what happens.

pat: What did I say wrong?

stranger: You said that the conversation would never go anywhere helpful.

eliezer: I wouldn’t go that far. It’s true that in my experience, though, people who use the phrase “outside view” usually don’t offer advice that I think is true, and the conversations take up a lot of mental energy—spoons, you called them? But since I’m taking the Masked Stranger’s word on things and trying to continue, fine. What do you think the outside view has to say about the Methods of Rationality project?

pat: Well, I was just going to ask you to consider what the average story with a rational character in it accomplishes in the way of skill transfer to readers.

eliezer: I’m not trying to write an average story. The whole point is that I think the average story with a “rational” character is screwed up.

pat: So you think that your characters will be truly rational. But maybe those authors also think their characters are rational—

eliezer: (in a whisper to the Masked Stranger) Can I exit this conversation?

stranger: No. Seriously, it’s important.

eliezer: Fine. Pat, your presumption is wrong. These hypothetical authors making a huge effort to craft rational characters don’t actually exist. They don’t realize that it should take an effort to craft rational characters; they’re just regurgitating cliches about Straw Vulcans with very little self-perceived mental effort.

stranger: Or as I would phrase it: This is not one of the places where our civilization puts in enough effort that we should expect adequacy.

pat: Look, I don’t dispute that you can probably write characters more rational than those of the average author; I just think it’s important to remember, on each occasion, that being wrong feels just like being right.

stranger: Eliezer, please tell him what you actually think of that remark.

eliezer: You do not remember on each occasion that “being wrong feels just like being right.” You remember it on highly selective occasions where you are motivated to be skeptical of someone else. This feels just like remembering it on every relevant occasion, since, after all, every time you felt like you ought to think of it, you did. You just used a fully general counterargument, and the problem with arguments like that is that they provide no Bayesian discrimination between occasions where we are wrong and occasions where we are right. Like “but I have faith,” “being wrong feels just like being right” is as easy to say on occasions when someone is right as on occasions when they are wrong.

stranger: There is a stage of cognitive practice where people should meditate on how the map is not the territory, especially if it’s never before occurred to them that what feels like the universe of their immersion is actually their brain’s reconstructed map of the true universe. It’s just that Eliezer went through that phase while reading S. I. Hayakawa’s Language in Thought and Action at age eleven or so. Once that lesson is fully absorbed internally, invoking the map-territory distinction as a push against ideas you don’t like is (fully general) motivated skepticism.

pat: Leaving that aside, there’s this research showing that there’s a very useful technique called “reference class forecasting”—

eliezer: I am aware of this.

pat: And I’m wondering what reference class forecasting would say about your attempt to do good in the world via writing Harry Potter fanfiction.

eliezer: (to the Masked Stranger) Please can I run away?

stranger: No.

eliezer: (sighing) Okay, to take the question seriously as more than generic skepticism: If I think of the books which I regard as having well-done rational characters, their track record isn’t bad. A. E. van Vogt’s The World of Null-A was an inspiration to me as a kid. Null-A didn’t just teach me the phrase “the map is not the territory”; it was where I got the idea that people employing rationality techniques ought to be awesome and if they weren’t awesome that meant they were doing something wrong. There are a heck of a lot of scientists and engineers out there who were inspired by reading one of Robert A. Heinlein’s hymns in praise of science and engineering—yes, I know Heinlein had problems, but the fact remains.

stranger: I wonder what smart kids who grew up reading Harry Potter and the Methods of Rationality as twelve-year-olds will be like as adults…

pat: But surely van Vogt’s Null-A books are an exceptional case of books with rationalist characters. My first question is, what reason do you have to believe you can do that? And my second question is, even given that you write a rational character as inspiring as a character in a Heinlein novel, how much impact do you think one character like that has on an average reader, and how many people do you think will read your Harry Potter fanfiction in the best case?

eliezer: To be honest, it feels to me like you’re asking the wrong questions. Like, it would never occur to me to ask any of the questions you’re asking now, in the course of setting out to write Methods.

stranger: (aside) That’s true, by the way. None of these questions ever crossed my mind in the original timeline. I’m only asking them now because I’m writing the character of Pat Modesto. A voice like Pat Modesto is not a productive voice to have inside your head, in my opinion, so I don’t spontaneously wonder what he would say.

eliezer: To produce the best novel I can, it makes sense for me to ask what other authors were doing wrong with their rational characters, and what A. E. van Vogt was doing right. I don’t see how it makes sense for me to be nervous about whether I can do better than A. E. van Vogt, who had no better source to work with than Alfred Korzybski, decades before Daniel Kahneman was born. I mean, to be honest about what I’m really thinking: So far as I’m concerned, I’m already walking outside whatever so-called reference class you’re inevitably going to put me in—

pat: What?! What the heck does it mean to “walk outside” a reference class?

eliezer: —which doesn’t guarantee that I’ll succeed, because being outside of a reference class isn’t the same as being better than it. It means that I don’t draw conclusions from the reference class to myself. It means that I try, and see what happens.

pat: You think you’re just automatically better than every other author who’s ever tried to write rational characters?

eliezer: No! Look, thinking things like that is just not how the inside of my head is organized. There’s just the book I have in my head and the question of whether I can translate that image into reality. My mental world is about the book, not about me.

pat: But if the book you have in your head implies that you can do things at a very high percentile level, relative to the average fiction author, then it seems reasonable for me to ask why you already think you occupy that percentile.

stranger: Let me try and push things a bit further. Eliezer-2010, suppose I told you that as of the start of 2014, Methods succeeded to the following level. First, it has roughly half a million words, but you’re not finished writing it—

eliezer: Damn. That’s disappointing. I must have slowed down a lot, and definitely haven’t mastered the secret of whatever speed-writing I’m doing right now. I wonder what went wrong? Actually, why am I hypothetically continuing to write this book instead of giving up?

stranger: Because it’s the most reviewed work of Harry Potter fanfiction out of more than 500,000 stories on fanfiction.net, has organized fandoms in many universities and colleges, has received at least 15,000,000 page views on what is no longer the main referenced site, has been turned by fans into an audiobook via an organized project into which you yourself put zero effort, has been translated by fans into many languages, is famous among the Caltech/MIT crowd, has its own daily-trafficked subreddit with 6,000 subscribers, is often cited as the most famous or the most popular work of Harry Potter fanfiction, is considered by a noticeable fraction of its readers to be literally the best book they have ever read, and on at least one occasion inspired an International Mathematical Olympiad gold medalist to join the alliance and come to multiple math workshops at MIRI.

eliezer: I like this scenario. It is weird, and I like weird. I would derive endless pleasure from inflicting this state of affairs on reality and forcing people to come to terms with it.

stranger: Anyway, what probability would you assign to things going at least that well?

eliezer: Hm… let me think. Obviously this exact scenario is improbable, because conjunctive. But if we partition outcomes according to whether they rank at least this high or better in my utility function, and ask how much probability mass I put into outcomes like that, then I think it’s around 10%. That is, a success like this would come in at around the 90th percentile of my hopes.

pat: (incoherent noises)

eliezer: Oh. Oops. I forgot you were there.

pat: 90th percentile?! You mean you seriously think there’s a 1 in 10 chance that might happen?

eliezer: Ah, um…

stranger: Yes, he does. He wouldn’t have considered it in exactly those words if I hadn’t put it that way—not just because it’s ridiculously specific, but because Eliezer Yudkowsky doesn’t think in terms like that in advance of encountering the actual fact. He would consider it a “specific fantasy” that was threatening to drain away his emotional energy. But if it did happen, he would afterward say that he had achieved an outcome such that around 10% of his probability mass “would have been” in outcomes like that one or better, though he would worry about being hindsight-biased.

pat: I think a reasonable probability for an outcome like that would be more like 0.1%, and even that is being extremely generous!

eliezer: “Outside viewers” sure seem to tell me that a lot whenever I try to do anything interesting. I’m actually kind of surprised to hear you say that, though. I mean, my basic hypothesis for how the “outside view” thing operates is that it’s an expression of incredulity that can be leveled against any target by cherry-picking a reference class that predicts failure. One then builds an inescapable epistemic trap around that reference class by talking about the Dunning-Kruger effect and the dangers of inside-viewing. But trying to write Harry Potter fanfiction, even unusually good Harry Potter fanfiction, should sound to most people like it’s not high-status. I would expect people to react mainly to the part about the IMO gold medalist, even though the base rate for being an IMO gold medalist is higher than the base rate for authoring the most-reviewed Harry Potter fanfiction.

pat: Have you ever even tried to write Harry Potter fanfiction before? Do you know any of the standard awards that help publicize the best Harry Potter fan works or any of the standard sites that recommend them? Do you have any idea what the vast majority of the audience for Harry Potter fanfiction wants? I mean, just the fact that you’re publishing on FanFiction.Net is going to turn off a lot of people; the better stories tend to be hosted at ArchiveOfOurOwn.Org or on other, more specialized sites.

eliezer: Oh. I see. You do know about the pre-existing online Harry Potter fanfiction community, and you’re involved in it. You actually have a pre-existing status hierarchy built up in your mind around Harry Potter fanfiction. So when the Masked Stranger talks about Methods becoming the most popular Harry Potter fanfiction ever, you really do hear that as an overreaching status-claim, and you do that thing that makes an arbitrary proposition sound very improbable using the “outside view.”

pat: I don’t think the outside view, or reference class forecasting, can make arbitrary events sound very improbable. I think it makes events that won’t actually happen sound very improbable. As for my prior acquaintance with the community—how is that supposed to devalue my opinions? I have domain expertise. I have some actual idea of how many thousands of authors, including some very good authors, are trying to write Harry Potter fanfiction, only one of whom can author the most-reviewed story. And I’ll ask again, did you bother to acquire any idea of how this community actually works? Can you name a single annual award that’s given out in the Harry Potter fanfiction community?

eliezer: Um… not off the top of my head.

pat: Have you asked any of the existing top Harry Potter fanfiction authors to review your proposed plot, or your proposed story ideas? Like Nonjon, author of A Black Comedy? Or Sarah1281 or JBern or any of the other authors who have created multiple works widely acknowledged as excellent?

eliezer: I must honestly confess, although I’ve read those authors and liked their stories, that thought never even crossed my mind as a possible action.

pat: So you haven’t consulted anyone who knows more about Harry Potter fandom than you do.

eliezer: Nope.

pat: You have not written any prior Harry Potter fanfiction—not even a short story.

eliezer: Correct.

pat: You have made no previous effort to engage with the existing community of people who read or write Harry Potter fanfiction, or learn about existing gatekeepers on which the success of your story will depend.

eliezer: I’ve read some of the top previous Harry Potter fan works, since I enjoyed reading them. That, of course, is why the story idea popped into my head in the first place.

pat: What would you think of somebody who’d read a few popular physics books and wanted to be the world’s greatest physicist?

stranger: (aside) It appears to me that since the “outside view” as usually invoked is really about status hierarchy, signs of disrespecting the existing hierarchy will tend to provoke stronger reactions, and disrespectful-seeming claims that you can outperform some benchmark will be treated as much larger factors predicting failure than respectful-seeming claims that you can outperform an equivalent benchmark. It seems that physics crackpots feel relevantly analogous here because crackpots aren’t just epistemically misguided—that would be tragicomic, but it wouldn’t evoke the same feelings of contempt or disgust. What distinguishes physics crackpots is that they’re epistemically misguided in ways that disrespect high-status people on an important hierarchy—physicists. This feels like a relevant reference class for understanding other apparent examples of disrespectfully claiming to be high-status, because the evoked feeling is similar even if the phenomena differ in other ways.

eliezer: If you want to be a great physicist, you have to find the true law of physics, which is already out there in the world and not known to you. This isn’t something you can realistically achieve without working alongside other physicists, because you need an extraordinarily specific key to fit into this extraordinarily specific lock. In contrast, there are many possible books that would succeed over all past Harry Potter fanfiction, and you don’t have to build a particle accelerator to figure out which one to write.

stranger: I notice that when you try to estimate the difficulty of becoming the greatest physicist ever, Eliezer, you try to figure out the difficulty of the corresponding cognitive problem. It doesn’t seem to occur to you to focus on the fame.

pat: Eliezer, you seem to be deliberately missing the point of what’s wrong with reading a few physics books and then trying to become the world’s greatest physicist. Don’t you see that this error has the same structure as your Harry Potter pipe dream, even if the mistake’s magnitude is greater? That a critic would say the same sort of things to them as I am saying to you? Yes, becoming the world’s greatest physicist is much more difficult. But you’re trying to do this lesser impossible task in your off-hours because you think it will be easy.

eliezer: In the success scenario the Masked Stranger described, I would invest more effort into later chapters because it would have proven to be worth it.

stranger: Hey, Pat? Did you know that Eliezer hasn’t actually read the original Harry Potter books four through six, just watched the movies? And even after the book starts to take off, he still won’t get around to reading them.

pat: (incoherent noises)

eliezer: Um… look, I read books one through three when they came out, and later I tried reading book four. The problem was, I’d already read so much Harry Potter fanfiction by then that I was used to thinking of the Potterverse as a place for grown-up stories, and this produced a state change in my brain, so when I tried to read Harry Potter and the Goblet of Fire it didn’t feel right. But I’ve read enough fanfiction based in the Potterverse that I know the universe very well. I can tell you the name of Fleur Delacour’s little sister. In fact, I’ve read an entire novel about Gabrielle Delacour. I just haven’t read all the original books.

stranger: And when that’s not good enough, Eliezer consults the Harry Potter Wikia to learn relevant facts from canon. So you see he has all the knowledge he thinks he needs.

pat: (more incoherent noises)

eliezer: …why did you tell Pat that, Masked Stranger?

stranger: Because Pat will think it’s a tremendously relevant fact for predicting your failure. This illustrates a critical life lesson about the difference between making obeisances toward a field by reading works to demonstrate social respect, and trying to gather key knowledge from a field so you can advance it. The latter is necessary for success; the former is primarily important insofar as public relations with gatekeepers is important. I think that people who aren’t status-blind have a harder time telling the difference.

pat: It’s true that I feel a certain sense of indignation—of, indeed, J. K. Rowling and the best existing Harry Potter fanfiction writers being actively disrespected—when you tell me that Eliezer hasn’t read all of the canon books and that he thinks he’ll make up for it by consulting a wiki.

eliezer: Well, if I can try to repair some of the public relations damage: If I thought I could write children’s books as popular as J. K. Rowling’s originals, I would be doing that instead. J. K. Rowling is now a billionaire, plus she taught my little sister to enjoy reading. People who trivialize that as “writing children’s books” obviously have never tried to write anything themselves, let alone children’s books. Writing good children’s literature is hard—which is why Methods is going to be aimed at older readers. Contrary to the model you seem to be forming of me, I have a detailed model of my own limitations as well as my current capabilities, and I know that I am not currently a good enough author to write children’s books.

pat: I can imagine a state of affairs where I would estimate someone to have an excellent chance of writing the best Harry Potter fanfiction ever made, even after reading only the first three canon books—say, if Neil Gaiman tried it. (Though Neil Gaiman, I’m damned sure, would just read the original canon books.) Do you think you’re as good as Neil Gaiman?

eliezer: I don’t expect to ever have enough time to invest in writing to become as good as Neil Gaiman.

pat: I’ve read your Three Worlds Collide, which I think is your best story, and I’m aware that it was mentioned favorably by a Hugo-award-winning author, Peter Watts. But I don’t think Three Worlds Collide is on the literary level of, say, the fanfiction Always and Always Part 1: Backwards With Purpose. So what feats of writing have you already performed that make you think your project has a 10% chance of becoming the most-reviewed Harry Potter fanfiction in existence?

eliezer: What you’re cur­rently do­ing is what I call “de­mand­ing to see my hero li­cense.” Roughly, I’ve de­clared my in­ten­tion to try to do some­thing that’s in ex­cess of what you think matches my cur­rent so­cial stand­ing, and you want me to show that I already have enough sta­tus to do it.

pat: Ad hominem; you haven’t an­swered my ques­tion. I don’t see how, on the knowl­edge you presently have and on the ev­i­dence already available, you can pos­si­bly jus­tify giv­ing your­self a 10% prob­a­bil­ity here. But let me make sure, first, that we’re us­ing the same con­cepts. Is that “10%” sup­posed to be an ac­tual well-cal­ibrated prob­a­bil­ity?

eliezer: Yes, it is. If I in­ter­ro­gate my mind about bet­ting odds, I think I’d take your money at 20:1—like, if you offered me $20 against $1 that the fan­fic­tion wouldn’t suc­ceed—and I’d start feel­ing ner­vous about bet­ting the other way at $4 against $1, where you’ll pay out $4 if the fan­fic­tion suc­ceeds in ex­change for $1 if it doesn’t. Split­ting the differ­ence at some­where near the ge­o­met­ric mean, we could call that 9:1 odds.
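
The odds arithmetic here can be checked directly: splitting the difference between the 20:1 and 4:1 brackets at the geometric mean gives √(20 × 4) = √80 ≈ 8.9, or roughly 9:1 against, which matches the 10% probability under discussion. A minimal illustrative sketch (the function name is invented for this example):

```python
import math

def split_odds_brackets(upper, lower):
    """Split two odds-against brackets at their geometric mean
    and convert the result to a probability."""
    odds_against = math.sqrt(upper * lower)   # geometric mean of 20 and 4
    probability = 1 / (odds_against + 1)      # odds of n:1 against -> p = 1/(n+1)
    return odds_against, probability

odds, p = split_odds_brackets(20, 4)
# sqrt(80) is about 8.94 -- call it 9:1 against, a probability near 10%.
```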

pat: And do you think you’re well-cal­ibrated? Like, things you as­sign 9:1 odds should hap­pen 9 out of 10 times?

eliezer: Yes, I think I could make 10 statements of this difficulty, each of which I assign 90% probability, and be wrong on average about once. I haven’t tested my calibration as extensively as some people in the rationalist community, but the last time I took a CFAR calibration-testing sheet with 10 items on it and tried to put 90% credibility intervals on them, I got exactly one true value outside my intervals. Achieving okay calibration, with a bit of study and a bit of practice, is not anywhere near as surprising as outside-view types make it out to be.

stranger: (aside) Eliezer-2010 doesn’t use Pre­dic­tionBook as of­ten as Gw­ern Bran­wen, doesn’t play cal­ibra­tion party games as of­ten as Anna Sala­mon and Carl Shul­man, and didn’t join Philip Tet­lock’s study on su­per­pre­dic­tion. But I did make bets when­ever I had the op­por­tu­nity, and still do; and I try to set nu­meric odds when­ever I feel un­cer­tain and know I’ll find out the true value shortly.

I re­cently saw a cryp­tic set of state­ments on my re­friger­a­tor’s white­board about a “boiler” and var­i­ous strange num­bers and di­a­grams, which greatly con­fused me for five sec­onds be­fore I hy­poth­e­sized that they were notes about Brienne’s on­go­ing progress through the game Myst. Since I felt un­cer­tain, but could find out the truth soon, I spent thirty sec­onds try­ing to tweak my ex­act prob­a­bil­ity es­ti­mate of these be­ing notes for Brienne’s game. I started with a 90% “first pass” prob­a­bil­ity that they were Myst notes, which felt ob­vi­ously over­con­fi­dent, so I ad­justed that down to 80% or 4:1. Then I thought about how there might be un­fore­seen other com­pact ex­pla­na­tions for the cryp­tic words on the white­board and ad­justed down to 3:1. I then asked Brienne, and learned that it was in fact about her Myst game. I then did a thirty-sec­ond “up­date med­i­ta­tion” on whether per­haps it wasn’t all that prob­a­ble that there would be some other com­pact ex­pla­na­tion for the cryp­tic writ­ings; so maybe once the writ­ings seemed ex­plained away, I should have been less wor­ried about un­fore­seen com­pact al­ter­na­tives.

But I didn’t med­i­tate on it too long, be­cause it was just one sam­ple out of my life, and the point of ex­pe­riences like that is that you have a lot of them, and up­date a lit­tle each time, and even­tu­ally the ex­pe­rience ac­cu­mu­lates. Med­i­tat­ing on it as much as I’m cur­rently do­ing by writ­ing about it here would not be good prac­tice in gen­eral. (Those of you who have a ba­sic ac­quain­tance with neu­ral net­works and the delta rule should rec­og­nize what I’m try­ing to get my brain to do here.) I feel guilty about not bet­ting more sys­tem­at­i­cally, but given my limited sup­ply of spoons, this kind of in­for­mal and op­por­tunis­tic but reg­u­lar prac­tice is about all that I’m likely to ac­tu­ally do, as op­posed to feel guilty about not do­ing.

As I do my editing pass on this document, I can add a more recent data point: I assigned 5:1 odds against two characters on House of Cards having sex, and they did in fact have sex. That provides a bigger poke of adjustment against overconfidence. (By the delta rule, a bigger error warrants a bigger update.)
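
The “delta rule” mentioned in this aside is the standard error-proportional update from neural-network learning: nudge an estimate toward the observed outcome by a step proportional to the prediction error, so a big miss (the 5:1 House of Cards call) moves you more than a near hit (the Myst notes). A toy sketch; the learning rate is an arbitrary illustrative choice, not anything from the text:

```python
def delta_update(p, outcome, lr=0.1):
    """Delta rule: move a probability estimate toward the observed
    outcome (1 or 0) in proportion to the prediction error."""
    return p + lr * (outcome - p)

# Myst notes: final estimate 3:1 (p = 0.75) that the whiteboard was
# about Brienne's game; it was. Small error, small correction.
p_myst = delta_update(0.75, 1)

# House of Cards: 5:1 against (p = 1/6) the characters having sex;
# they did. Large error, much larger correction.
p_show = delta_update(1 / 6, 1)
```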

pat: But there are stud­ies show­ing that even af­ter be­ing warned about over­con­fi­dence, read­ing a study about over­con­fi­dence, and be­ing al­lowed to prac­tice a bit, over­con­fi­dence is re­duced but not elimi­nated—right?

eliezer: On av­er­age across all sub­jects, over­con­fi­dence is re­duced but not elimi­nated. That doesn’t mean that in ev­ery in­di­vi­d­ual sub­ject, over­con­fi­dence is re­duced but not elimi­nated.

pat: What makes you think you can do bet­ter than av­er­age?

eliezer: What makes me think I could do bet­ter than av­er­age is that I prac­ticed much more than those sub­jects, and I don’t think the level of effort put in by the av­er­age sub­ject, even a sub­ject who’s warned about over­con­fi­dence and given one prac­tice ses­sion, is the limit of hu­man pos­si­bil­ity. And what makes me think I ac­tu­ally suc­ceeded is that I checked. It’s not like there’s this “refer­ence class” full of over­con­fi­dent peo­ple who hal­lu­ci­nate prac­tic­ing their cal­ibra­tion and hal­lu­ci­nate dis­cov­er­ing that their cred­i­bil­ity in­ter­vals have started be­ing well-cal­ibrated.

stranger: I offer some rele­vant in­for­ma­tion that I learned from Sarah Con­stantin’s “Do Ra­tional Peo­ple Ex­ist?”: Stanovich and West (1997) found that 88% of study par­ti­ci­pants were sys­tem­at­i­cally over­con­fi­dent, which means that they couldn’t demon­strate over­con­fi­dence for the re­main­ing 12%. And this isn’t too sur­pris­ing; Stanovich and West (1998) note a num­ber of other tests where around 10% of un­der­grad­u­ates fail to ex­hibit this or that bias.

eliezer: Right. So the ques­tion is whether I can, with some prac­tice, make my­self as non-over­con­fi­dent as the top 10% of col­lege un­der­grads. This… does not strike me as a par­tic­u­larly har­row­ing challenge. It does re­quire effort. I have to con­sciously work to ex­pand my cred­i­bil­ity in­ter­vals past my first thought, and I ex­pect that col­lege stu­dents who out­perform have to do the same. The po­ten­tial to do bet­ter buys lit­tle of it­self; you have to ac­tu­ally put in the effort. But when I think I’ve ex­panded my in­ter­vals enough, I stop.

ii. Suc­cess fac­tors and be­lief sharing

pat: So you ac­tu­ally think that you’re well-cal­ibrated in as­sign­ing 9:1 odds for Meth­ods failing ver­sus suc­ceed­ing, to the ex­treme lev­els as­signed by the Masked Stranger. Are you go­ing to ar­gue that I ought to widen my con­fi­dence in­ter­vals for how much suc­cess Harry Pot­ter and the Meth­ods of Ra­tion­al­ity might en­joy, in or­der to avoid be­ing over­con­fi­dent my­self?

eliezer: No. That feels equiv­a­lent to ar­gu­ing that you shouldn’t as­sign a 0.1% prob­a­bil­ity to Meth­ods suc­ceed­ing be­cause 1,000:1 odds are too ex­treme. I was care­ful not to put it that way, be­cause that isn’t a valid ar­gu­ment form. That’s the kind of think­ing which leads to pa­pers like Ord, Hiller­brand, and Sand­berg’s “Prob­ing the Im­prob­a­ble,” which I think are wrong. In gen­eral, if there are 500,000 fan works, only one of which can have the most re­views, then you can’t pick out one of them at ran­dom and say that 500,000:1 is too ex­treme.

pat: I’m glad you agree with this ob­vi­ous point. And I’m not stupid; I rec­og­nize that your sto­ries are bet­ter than av­er­age. 90% of Harry Pot­ter fan­fic­tion is crap by Stur­geon’s Law, and 90% of the re­main­ing 10% is go­ing to be un­in­spired. That leaves maybe 5,000 fan works that you do need to se­ri­ously com­pete with. And I’ll even say that if you’re try­ing rea­son­ably hard, you can end up in the top 10% of that pool. That leaves a 1-in-500 chance of your be­ing the best Harry Pot­ter au­thor on fan­fic­tion.net. We then need to fac­tor in the other Harry Pot­ter fan­fic­tion sites, which have fewer works but much higher av­er­age qual­ity. Let’s say it works out to a 1-in-1,000 chance of yours be­ing the best story ever, which I think is ac­tu­ally very gen­er­ous of me, given that in a lot of ways you seem ridicu­lously un­pre­pared for the task—um, are you all right, Masked Stranger?

stranger: Ex­cuse me, please. I’m just dis­tracted by the thought of a world where I could go on fan­fic­tion.net and find 1,000 other sto­ries as good as Harry Pot­ter and the Meth­ods of Ra­tion­al­ity. I’m think­ing of that world and try­ing not to cry. It’s not that I can’t imag­ine a world in which your mod­est-sound­ing Fermi es­ti­mate works cor­rectly—it’s just that the world you’re de­scribing looks so very differ­ent from this one.

eliezer: Pat, I can see where you’re com­ing from, and I’m hon­estly not sure what I can say to you about it, in ad­vance of be­ing able to show you the book.

pat: What about what I tried to say to you? Does it in­fluence you at all? The method I used was rough, but I thought it was a very rea­son­able ap­proach to get­ting a Fermi es­ti­mate, and if you dis­agree with the con­clu­sion, I would like to know what fur­ther fac­tors make your own Fermi es­ti­mate work out to 10%.

stranger: You un­der­es­ti­mate the gap be­tween how you two think. It wouldn’t oc­cur to Eliezer to even con­sider any one of the fac­tors you named, while he was mak­ing his prob­a­bil­ity es­ti­mate of 10%.

eliezer: I have to ad­mit that that’s true.

pat: Then what do you think are the most im­por­tant fac­tors in whether you’ll suc­ceed?

eliezer: Hm. Good ques­tion. I’d say… whether I can main­tain my writ­ing en­thu­si­asm, whether I can write fast enough, whether I can pro­duce a story that’s re­ally as good as I seem to be en­vi­sion­ing, whether I’ll learn as I go and do bet­ter than I cur­rently en­vi­sion. Plus a large amount of un­cer­tainty in how peo­ple will ac­tu­ally re­act to the work I have in my head if I can ac­tu­ally write it.

pat: Okay, so that’s five key fac­tors. Let’s es­ti­mate prob­a­bil­ities for each one. Sup­pose we grant that there’s an 80% chance of your main­tain­ing en­thu­si­asm, a 50% chance that you’ll write fast enough—though you’ve had trou­ble with that be­fore; it took you fully a year to pro­duce Three Wor­lds Col­lide, if I re­call cor­rectly. A 25% prob­a­bil­ity that you can suc­cess­fully write down this in­cred­ible story that seems to be in your mind—I think this part al­most always fails for au­thors, and is al­most cer­tainly the part that will fail for you, but we’ll give it a one-quar­ter prob­a­bil­ity any­way, to be gen­er­ous and steel­man the whole ar­gu­ment. Then a 50% prob­a­bil­ity that you’ll learn fast enough to not be tor­pe­doed by the defic­its you already know you have. Now even with­out say­ing any­thing about au­di­ence re­ac­tions (re­ally, you’re go­ing to try to mar­ket cog­ni­tive sci­ence and for­mal episte­mol­ogy to Harry Pot­ter fans?), and even though I’m be­ing very gen­er­ous here, mul­ti­ply­ing these prob­a­bil­ities to­gether already gets us to the 5% level, which is less than the 10% you es­ti­mated—

stranger: Wrong.

pat: … Wrong? What do you mean?

stranger: Let’s con­sider the fac­tors that might be in­volved in your above rea­son­ing not be­ing wrong. Let us first es­ti­mate the prob­a­bil­ity that any given English-lan­guage sen­tence will turn out to be true. Then, we have to con­sider the prob­a­bil­ity that a given ar­gu­ment sup­port­ing some con­clu­sion will turn out to be free of fatal bi­ases, the prob­a­bil­ity that some­one who calls an ar­gu­ment “wrong” will be mis­taken—

pat: Eliezer, if you dis­agree with my con­clu­sions, then what’s wrong with my prob­a­bil­ities?

eliezer: Well, for a start: Whether I can main­tain my writ­ing speed is not con­di­tion­ally in­de­pen­dent of whether I main­tain my en­thu­si­asm. The au­di­ence re­ac­tion is not con­di­tion­ally in­de­pen­dent of whether I main­tain my writ­ing speed. Whether I’m learn­ing things is not con­di­tion­ally in­de­pen­dent of whether I main­tain my en­thu­si­asm. Your at­tempt to mul­ti­ply all those num­bers to­gether was gib­ber­ish as prob­a­bil­ity the­ory.
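
Pat’s arithmetic itself is easy to reproduce: 0.8 × 0.5 × 0.25 × 0.5 ≈ 5%. The step Eliezer is objecting to is not the multiplication but the pretense that the four factors are independent (and each strictly necessary). Illustrative only:

```python
# Pat's four "elicited" factors, multiplied as though they were
# independent and each strictly necessary -- the move being disputed.
factors = {
    "maintains enthusiasm": 0.80,
    "writes fast enough": 0.50,
    "story matches the vision": 0.25,
    "learns fast enough": 0.50,
}

joint = 1.0
for p in factors.values():
    joint *= p
# joint is about 0.05: Pat's "5% level," reached before even discounting
# audience reaction -- and meaningless as probability theory if, say,
# writing speed is strongly correlated with enthusiasm.
```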

pat: Okay, let’s ask about the prob­a­bil­ity that you main­tain writ­ing speed, given that you main­tain en­thu­si­asm—

eliezer: Do you think that your num­bers would have ac­tu­ally been that differ­ent, if that had been the ques­tion you’d ini­tially asked? I’m pretty sure that if you’d thought to phrase the ques­tion as “the prob­a­bil­ity given that...” and hadn’t first done it the other way, you would have elic­ited ex­actly the same prob­a­bil­ities from your­self, driven by the same bal­ance of men­tal forces—pick­ing some­thing low that sounds rea­son­able, or some­thing like that. And the prob­lem of con­di­tional de­pen­dence is far from the only rea­son I think “es­ti­mate these prob­a­bil­ities, which I shall mul­ti­ply to­gether” is just a rhetor­i­cal trick.

pat: A rhetor­i­cal trick?

eliezer: By pick­ing the right set of fac­tors to “elicit,” some­one can eas­ily make peo­ple’s “an­swers” come out as low as de­sired. As an ex­am­ple, see van Boven and Epley’s “The Un­pack­ing Effect in Eval­u­a­tive Judg­ments.” The prob­lem here is that peo­ple… how can I com­pactly phrase this… peo­ple tend to as­sign me­dian-tend­ing prob­a­bil­ities to any cat­e­gory you ask them about, so you can very strongly ma­nipu­late their prob­a­bil­ity dis­tri­bu­tions by pick­ing the cat­e­gories for which you “elicit” prob­a­bil­ities. Like, if you ask car me­chan­ics about the pos­si­ble causes of a car not start­ing—ex­pe­rienced car me­chan­ics, who see the real fre­quen­cies on a daily ba­sis!—and you ask them to as­sign a prob­a­bil­ity to “elec­tri­cal sys­tem failures” ver­sus ask­ing sep­a­rately for “dead bat­tery,” “al­ter­na­tor prob­lems,” and “spark plugs,” the un­packed cat­e­gories get col­lec­tively as­signed much greater to­tal prob­a­bil­ity than the packed cat­e­gory.

pat: But per­haps, when I’m un­pack­ing things that can po­ten­tially go wrong, I’m just com­pen­sat­ing for the plan­ning fal­lacy and how peo­ple usu­ally aren’t pes­simistic enough—

eliezer: Above all, the prob­lem with your rea­son­ing is that the stated out­come does not need to be a perfect con­junc­tion of those fac­tors. Not ev­ery­thing on your list has to go right si­mul­ta­neously for the whole pro­cess to work. You have omit­ted other dis­junc­tive path­ways to the same end. In your uni­verse, no­body ever tries harder or re­pairs some­thing af­ter it goes wrong! I have never yet seen an in­for­mal con­junc­tive break­down of an allegedly low prob­a­bil­ity in which the fi­nal con­clu­sion ac­tu­ally re­quired ev­ery one of the premises. That’s why I’m always care­ful to avoid the “I shall helpfully break down this propo­si­tion into a big con­junc­tion and ask you to as­sign each term a prob­a­bil­ity” trick.

Its only real use, at least in my ex­pe­rience, is that it’s a way to get peo­ple to feel like they’ve “as­signed” prob­a­bil­ities while you ma­nipu­late the setup to make the con­clu­sion have what­ever prob­a­bil­ity you like—it doesn’t have any role to play in hon­est con­ver­sa­tion. Out of all the times I’ve seen it used, to sup­port con­clu­sions I en­dorse as well as ones I re­ject, I’ve never once seen it ac­tu­ally work as a way to bet­ter dis­cover truth. I think it’s bad episte­mol­ogy that sticks around be­cause it sounds sort of rea­son­able if you don’t look too closely.

pat: I was work­ing with the fac­tors you picked out as crit­i­cal. Which spe­cific parts of my es­ti­mate do you dis­agree with?

stranger: (aside) The multiple-stage fallacy is an amazing trick, by the way. You can ask people to think of key factors themselves and still manipulate them really easily into giving answers that imply a low final answer, because so long as people go on listing things and assigning them probabilities, the product is bound to keep getting lower. Once we recognize that multiplying out more and more probabilities drives the product ever lower, we have to apply some internal compensating adjustment if we want to go on discriminating truth from falsehood.

You have effec­tively de­cided on the an­swer to most real-world ques­tions as “no, a pri­ori” by the time you get up to four fac­tors, let alone ten. It may be wise to list out many pos­si­ble failure sce­nar­ios and de­cide in ad­vance how to han­dle them—that’s Mur­phyjitsu—but if you start as­sign­ing “the prob­a­bil­ity that X will go wrong and not be han­dled, con­di­tional on ev­ery­thing pre­vi­ous on the list hav­ing not gone wrong or hav­ing been suc­cess­fully han­dled,” then you’d bet­ter be will­ing to as­sign con­di­tional prob­a­bil­ities near 1 for the kinds of pro­jects that suc­ceed some­times—pro­jects like Meth­ods. Other­wise you’re rul­ing out their suc­cess a pri­ori, and the “elic­i­ta­tion” pro­cess is a sham.

Frankly, I don’t think the un­der­ly­ing method­ol­ogy is worth re­pairing. I don’t think it’s worth both­er­ing to try to make a com­pen­sat­ing ad­just­ment to­ward higher prob­a­bil­ities. We just shouldn’t try to do “con­junc­tive break­downs” of a suc­cess prob­a­bil­ity where we make up lots and lots of failure fac­tors that all get in­for­mal prob­a­bil­ity as­sign­ments. I don’t think you can get good es­ti­mates that way even if you try to com­pen­sate for the pre­dictable bias.
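
The collapse the Stranger describes is just the arithmetic of repeated multiplication: even generous-sounding per-stage probabilities compound toward zero as the list grows. A quick illustration (the specific per-stage numbers are chosen for the example, not taken from the dialogue):

```python
# How fast a conjunctive estimate shrinks as factors are listed,
# even when each factor gets a generous-sounding probability.
results = {}
for per_stage in (0.9, 0.7):
    for n_stages in (4, 10):
        results[(per_stage, n_stages)] = per_stage ** n_stages

# 0.9^4  ~ 0.66      0.9^10 ~ 0.35
# 0.7^4  ~ 0.24      0.7^10 ~ 0.03
# Ten stages at a "reasonable" 70% each have already ruled the project
# out a priori -- which is why stage-wise estimates for projects that do
# sometimes succeed must use conditional probabilities near 1.
```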

eliezer: I did list my own key fac­tors, and I do feel doubt about whether they’ll work out. If I were re­ally con­fi­dent in them, I’d be as­sign­ing a higher prob­a­bil­ity than 10%. But be­sides hav­ing con­di­tional de­pen­den­cies, my fac­tors also have dis­junc­tive as well as con­junc­tive char­ac­ter; they don’t all need to go right and stay right si­mul­ta­neously. I could get far enough into Meth­ods to ac­quire an au­di­ence, sud­denly lose my writ­ing speed, and Meth­ods could still end up ul­ti­mately hav­ing a large im­pact.

pat: So how do you ma­nipu­late those fac­tors to ar­rive at an es­ti­mate of 10% prob­a­bil­ity of ex­treme suc­cess?

eliezer: I don’t. That’s not how I got my es­ti­mate. I found two brack­ets, 20:1 and 4:1, that I couldn’t nudge fur­ther with­out feel­ing ner­vous about be­ing over­con­fi­dent in one di­rec­tion or the other. In other words, the same way I gen­er­ated my set of ten cred­i­bil­ity in­ter­vals for CFAR’s cal­ibra­tion test. Then I picked some­thing in the log­a­r­ith­mic mid­dle.

pat: So you didn’t even try to list out all the fac­tors and then mul­ti­ply them to­gether?

eliezer: No.

pat: Then where the heck does your 10% figure ul­ti­mately come from? Say­ing that you got two other cryp­tic num­bers, 20:1 and 4:1, and picked some­thing in the ge­o­met­ric mid­dle, doesn’t re­ally an­swer the fun­da­men­tal ques­tion.

stranger: I be­lieve the tech­ni­cal term for the method­ol­ogy is “pul­ling num­bers out of your ass.” It’s im­por­tant to prac­tice cal­ibrat­ing your ass num­bers on cases where you’ll learn the cor­rect an­swer shortly af­ter­ward. It’s also im­por­tant that you learn the limits of ass num­bers, and don’t make un­re­al­is­tic de­mands on them by as­sign­ing mul­ti­ple ass num­bers to com­pli­cated con­di­tional events.

eliezer: I’d say I reached the es­ti­mate… by think­ing about the ob­ject-level prob­lem? By us­ing my do­main knowl­edge? By hav­ing already thought a lot about the prob­lem so as to load many rele­vant as­pects into my mind, then con­sult­ing my mind’s na­tive-for­mat prob­a­bil­ity judg­ment—with some prior prac­tice at bet­ting hav­ing already taught me a lit­tle about how to trans­late those na­tive rep­re­sen­ta­tions of un­cer­tainty into 9:1 bet­ting odds. I’m not sure what ad­di­tional in­for­ma­tion you want here. If there’s a way to pro­duce gen­uinely, demon­stra­bly su­pe­rior judg­ments us­ing some kind of break-it-down pro­ce­dure, I haven’t read about it in the liter­a­ture and I haven’t prac­ticed us­ing it yet. If you show me that you can pro­duce 9-out-of-10 cor­rect 90% cred­ible in­ter­vals, and your in­ter­vals are nar­rower than my in­ter­vals, and you got them us­ing a break-it-down pro­ce­dure, I’m happy to hear about it.

pat: So ba­si­cally your 10% prob­a­bil­ity comes from in­ac­cessible in­tu­ition.

eliezer: In this case? Yeah, pretty much. There’s just too lit­tle I can say to you about why Meth­ods might work, in ad­vance of be­ing able to show you what I have in mind.

pat: If the rea­son­ing in­side your head is valid, why can’t it be ex­plained to me?

eliezer: Be­cause I have pri­vate in­for­ma­tion, frankly. I know the book I’m try­ing to cre­ate.

pat: Eliezer, I think one of the key in­sights you’re ig­nor­ing here is that it should be a clue to you that you think you have in­com­mu­ni­ca­ble rea­sons for be­liev­ing your Meth­ods of Ra­tion­al­ity pro­ject can suc­ceed. Isn’t be­ing un­able to con­vince other peo­ple of their prospects of suc­cess just the sort of ex­pe­rience that crack­pots have when they set out to in­vent bad physics the­o­ries? Isn’t this in­com­mu­ni­ca­ble in­tu­ition just the sort of jus­tifi­ca­tion that they would try to give?

eliezer: But the method you’re us­ing—the method you’re call­ing “refer­ence class fore­cast­ing”—is too de­mand­ing to ac­tu­ally de­tect whether some­one will end up writ­ing the world’s most re­viewed Harry Pot­ter fan­fic­tion, whether that’s me or some­one else. The fact that a mod­est critic can’t be per­suaded isn’t Bayesian dis­crim­i­na­tion be­tween things that will suc­ceed and things that will fail; it isn’t ev­i­dence.

pat: On the con­trary, I would think it very rea­son­able if Non­jon told me that he in­tended to write the most-re­viewed Harry Pot­ter fan­fic­tion. Non­jon’s A Black Com­edy is widely ac­knowl­edged as one of the best sto­ries in the genre, Non­jon is well-placed in in­fluen­tial re­view­ing and recom­mend­ing com­mu­ni­ties—Non­jon might not be cer­tain to write the most re­viewed story ever, but he has le­gi­t­i­mate cause to think that he is one of the top con­tenders for writ­ing it.

stranger: It’s in­ter­est­ing how your es­ti­mates of suc­cess prob­a­bil­ities can be well sum­ma­rized by a sin­gle quan­tity that cor­re­lates very well with how re­spectable a per­son is within a sub­com­mu­nity.

pat: Ad­di­tion­ally, even if my de­mands were un­satis­fi­able, that wouldn’t nec­es­sar­ily im­ply a hole in my rea­son­ing. No­body who buys a lot­tery ticket can pos­si­bly satisfy me that they have good rea­son to be­lieve they’ll win, even the per­son who does win. But that doesn’t mean I’m wrong in as­sign­ing a low suc­cess prob­a­bil­ity to peo­ple who buy lot­tery tick­ets.

Non­jon may le­gi­t­i­mately have a 1-in-10 lot­tery ticket. Neil Gaiman might have 2-in-3. Yours, as I’ve said, is prob­a­bly more like 1-in-1,000, and it’s only that high ow­ing to your hav­ing already demon­strated some good writ­ing abil­ities. I’m not even pe­nal­iz­ing you for the fact that your plan of offer­ing ex­plic­itly ra­tio­nal char­ac­ters to the Harry Pot­ter fan­dom sounds very un­like ex­ist­ing top sto­ries. I might be un­duly in­fluenced by the fact that I like your pre­vi­ous writ­ing. But your claim to have in­com­mu­ni­ca­ble ad­vance knowl­edge that your lot­tery ticket will do bet­ter than this by a fac­tor of 100 seems very sus­pi­cious to me. Valid ev­i­dence should be com­mu­ni­ca­ble be­tween peo­ple.

stranger: “I be­lieve my­self to be writ­ing a book on eco­nomic the­ory which will largely rev­olu­tionize—not I sup­pose, at once but in the course of the next ten years—the way the world thinks about its eco­nomic prob­lems. I can’t ex­pect you, or any­one else, to be­lieve this at the pre­sent stage. But for my­self I don’t merely hope what I say,—in my own mind, I’m quite sure.” Lot­tery win­ner John May­nard Keynes to Ge­orge Bernard Shaw, while writ­ing The Gen­eral The­ory of Em­ploy­ment, In­ter­est and Money.

eliezer: Come to think of it, if I do suc­ceed with Meth­ods, Pat, you your­self could end up in an in­com­mu­ni­ca­ble epistemic state rel­a­tive to some­one who only heard about me later through my story. Some­one like that might sus­pect that I’m not a purely ran­dom lot­tery ticket win­ner, but they won’t have as much ev­i­dence to that effect as you. It’s a pretty in­ter­est­ing and fun­da­men­tal episte­molog­i­cal is­sue.

pat: I dis­agree. If you have valid in­tro­spec­tive ev­i­dence, then talk to me about your state of mind. On my view, you shouldn’t end up in a situ­a­tion where you up­date differ­ently on what your ev­i­dence “feels like to you” than what your ev­i­dence “sounds like to other peo­ple”; both you and other peo­ple should just do the sec­ond up­date.

stranger: No, in this sce­nario, in the pres­ence of other sus­pected bi­ases, two hu­man be­ings re­ally can end up in in­com­mu­ni­ca­ble epistemic states. You would know that “Eliezer wins” had gen­uinely been sin­gled out in ad­vance as a dis­t­in­guished out­come, but the sec­ond per­son would have to as­sess this sup­pos­edly dis­t­in­guished out­come with the benefit of hind­sight, and they may le­gi­t­i­mately never trust their hind­sight enough to end up in the same men­tal state as you.

You’re right, Pat, that com­pletely un­bi­ased agents who lack truly foun­da­tional dis­agree­ments on pri­ors should never end up in this situ­a­tion. But hu­mans can end up in it very eas­ily, it seems to me. Ad­vance pre­dic­tions have spe­cial au­thor­ity in sci­ence for a rea­son: hind­sight bias makes it hard to ever reach the same con­fi­dence in a pre­dic­tion that you only hear about af­ter the fact.

pat: Are you re­ally sug­gest­ing that the prevalence of cog­ni­tive bias means you should be more con­fi­dent that your own rea­son­ing is cor­rect? My episte­mol­ogy seems to be much more straight­for­ward than yours on these mat­ters. Ap­ply­ing the “valid ev­i­dence should be com­mu­ni­ca­ble” rule to this case: A hy­po­thet­i­cal per­son who saw Eliezer Yud­kowsky write the Less Wrong Se­quences, heard him men­tion that he as­signed a non-tiny prob­a­bil­ity to suc­ceed­ing in his Meth­ods am­bi­tions, and then saw him suc­ceed at Meth­ods should just re­al­ize what an ex­ter­nal ob­server would say to them about that. And what they’d say is: you just hap­pened to be the lucky or un­lucky rel­a­tives of a lot­tery ticket buyer who claimed in ad­vance to have psy­chic pow­ers, and then hap­pened to win.

eliezer: This sounds a lot like a difficulty I once sketched out for the “method of imag­i­nary up­dates.” Hu­man be­ings aren’t log­i­cally om­ni­scient, so we can’t be sure we’ve rea­soned cor­rectly about prior odds. In ad­vance of see­ing Meth­ods suc­ceed, I can see why you’d say that, on your wor­ld­view, if it did hap­pen then it would just be a 1000:1 lot­tery ticket win­ning. But if that ac­tu­ally hap­pened, then in­stead of say­ing, “Oh my gosh, a 1000:1 event just oc­curred,” you ought to con­sider in­stead that the method you used to as­sign prior prob­a­bil­ities was flawed. This is not true about a lot­tery ticket, be­cause we’re ex­tremely sure about how to as­sign prior prob­a­bil­ities in that case—and by the same to­ken, in real life nei­ther of us will ac­tu­ally see our friends win­ning the lot­tery.

pat: I agree that if it ac­tu­ally hap­pens, I would re­con­sider your pre­vi­ous ar­gu­ments rather than in­sist­ing that I was cor­rect about prior odds. I’m happy to con­cede this point be­cause I am very, very con­fi­dent that it won’t ac­tu­ally hap­pen. The ar­gu­ment against your suc­cess in Harry Pot­ter fan­fic­tion seems to me about as strong as any ar­gu­ment the out­side-view per­spec­tive might make.

stranger: Oh, we aren’t dis­put­ing that.

pat: You aren’t?

stranger: That’s the whole point, from my per­spec­tive. If mod­est episte­mol­ogy sounds per­sua­sive to you, then it’s triv­ial to in­vent a crush­ing ar­gu­ment against any pro­ject that in­volves do­ing some­thing im­por­tant that hasn’t been done in the past. Any pro­ject that’s try­ing to ex­ceed any va­ri­ety of civ­i­liza­tional in­ad­e­quacy is go­ing to be ruled out.

pat: Look. You can­not just waltz into a field and be­come its lead­ing figure on your first try. Modest episte­mol­ogy is just right about that. You are not sup­posed to be able to suc­ceed when the odds against you are like those I have de­scribed. Maybe out of a mil­lion con­tenders, some­one will suc­ceed by luck when the mod­est would have pre­dicted their failure, but if we’re bat­ting 999,999 out of 1,000,000 I say we’re do­ing pretty well. Un­less, of course, Eliezer would claim that the pro­ject of writ­ing this new Harry Pot­ter fan­fic­tion is so im­por­tant that a 0.0001% chance of suc­cess is still worth it—

eliezer: I never say that. Ever. If I ever say that you can just shoot me.

pat: Then why are you not re­spond­ing to the very clear, very stan­dard, very ob­vi­ous rea­sons I have laid out to think that you can­not do this? I mean, se­ri­ously, what is go­ing through your head right now?

eliezer: A hel­pless feel­ing of be­ing un­able to com­mu­ni­cate.

stranger: Grim amuse­ment.

pat: Then I’m sorry, Mr. Eliezer Yud­kowsky, but it seems to me that you are be­ing ir­ra­tional. You aren’t even try­ing to hide it very hard.

eliezer: (sigh­ing) I can imag­ine why it would look that way to you. I know how to com­mu­ni­cate some of the thought pat­terns and styles that I think have served me well, that I think gen­er­ate good pre­dic­tions and poli­cies. The other pat­terns leave me with this hel­pless feel­ing of know­ing but be­ing un­able to speak. This con­ver­sa­tion has en­tered a de­pen­dency on the part that I know but don’t know how to say.

pat: Why should I be­lieve that?

eliezer: If you think the part I did figure out how to say was im­pres­sive enough. That was hid­den pur­pose #7 of the Less Wrong Se­quences—to provide an earnest-to­ken of all the tech­niques I couldn’t show. All I can tell you is that ev­ery­thing you’re so busy wor­ry­ing about is not the cor­rect thing for me to be think­ing about. That your en­tire ap­proach to the prob­lem is wrong. It is not just that your ar­gu­ments are wrong. It is that they are about the wrong sub­ject mat­ter.

pat: Then what’s the right sub­ject mat­ter?

eliezer: That’s what I’m hav­ing trou­ble say­ing. I can say that you ought to dis­card all thoughts from your mind about com­pet­ing with oth­ers. The oth­ers who’ve come be­fore you are like probes, flashes of sound, ping­backs that give you an in­com­plete sonar of your prob­lem’s difficulty. Some­times you can swim past the parts of the prob­lem that tan­gled up other peo­ple and en­ter a new part of the ocean. Which doesn’t ac­tu­ally mean you’ll suc­ceed; all it means is that you’ll have very lit­tle in­for­ma­tion about which parts are difficult. There of­ten isn’t ac­tu­ally any need to think at all about the in­trin­sic qual­ities of your com­pe­ti­tion—like how smart or mo­ti­vated or well-paid they are—be­cause their work is laid out in front of you and you can just look at the qual­ity of the work.

pat: Like some­body who pre­dicts hy­per­in­fla­tion, say­ing all the while that they’re free to dis­re­gard con­ven­tional economists be­cause of how those idiot economists think you can triple the money sup­ply with­out get­ting in­fla­tion?

eliezer: I don’t re­ally know what goes through some­one else’s mind when that hap­pens to them. But I don’t think that tel­ling them to be more mod­est is a fix. Tel­ling some­body to shut up and re­spect aca­demics is not a gen­er­ally valid line of ar­gu­men­ta­tion be­cause it doesn’t dis­t­in­guish main­stream eco­nomics (which has rel­a­tively high schol­arly stan­dards) from main­stream nu­tri­tion sci­ence (which has rel­a­tively low schol­arly stan­dards). I’m not sure there is any ro­bust way out ex­cept by un­der­stand­ing eco­nomics for your­self, and to the ex­tent that’s true, I ought to ad­vise our hy­po­thet­i­cal ill-in­formed con­trar­ian to read a lot of eco­nomics blogs and try to fol­low the ar­gu­ments, or bet­ter yet read an eco­nomics text­book. I don’t think that peo­ple sit­ting around and anx­iously ques­tion­ing them­selves and won­der­ing whether they’re too au­da­cious is a route out of that par­tic­u­lar hole—let alone the hole on the other side of the fence.

pat: So your meta-level episte­mol­ogy is to re­main as ul­ti­mately in­ac­cessible to me as your ob­ject-level es­ti­mates.

eliezer: I can un­der­stand why you’re skep­ti­cal.

pat: I some­how doubt that you could pass an Ide­olog­i­cal Tur­ing Test on my point of view.

stranger: (smil­ing) Oh, I think I’d do pretty well at your ITT.

eliezer: Pat, I un­der­stand where your es­ti­mates are com­ing from, and I’m sure that your ad­vice is truly meant to be helpful to me. But I also see that ad­vice as an ex­pres­sion of a kind of anx­iety which is not at all like the things I need to ac­tu­ally think about in or­der to pro­duce good fic­tion. It’s a wasted mo­tion, a thought which pre­dictably will not have helped in ret­ro­spect if I suc­ceed. How good I am rel­a­tive to other peo­ple is just not some­thing I should spend lots of time ob­sess­ing about in or­der to make Meth­ods be what I want it to be. So my thoughts just don’t go there.

pat: This notion, “that thought will predictably not have helped in retrospect if I succeed,” seems very strange to me. It helps precisely because we can avoid wasting our effort on projects which are unlikely to succeed.

stranger: Sounds very rea­son­able. All I can say in re­sponse is: try do­ing it my way for a day, and see what hap­pens. No thoughts that pre­dictably won’t have been helpful in ret­ro­spect, in the case that you suc­ceed at what­ever you’re cur­rently try­ing to do. You might learn some­thing from the ex­pe­rience.

eliezer: The thing is, Pat… even an­swer­ing your ob­jec­tions and defend­ing my­self from your va­ri­ety of crit­i­cism trains what look to me like un­healthy habits of thought. You’re re­lentlessly fo­cused on me and my psy­chol­ogy, and if I en­gage with your ar­gu­ments and try to defend my­self, I have to fo­cus on my­self in­stead of my book. Which gives me that much less at­ten­tion to spend on sketch­ing out what Pro­fes­sor Quir­rell will do in his first Defense les­son. Worse, I have to defend my de­ci­sions, which can make them harder to change later.

stranger: Consider how much more difficult it will be for Eliezer to swerve and drop his other project, The Art of Rationality, if it fails after he has a number of (real or internal) conversations like this—conversations where he has to defend all the reasons why it’s okay for him to think that he might write a nonfiction bestseller about rationality. This is why it’s important to be able to casually invoke civilizational inadequacy. It’s important that people be allowed to try ambitious things without feeling like they need to make a great production out of defending their hero license.

eliezer: Right. And… the mental motions involved in worrying what a critic might think and trying to come up with defenses or concessions are different from the mental motions involved in being curious about some question, trying to learn the answer, and coming up with tests; and they’re different from how I think when I’m working on a problem in the world. The thing I should be thinking about is just the work itself.

pat: If you were just try­ing to write okay Harry Pot­ter fan­fic­tion for fun, I might agree with you. But you say you can pro­duce the best fan­fic­tion. That’s a whole differ­ent ball game—

eliezer: No! The per­spec­tive I’m try­ing to show you, the way it works in the in­side of my head, is that try­ing to write good fan­fic­tion, and the best fan­fic­tion, are not differ­ent ball games. There’s an ob­ject level, and you try to op­ti­mize it. You have an es­ti­mate of how well you can op­ti­mize it. That’s all there ever is.

iii. So­cial heuris­tics and prob­lem im­por­tance, tractabil­ity, and neglectedness

pat: A funny thought has just oc­curred to me. That thing where you’re try­ing to work out the the­ory of Friendly AI—

eliezer: Let me guess. You don’t think I can do that ei­ther.

pat: Well, I don’t think you can save the world, of course! (laughs) This isn’t a sci­ence fic­tion book. But I do see how you can rea­son­ably hope to make an im­por­tant con­tri­bu­tion to the the­ory of Friendly AI that ends up be­ing use­ful to what­ever group ends up de­vel­op­ing gen­eral AI. What’s in­ter­est­ing to note here is that the sce­nario the Masked Stranger de­scribed, the class of suc­cesses you as­signed 10% ag­gre­gate prob­a­bil­ity, is ac­tu­ally harder to achieve than that.

stranger: (smil­ing) It re­ally, re­ally, re­ally isn’t.

I’ll men­tion as an aside that talk of “Friendly” AI has been go­ing out of style where I’m from. We’ve started talk­ing in­stead in terms of “al­ign­ing smarter-than-hu­man AI with op­er­a­tors’ goals,” mostly be­cause “AI al­ign­ment” smacks less of an­thro­po­mor­phism than “friendli­ness.”

eliezer: Align­ment? Okay, I can work with that. But Pat, you’ve said some­thing I didn’t ex­pect you to say and gone out­side my cur­rent vi­sion of your Ide­olog­i­cal Tur­ing Test. Please con­tinue.

pat: Okay. Con­trary to what you think, my words are not fully gen­eral coun­ter­ar­gu­ments that I launch against just any­thing I in­tu­itively dis­like. They are based on spe­cific, visi­ble, third-party-as­sess­able fac­tors that make as­ser­tions be­liev­able or un­be­liev­able. If we leave aside in­ac­cessible in­tu­itions and just look at third-party-visi­ble fac­tors, then it is very clear that there’s a huge com­mu­nity of writ­ers who are ex­plic­itly try­ing to cre­ate Harry Pot­ter fan­fic­tion. This com­mu­nity is far larger and has far more ac­tivity—by ev­ery ob­jec­tive, third-party met­ric—than the com­mu­nity work­ing on is­sues re­lated to al­ign­ment or friendli­ness or what­ever. Be­ing the best writer in a much larger com­mu­nity is much more im­prob­a­ble than your mak­ing a sig­nifi­cant con­tri­bu­tion to AI al­ign­ment when al­most no­body else is work­ing on that prob­lem.

eliezer: The rel­a­tive size of ex­ist­ing com­mu­ni­ties that you’ve just de­scribed is not a fact that I re­gard as im­por­tant for as­sess­ing the rel­a­tive difficulty of “mak­ing a key con­tri­bu­tion to AI al­ign­ment” ver­sus “get­ting Meth­ods to the level de­scribed by the Masked Stranger.” The num­ber of com­pet­ing fan­fic­tion au­thors would be in­for­ma­tive to me if I hadn’t already checked out the Harry Pot­ter fan works with the best rep­u­ta­tions. If I can see how strong the com­pe­ti­tion is with my own eyes, then that screens off in­for­ma­tion about the size of the com­mu­nity from my per­spec­tive.

pat: But surely the size of the com­mu­nity should give you some pause re­gard­ing whether you should trust your felt in­tu­ition that you could write some­thing bet­ter than the product of so many other au­thors.

stranger: See, that meta-rea­son­ing right there? That’s the part I think is go­ing to com­pletely com­pro­mise how peo­ple think about the world if they try to rea­son that way.

eliezer: Would you ask a jug­gler, in the mid­dle of jug­gling, to sud­denly start wor­ry­ing about whether she’s in a refer­ence class of peo­ple who merely think that they’re good at catch­ing balls? It’s all just… wasted mo­tion.

stranger: So­cial anx­iety and over­ac­tive scrupu­los­ity.

eliezer: Not what brains look like when they’re think­ing pro­duc­tively.

pat: You’ve been claiming that the out­side view is a fully gen­eral coun­ter­ar­gu­ment against any claim that some­one with rel­a­tively low sta­tus will do any­thing im­por­tant. I’m ex­plain­ing to you why the method of trust­ing ex­ter­nally visi­ble met­rics and things that third par­ties can be con­vinced of says that you might make im­por­tant con­tri­bu­tions to AI al­ign­ment where no­body else is try­ing, but that you won’t write the most re­viewed Harry Pot­ter fan­fic­tion where thou­sands of other au­thors are com­pet­ing with you.

(A wan­der­ing by­stan­der sud­denly steps up to the group, in­ter­ject­ing.)

by­stan­der: Okay, no. I just can’t hold my tongue any­more.

pat: Huh? Who are you?

by­stan­der: I am the true voice of mod­esty and the out­side view!

I’ve been over­hear­ing your con­ver­sa­tion, and I’ve got to say—there’s no way it’s eas­ier to make an im­por­tant con­tri­bu­tion to AI al­ign­ment than it is to write pop­u­lar fan­fic­tion.

eliezer: … That’s true enough, but who…?

by­stan­der: The name’s Maude Stevens.

pat: Well, it’s nice to make your ac­quain­tance, Maude. I am always ea­ger to hear about my mis­takes, even from peo­ple with sus­pi­ciously rele­vant back­ground in­for­ma­tion who ran­domly walk up to me in parks. What is my er­ror on this oc­ca­sion?

maude: All three of you have been tak­ing for granted that if peo­ple don’t talk about “al­ign­ment” or “friendli­ness,” then their work isn’t rele­vant. But those are just words. When we take into ac­count ma­chine ethi­cists work­ing on real-world trol­ley dilem­mas, economists work­ing on tech­nolog­i­cal un­em­ploy­ment, com­puter sci­en­tists work­ing on Asi­mo­vian agents, and so on, the field of com­peti­tors all try­ing to make progress on these is­sues be­comes much, much larger.

pat: What? Is that true, Eliezer?

eliezer: Not to my knowl­edge—un­less Maude is here from the NSA to tell me about some very in­ter­est­ing be­hind-closed-doors re­search. The ex­am­ples Maude listed aren’t ad­dress­ing the tech­ni­cal is­sues I’ve been call­ing “friendli­ness.” Progress on those prob­lems doesn’t help you with spec­i­fy­ing prefer­ences that you can rea­son­ably ex­pect to pro­duce good out­comes even when the sys­tem is smarter than you and search­ing a much wider space of strate­gies than you can con­sider or check your­self. Or de­sign­ing sys­tems that are sta­ble un­der self-mod­ifi­ca­tion, so that good prop­er­ties of a seed AI are pre­served as the agent gets smarter.

maude: And your claim is that no one else in the world is smart enough to no­tice any of this?

eliezer: No, that’s not what I’m say­ing. Con­cerns like “how do we spec­ify cor­rect goals for par-hu­man AI?” and “what hap­pens when AI gets smart enough to au­to­mate AI re­search it­self?” have been around for a long time, sort of just hang­ing out and not visi­bly shift­ing re­search pri­ori­ties. So it’s not that the com­mu­nity of peo­ple who have ever thought about su­per­in­tel­li­gence is small; and it’s not that there are no on­go­ing lines of work on ro­bust­ness, trans­parency, or se­cu­rity in nar­row AI sys­tems that will in­ci­den­tally make it eas­ier to al­ign smarter-than-hu­man AI. But the com­mu­nity of peo­ple who go into work ev­ery day and make de­ci­sions about what tech­ni­cal prob­lems to tackle based on any ex­tended think­ing re­lated to su­per­in­tel­li­gent AI is very small.

maude: What I’m say­ing is that you’re jump­ing ahead and try­ing to solve the far end of the prob­lem be­fore the field is ready to fo­cus efforts there. The cur­rent work may not all bear di­rectly on su­per­in­tel­li­gence, but we should ex­pect all the sig­nifi­cant progress on AI al­ign­ment to be pro­duced by the in­tel­lec­tual heirs of the peo­ple presently work­ing on top­ics like drone war­fare and un­em­ploy­ment.

pat: (cau­tiously) I mean, if what Eliezer says is true—and I do think that Eliezer is hon­est, if of­ten, by my stan­dards, slightly crazy—then the state of the field in 2010 is just like it looks naively. There aren’t many peo­ple work­ing on top­ics re­lated to smarter-than-hu­man AI, and Eliezer’s group and the Oxford Fu­ture of Hu­man­ity In­sti­tute are the only ones with a rea­son­able claim to be work­ing on AI al­ign­ment. If Eliezer says that the prob­lems of craft­ing a smarter-than-hu­man AI to not kill ev­ery­one are not of a type with cur­rent ma­chine ethics work, then I can buy that as plau­si­ble, though I’d want to hear oth­ers’ views on the is­sue be­fore reach­ing a firm con­clu­sion.

maude: But Eliezer’s field of com­pe­ti­tion is far wider than just the peo­ple writ­ing ethics pa­pers. Any­one work­ing in ma­chine learn­ing, or in­deed in any branch of com­puter sci­ence, might end up con­tribut­ing to AI al­ign­ment.

eliezer: Um, that would cer­tainly be great news to hear. The win state here is just “the prob­lem gets solved”—

pat: Wait a sec­ond. I think you’re leav­ing the realm of what’s third-party ob­jec­tively ver­ifi­able, Maude. That’s like say­ing that Eliezer has to com­pete with Stephen King be­cause Stephen King could in prin­ci­ple de­cide to start writ­ing Harry Pot­ter fan­fic­tion. If all these other peo­ple in AI are not work­ing on the par­tic­u­lar prob­lems Eliezer is work­ing on, whereas the broad com­mu­nity of Harry Pot­ter fan­fic­tion writ­ers is com­pet­ing di­rectly with Eliezer on fic­tion-writ­ing, then any rea­son­able third party should agree that the out­side view coun­ter­ar­gu­ment ap­plies very strongly to the sec­ond case, and much more weakly (if at all) to the first.

maude: So now fan­fic­tion is sup­posed to be harder than sav­ing the world? Se­ri­ously? Just no.

eliezer: Pat, while I dis­agree with Maude’s ar­gu­ments, she does have the ad­van­tage of ra­tio­nal­iz­ing a true con­clu­sion rather than a false con­clu­sion. AI al­ign­ment is harder.

pat: I’m not ex­pect­ing you to solve the whole thing. But mak­ing a sig­nifi­cant con­tri­bu­tion to a suffi­ciently spe­cial­ized cor­ner of academia that very few other peo­ple are ex­plic­itly work­ing on should be eas­ier than be­com­ing the sin­gle most suc­cess­ful figure in a field that lots of other peo­ple are work­ing in.

maude: This is ridicu­lous. Fan­fic­tion writ­ers are sim­ply not the same kind of com­pe­ti­tion as ma­chine learn­ing ex­perts and pro­fes­sors at lead­ing uni­ver­si­ties, any of whom could end up mak­ing far more im­pres­sive con­tri­bu­tions to the cut­ting edge in AGI re­search.

eliezer: Um, ad­vanc­ing AGI re­search might be im­pres­sive, but un­less it’s AGI al­ign­ment it’s—

pat: Have you ever tried to write fic­tion your­self? Try it. You’ll find it’s a heck of a lot harder than you seem to imag­ine. Be­ing good at math does not qual­ify you to waltz in and—

(The Masked Stranger raises his hand and snaps his fingers. All time stops. Then the Masked Stranger looks over at Eliezer-2010 ex­pec­tantly.)

eliezer: Um… Masked Stranger… do you have any idea what’s go­ing on here?

stranger: Yes.

eliezer: Thank you for that con­cise and in­for­ma­tive re­ply. Would you please ex­plain what’s go­ing on here?

stranger: Pat is thor­oughly ac­quainted with the sta­tus hi­er­ar­chy of the es­tab­lished com­mu­nity of Harry Pot­ter fan­fic­tion au­thors, which has its own rit­u­als, prizes, poli­tics, and so on. But Pat, for the sake of liter­ary hy­poth­e­sis, lacks an in­stinc­tive sense that it’s au­da­cious to try to con­tribute work to AI al­ign­ment. If we in­ter­ro­gated Pat, we’d prob­a­bly find that Pat be­lieves that al­ign­ment is cool but not as­tro­nom­i­cally im­por­tant, or that there are many other ex­is­ten­tial risks of equal stature. If Pat be­lieved that long-term civ­i­liza­tional out­comes de­pended mostly on solv­ing the al­ign­ment prob­lem, as you do, then he would prob­a­bly as­sign the prob­lem more in­stinc­tive pres­tige—hold­ing con­stant ev­ery­thing Pat knows about the ob­ject-level prob­lem and how many peo­ple are work­ing on it, but rais­ing the prob­lem’s felt sta­tus.

Maude, mean­while, is the re­verse: not ac­quainted with the poli­ti­cal minu­tiae and sta­tus dy­nam­ics of Harry Pot­ter fans, but very sen­si­tive to the im­por­tance of the al­ign­ment prob­lem. So to Maude, it’s in­tu­itively ob­vi­ous that mak­ing tech­ni­cal progress on AI al­ign­ment re­quires a much more im­pres­sive hero li­cense than writ­ing the world’s lead­ing Harry Pot­ter fan­fic­tion. Pat doesn’t see it that way.

eliezer: But ideas in AI al­ign­ment have to be for­mal­ized; and the for­mal­ism needs to satisfy many differ­ent re­quire­ments si­mul­ta­neously, with­out much room for er­ror. It’s a very ab­stract, very highly con­strained task be­cause it has to put an in­for­mal prob­lem into the right for­mal struc­ture. When writ­ing fic­tion, yes, I have to jug­gle things like plot and char­ac­ter and ten­sion and hu­mor, but that’s all still a much less con­strained cog­ni­tive prob­lem—

stranger: That kind of con­sid­er­a­tion isn’t likely to en­ter Pat or Maude’s minds.

eliezer: Does it mat­ter that I in­tend to put far more effort into my re­search than into fic­tion-writ­ing? If Meth­ods doesn’t work the first time, I’ll just give up.

stranger: Sorry. Whether or not you’re al­lowed to do high-sta­tus things can’t de­pend on how much effort you say you in­tend to put in. Be­cause “any­one could say that.” And then you couldn’t slap down pre­tenders—which is ter­rible.

eliezer: …… Is there some kind of or­ga­niz­ing prin­ci­ple that makes all of this make sense?

stranger: I think the key con­cepts you need are civ­i­liza­tional in­ad­e­quacy and sta­tus hi­er­ar­chy main­te­nance.

eliezer: En­lighten me.

stranger: You know how Pat ended up calcu­lat­ing that there ought to be 1,000 works of Harry Pot­ter fan­fic­tion as good as Meth­ods? And you know how I got all weepy vi­su­al­iz­ing that world? Imag­ine Maude as mak­ing a similar mis­take. There’s a world in which some scruffy out­sider like you wouldn’t be able to es­ti­mate a sig­nifi­cant chance of mak­ing a ma­jor con­tri­bu­tion to AI al­ign­ment, let alone help found the field, be­cause peo­ple had been try­ing to do se­ri­ous tech­ni­cal work on it since the 1960s, and were putting sub­stan­tial thought, in­ge­nu­ity, and care into mak­ing sure they were work­ing on the right prob­lems and us­ing solid method­olo­gies. Func­tional de­ci­sion the­ory was de­vel­oped in 1971, two years af­ter Robert Noz­ick’s pub­li­ca­tion of “New­comb’s Prob­lem and Two Prin­ci­ples of Choice.” Every­one ex­pects hu­mane val­ues to have high Kol­mogorov com­plex­ity. Every­one un­der­stands why, if you pro­gram an ex­pected util­ity max­i­mizer with util­ity func­tion 𝗨 and what you re­ally meant is 𝘝, the 𝗨-max­i­mizer has a con­ver­gent in­stru­men­tal in­cen­tive to de­ceive you into be­liev­ing that it is a 𝘝-max­i­mizer. No­body as­sumes you can “just pull the plug” on some­thing much smarter than you are. And the world’s other large-scale ac­tivi­ties and in­sti­tu­tions all scale up similarly in com­pe­tence.

We could call this the Adequate World, and contrast it to the way things actually are. The Adequate World has a property that we could call inexploitability; or inexploitability-by-Eliezer. We can compare it to how you can’t predict a 5% change in Microsoft’s stock price over the next six months—take that property of S&P 500 stocks, and scale it up to a whole planet whose experts you can’t surpass, where you can’t find any knowable mistake. They still make mistakes in the Adequate World, because they’re not perfect. But they’re smarter and nicer at the group level than Eliezer Yudkowsky, so you can’t know which things are epistemic or moral mistakes, just like you can’t know whether Microsoft’s equity price is mistaken on the high side or the low side on average.

eliezer: Okay… I can see how Maude’s con­clu­sion would make sense in the Ad­e­quate World. But how does Maude rec­on­cile the ar­gu­ments that reach that con­clu­sion with the vastly differ­ent world we ac­tu­ally live in? It’s not like Maude can say, “Look, it’s ob­vi­ously already be­ing han­dled!” be­cause it ob­vi­ously isn’t.

stranger: Sup­pose that you have an in­stinct to reg­u­late sta­tus claims, to make sure no­body gets more sta­tus than they de­serve.

eliezer: Okay…

stranger: This gives rise to the be­hav­ior you’ve been call­ing “hero li­cens­ing.” Your cur­rent model is that peo­ple have read too many nov­els in which the pro­tag­o­nist is born un­der the sign of a su­per­nova and car­ries a leg­endary sword, and they don’t re­al­ize real life is not like that. Or they as­so­ci­ate the deeds of Ein­stein with the pres­tige that Ein­stein has now, not re­al­iz­ing that prior to 1905, Ein­stein had no visi­ble aura of des­tiny.

eliezer: Right.

stranger: Wrong. Your model of heroic sta­tus is that it ought to be a re­ward for heroic ser­vice to the tribe. You think that while of course we should dis­cour­age peo­ple from claiming this heroic sta­tus with­out hav­ing yet served the tribe, no one should find it in­tu­itively ob­jec­tion­able to merely try to serve the tribe, as long as they’re care­ful to dis­claim that they haven’t yet served it and don’t claim that they already de­serve the rele­vant sta­tus boost.

eliezer: … this is wrong?

stranger: It’s fine for “sta­tus-blind” peo­ple like you, but it isn’t how the stan­dard-is­sue sta­tus emo­tions work. Sim­ply put, there’s a level of sta­tus you need in or­der to reach up for a given higher level of sta­tus; and this is a rel­a­tively ba­sic feel­ing for most peo­ple, not some­thing that’s trained into them.

eliezer: But be­fore 1905, Ein­stein was a patent ex­am­iner. He didn’t even get a PhD un­til 1905. I mean, Ein­stein wasn’t a typ­i­cal patent ex­am­iner and he no doubt knew that him­self, but some­one on the out­side look­ing at just his CV—

stranger: We aren’t talk­ing about an epistemic pre­dic­tion here. This is just a fact about how hu­man sta­tus in­stincts work. Hav­ing a cer­tain prob­a­bil­ity of writ­ing the most pop­u­lar Harry Pot­ter fan­fic­tion in the fu­ture comes with a cer­tain amount of sta­tus in Pat’s eyes. Hav­ing a cer­tain prob­a­bil­ity of mak­ing im­por­tant progress on the AI al­ign­ment prob­lem in the fu­ture comes with a cer­tain amount of sta­tus in Maude’s eyes. Since your cur­rent sta­tus in the rele­vant hi­er­ar­chy seems much lower than that, you aren’t al­lowed to en­dorse the rele­vant prob­a­bil­ity as­sign­ments or act as though you think they’re cor­rect. You are not al­lowed to just try it and see what hap­pens, since that already im­plies that you think the prob­a­bil­ity is non-tiny. The very act of af­fili­at­ing your­self with the pos­si­bil­ity is sta­tus-over­reach­ing, re­quiring a slap­down. Other­wise any old per­son will be al­lowed to claim too much sta­tus—which is ter­rible.

eliezer: Okay. But how do we get from there to delu­sions of civ­i­liza­tional ad­e­quacy?

stranger: Back­ward chain­ing of ra­tio­nal­iza­tions, per­haps mixed with some amount of just-world and sta­tus-quo bias. An economist would say “What?” if you pre­sented an ar­gu­ment say­ing you ought to be able to dou­ble your money ev­ery year by buy­ing and sel­l­ing Microsoft stock in some sim­ple pat­tern. The economist would then, quite rea­son­ably, ini­ti­ate a men­tal search to try to come up with some way that your al­gorithm doesn’t do what you thought it did, a hid­den risk it con­tained, a way to pre­serve the idea of an in­ex­ploitable mar­ket in equities.

Pat tries to pre­serve the idea of an in­ex­ploitable-by-Eliezer mar­ket in fan­fic­tion (since on a gut level it feels to him like you’re too low-sta­tus to be able to ex­ploit the mar­ket), and comes up with the idea that there are a thou­sand other peo­ple who are writ­ing equally good Harry Pot­ter fan­fic­tion. The re­sult is that Pat hy­poth­e­sizes a world that is ad­e­quate in the rele­vant re­spect. Writ­ers’ efforts are cheaply con­verted into sto­ries so pop­u­lar that it’s just about hu­manly im­pos­si­ble to fore­see­ably write a more pop­u­lar story; and the world’s ad­e­quacy in other re­gards en­sures that any out­siders who do have a shot at out­perform­ing the mar­ket, like Neil Gaiman, will already be rich in money, es­teem, etc.

And the phe­nomenon gen­er­al­izes. If some­one be­lieves that you don’t have enough sta­tus to make bet­ter pre­dic­tions than the Euro­pean Cen­tral Bank, they’ll have to be­lieve that the Euro­pean Cen­tral Bank is rea­son­ably good at its job. Tra­di­tional eco­nomics doesn’t say that the Euro­pean Cen­tral Bank has to be good at its job—an economist would tell you to look at in­cen­tives, and that the de­ci­sion­mak­ers don’t get paid huge bonuses if Europe’s econ­omy does bet­ter. For the sta­tus or­der to be pre­served, how­ever, it can’t be pos­si­ble for Eliezer to out­smart the Euro­pean Cen­tral Bank. For the world’s sta­tus or­der to be un­challenge­able, it has to be right and wise; for it to be right and wise, it has to be in­ex­ploitable. A gut-level ap­pre­ci­a­tion of civ­i­liza­tional in­ad­e­quacy is a pow­er­ful tool for dis­pel­ling mirages like hero li­cens­ing and mod­est episte­mol­ogy, be­cause when mod­est episte­mol­ogy back­ward-chains its ra­tio­nal­iza­tions for why you can’t achieve big things, it ends up as­sert­ing ad­e­quacy.

eliezer: Civ­i­liza­tion could be in­ex­ploitable in these ar­eas with­out be­ing ad­e­quate, though; and it sounds like you’re say­ing that Pat and Maude mainly care about in­ex­ploita­bil­ity.

stranger: You could have a world where poor in­cen­tives re­sult in al­ign­ment re­search visi­bly be­ing ne­glected, but where there’s no re­al­is­tic way for well-in­formed and mo­ti­vated in­di­vi­d­u­als to strate­gi­cally avoid those in­cen­tives with­out be­ing out­com­peted in some other in­dis­pens­able re­source. You could also have a world that’s in­ex­ploitable to you but ex­ploitable to many other peo­ple. How­ever, as­sert­ing ad­e­quacy reaf­firms the rele­vant sta­tus hi­er­ar­chy in a much stronger and more air­tight way. The no­tion of an Ad­e­quate World more closely matches the in­tu­itive sense that the world’s most re­spectable and au­thor­i­ta­tive peo­ple are just un­touch­able—too well-or­ga­nized, well-in­formed, and well-in­ten­tioned for just any­body to spot Moloch’s hand­i­work, whether or not they can do any­thing about it. And af­firm­ing ad­e­quacy in a way that sounds vaguely plau­si­ble gen­er­ally re­quires less de­tailed knowl­edge of microe­co­nomics, of the in­di­vi­d­u­als try­ing to ex­ploit the mar­ket, and of the spe­cific prob­lems they’re try­ing to solve than is the case for ap­peals to in­ex­ploitable in­ad­e­quacy.

Civ­i­liza­tional in­ad­e­quacy is the ba­sic rea­son why the world as a whole isn’t in­ex­ploitable in the fash­ion of short-term equity price changes. The mod­est view, roughly, is that the world is in­ex­ploitable as far as you can pre­dict, be­cause you can never know­ably know bet­ter than the ex­perts.

eliezer: I… sort of get it? I still don’t un­der­stand Maude’s ac­tual thought pro­cess here.

stranger: Let’s watch, then.

(The Masked Stranger raises his hands and snaps his fingers again, restart­ing time.)

pat: —take over liter­a­ture be­cause mere fic­tion writ­ers are stupid.

maude: My good fel­low, please take a mo­ment to con­sider what you’re propos­ing. If the AI al­ign­ment prob­lem were re­ally as im­por­tant as Eliezer claims, would he re­ally be one of the only peo­ple work­ing on it?

pat: Well, it sure looks like he is.

maude: Then the prob­lem can’t be as im­por­tant as he claims. The al­ter­na­tive is that a lone crank has iden­ti­fied an im­por­tant is­sue that he and very few oth­ers are work­ing on; and that means ev­ery­one else in his field is an idiot. Who does Eliezer think he is, to defy the aca­demic con­sen­sus to the effect that AI al­ign­ment isn’t an in­ter­est­ing idea worth work­ing on?

pat: I mean, there are all sorts of bar­ri­ers I could imag­ine a typ­i­cal aca­demic run­ning into if they wanted to work on AI al­ign­ment. Maybe it’s just hard to get aca­demic grants for this kind of work.

maude: If it’s hard to get grants, then that’s be­cause the grant-mak­ers cor­rectly rec­og­nize that this isn’t a pri­or­ity prob­lem.

pat: So now the state of aca­demic fund­ing is said to be so wise that peo­ple can’t find ne­glected re­search op­por­tu­ni­ties?

stranger: What per­son with grant-mak­ing power gets paid less in the wor­lds where al­ign­ment is im­por­tant and yet ne­glected? If no one loses their bonuses or in­curs any other per­cep­ti­ble cost, then you’re done. There’s no mys­tery here.

maude: All of the ev­i­dence is perfectly con­sis­tent with the hy­poth­e­sis that there are no aca­demic grants on offer be­cause the grant­mak­ers have made a thought­ful and in­formed de­ci­sion that this is a pseudo-prob­lem.

eliezer: I appreciate Pat’s defense, but I think I can better speak to this. Issues like intelligence explosion and the idea that there’s an important problem to be solved in AI goal systems, as I mentioned earlier, aren’t original to me. They’re reasonably widely known, and people at all levels of seniority are often happy to talk about them face-to-face, though there’s disagreement about the magnitude of the risk and about what kinds of efforts are likeliest to be useful for addressing it. You can find these issues discussed in the most commonly used undergrad textbook in AI, Artificial Intelligence: A Modern Approach. You can’t claim that there’s a consensus among researchers that this is not an important problem.

maude: Then the grant­mak­ers prob­a­bly care­fully looked into the prob­lem and de­ter­mined that the best way to pro­mote hu­man­ity’s long-term welfare is to ad­vance the field of AI in other ways, and only work on al­ign­ment once we reach some par­tic­u­lar ca­pa­bil­ities thresh­old. At that point, in all like­li­hood, fun­ders plan to co­or­di­nate to launch a ma­jor field-wide re­search effort on al­ign­ment.

eliezer: How, ex­actly, could they reach a con­clu­sion like that with­out study­ing the prob­lem in any visi­ble way? If the en­tire grant­mak­ing com­mu­nity was able to ar­rive at a con­sen­sus to that effect, then where are the pa­pers and analy­ses they used to reach their con­clu­sion? What are the ar­gu­ments? You sound like you’re talk­ing about a silent con­spir­acy of com­pe­tent grant­mak­ers at a hun­dred differ­ent or­ga­ni­za­tions, who have in some way col­lec­tively de­vel­oped or gained ac­cess to a liter­a­ture of strate­gic and tech­ni­cal re­search that Nick Bostrom and I have never heard about, es­tab­lish­ing that the pre­sent-day re­search prob­lems that look rele­vant and tractable aren’t so promis­ing, and that ca­pa­bil­ities will de­velop in a spe­cific known di­rec­tion at a par­tic­u­lar rate that lends it­self to late co­or­di­nated in­ter­ven­tion.

Are you say­ing that de­spite all the re­searchers in the field ca­su­ally dis­cussing self-im­prov­ing AI and Asi­mov Laws over coffee, there’s some hid­den clever rea­son why study­ing this prob­lem isn’t a good idea, which the grant­mak­ers all ar­rived at in uni­son with­out leav­ing a pa­per trail about their de­ci­sion-mak­ing pro­cess? I just… There are so many well-known and perfectly nor­mal dys­func­tions of grant­mak­ing ma­chin­ery and the aca­demic in­cen­tive struc­ture that al­low al­ign­ment to be a crit­i­cal prob­lem with­out there nec­es­sar­ily be­ing a huge aca­demic rush to work on it. In­stead you’re pos­tu­lat­ing a mas­sive global con­spir­acy of hid­den com­pe­tence grounded in se­cret analy­ses and ar­gu­ments. Why would you pos­si­bly go there?

maude: Because otherwise—

(The Stranger snaps his fingers again.)

stranger: Okay, Eliezer-2010, go ahead and answer. Why is Maude going there?

eliezer: Because… to prevent relatively unimpressive or unauthoritative-looking people from affiliating with important problems, from Maude’s perspective there can’t be knowably low-hanging research fruit. If there were knowably important problems that the grantmaking machinery and academic reward system had left untouched, then somebody like me could knowably be working on them. If there were a problem with the grantmakers, or a problem with academic incentives, at least of the kind that someone like me could identify, then it might be possible for someone unimportant like me to know that an important problem was not being worked on. The alleged state of academia, and indeed of the whole world, has to backward-chain to avoid there being low-hanging research fruit.

First Maude tried to argue that the problem is already well-covered by researchers in the field, as it would be in the Adequate World you described. When that position became difficult to defend, she switched to arguing that authoritative analysts have looked into the problem and collectively determined it’s a pseudo-problem. When that became difficult to defend, she switched to arguing that authoritative analysts have looked into the problem and collectively devised a better strategy involving delaying alignment research temporarily.

stranger: Very different hypotheses that share this property: they allow there to be something like an efficient market in high-value research, where individuals and groups that have high status in the standard academic system can’t end up visibly dropping the ball.

Perhaps Maude’s next proposal will be that top researchers have determined that the problem is easy. Perhaps there’s a hidden consensus that AGI is centuries away. In my experience, people like Maude can be boundlessly inventive. There’s always something.

eliezer: But why go to such lengths? No real economist would tell us to expect an efficient market here.

stranger: Sure, says Maude, the system isn’t perfect. But, she continues, neither are we perfect. All the grantmakers and tenure-granters are in an equivalent position to us, and doing their own part to actively try to compensate for any biases in the system they think they can see.

eliezer: But that’s visibly contradicted both by observation and by the economic theory of incentives.

stranger: Yes. But at the same time, it has to be assumed true. Because while experts can be wrong, we can also be wrong, right? Maybe we’re the ones with bad systemic incentives and only short-term rewards.

eliezer: But being inside a system with badly designed incentives is not the same as being unable to discern the truth of… oh, never mind.

This has all been very educational, Masked Stranger. Thanks.

stranger: Thanks for what, Eliezer? Showing you a problem isn’t much of a service if there’s nothing you can do to fix it. You’re no better off than you were in the original timeline.

eliezer: It still feels better to have some idea of what’s going on.

stranger: That, too, is a trap, as we’re both aware. If you need an elaborate theory to justify seeing the obvious, it will only become more elaborate and distracting as time goes on and you try harder and harder to reassure yourself. It’s much better to just take things at face value, without needing a huge argument to do so. If you must ignore someone’s advice, it’s better not to make up big elaborate reasons why you’re licensed to ignore it; that makes it easier to change your mind and take the advice later, if you happen to feel like it.

eliezer: True. Then why are you even saying these things to me?

stranger: I’m not. You never were the one to whom I was speaking, this whole time. That is the last lesson: that I never did say these things to myself.

(The Stranger turns upon his own heel three times, and was never there.)