Transcription of Eliezer’s January 2010 video Q&A

Spurred by discussion of whether Luke’s Q&A session should be on video or text-only, I volunteered to transcribe Eliezer’s Q&A videos from January 2010. I finished last night, much earlier than my estimate, mostly because I felt motivated to finish it and spent more time on it than my very conservative estimate of 30 minutes a day (my estimate of the word count was pretty close: about 16,000). I have posted a link to this post as a comment in the original thread here, if you would like to upvote that.

Some advice for transcribing videos: I downloaded the .wmv videos, which allowed me to use VLC’s global hotkeys for pause and short skips backward and forward (Ctrl+Space and Ctrl+Shift+Left/Right arrow); these were much more convenient than any other method I tried.

Edited out: repetition of the question, “um/uh”, “you know,” false starts.

Punctuation, capitalization, structure, etc., may not be entirely consistent.

Keep in mind the opinions expressed here are those of Eliezer circa January 2010.

1. What is your information diet like? Do you control it deliberately (do you have a method; is it, er, intelligently designed), or do you just let it happen naturally?

By that I mean things like: Do you have a reading schedule (x number of hours daily, etc)? Do you follow the news, or try to avoid information with a short shelf-life? Do you frequently stop yourself from doing things that you enjoy (f.ex reading certain magazines, books, watching films, etc) to focus on what is more important? etc.


It’s not very planned, most of the time; in other words, Hacker News, Reddit, Marginal Revolution, other random stuff found on the internet. In order to learn something, I usually have to set aside blocks of time and blocks of effort and just focus on specifically reading something. It’s only sort of popular-level books that I can put on a restroom shelf and get read that way. In order to learn actually useful information I generally find that I have to set aside blocks of time or run across a pot of gold, and you’re about as likely to get a pot of gold from Hacker News as anywhere else, really. So not very controlled.



2. Your “Bookshelf” page is 10 years old (and contains a warning sign saying it is obsolete): http://yudkowsky.net/obsolete/bookshelf.html

Could you tell us about some of the books and papers that you’ve been reading lately? I’m particularly interested in books that you’ve read since 1999 that you would consider to be of the highest quality and/or importance (fiction or not).


I guess I’m a bit ashamed of how little I’ve been reading whole books and how much I’ve been reading small bite-sized pieces on the internet recently. Right now I’m reading Predictably Irrational, which is a popular book by Dan Ariely about biases; it’s pretty good, sort of like a sequence of Less Wrong posts. I’ve recently finished reading Good and Real by Gary Drescher, which is something I kept picking up and putting down. It’s very Less Wrongian, it’s master-level reductionism, and the degree of overlap was incredible enough that I would read something and say ‘OK, I should write this up on my own before I read how Drescher wrote it, so that you can get sort of independent views of it and see how they compare.’


Let’s see, other things I’ve read recently. I’ve fallen into the black hole of Fanfiction.net; well, actually, ‘fallen into a black hole’ is probably too extreme. It’s got a lot of reading and the reading’s broken up into nice block-sized chapters, and I’ve yet to exhaust the recommendations of the good stuff, but probably not all that much reading there, relatively speaking.


I guess it really has been quite a while since I picked up a good old-fashioned book and said ‘Wow, what an amazing book.’ My memory is just returning the best hits of the last 10 years instead of the best hits of the last six months or anything like that. If we expand it out to the best hits of the last 10 years, then Artificial Intelligence: A Modern Approach by Russell and Norvig is a really wonderful artificial intelligence textbook. It was on reading through that that I sort of got the epiphany that artificial intelligence really has made a lot more progress than people give it credit for; it’s just not really well organized, so you need someone with good taste to go through and tell you what’s been done before you recognize what has been done.


There was a book on statistical inference, I’m trying to remember the exact title, it’s by Hastie and Tibshirani… Elements of Statistical Learning, that was it. Elements of Statistical Learning was when I realized that the top people really do understand their subject: the people who wrote Elements of Statistical Learning really understand statistics. At the same time you read through it and say ‘Gosh, by comparison with these people, the average statistician, to say nothing of the average scientist who’s just using statistics, doesn’t really understand statistics at all.’


Let’s see, other really great… Yeah my memory just doesn’t really associate all that well I’m afraid, it doesn’t sort of snap back and cough up a list of the best things I’ve read recently. This would probably be something better for me to answer in text than in video I’m afraid.



3. What is a typical EY workday like? How many hours/day on average are devoted to FAI research, and how many to other things, and what are the other major activities that you devote your time to?


I’m not really sure I have anything I could call a ‘typical’ workday. Akrasia, weakness of will, that has always been what I consider to be my Great Bugaboo, and I still do feel guilty about the amount of rest time and downtime that I require to get work done; and even so I sometimes suspect that I’m taking too little downtime relative to work time, just because on those occasions when something or other prevents me from getting work done for a couple of days, I come back and I’m suddenly much more productive. In general, I feel like I’m stupid with respect to organizing my work day, that sort of problem; it used to feel to me like it was chaotic and unpredictable, but I now recognize that when something looks chaotic and unpredictable, that means that you are stupid with respect to that domain.


So it’ll probably look like: when I manage to get a work session in, the work session will be a couple of hours. When I run into a difficult problem I’ll sometimes stop and go off and read things on the internet for a few minutes or a lot of minutes, until I can come back and solve the problem, or my brain is rested enough to go to the more tiring high levels of abstraction where I can actually understand what it is that’s been blocking me and move on. That’s for writing, which I’ve been doing a lot of lately.


A typical workday when I’m actually working on Friendly AI with Marcello will look like: we get together and sit down and open up a notebook and stare at our notebooks and throw ideas back and forth, and sometimes sit in silence and think about things, write things down; I’ll propose things, Marcello will point out flaws in them or vice versa; we sort of reach the end of a line of thought, go blank, stop and stare at each other and try to think of another line of thought; keep that up for two to three hours, break for lunch, keep it up for another two to three hours, and then break for a day. I could spend the off day just recovering, or reading math if possible, or otherwise just recovering. Marcello doesn’t need as much recovery time, but I also suspect that Marcello, because he’s still sort of relatively inexperienced, isn’t quite confronting the most difficult parts of the problem as directly.


So, taking a one-day-on, one-day-off schedule with respect to Friendly AI, I actually don’t feel guilty about it at all, because it really is apparent that I just cannot work two days in a row on this problem and be productive. It’s just really obvious, and so instead of the usual cycle of ‘Am I working enough? Could I be working harder?’ and feeling guilty about it, it’s just obvious that in that case, after I get a solid day’s work in, I have to take a solid day off.


Let’s see, any other sorts of working cycles? Back when I was doing the Overcoming Bias/Less Wrong arc at one post per day, I would sometimes get more than one post per day in and that’s how I’d occasionally get a day off, other times a post would take more than one day. I find that I am usually relatively less productive in the morning; a lot of advice says ‘as soon as you get up in the morning, sit down, start working, get things done’; that’s never quite worked out for me, and of course that could just be because I’m doing it wrong, but even so I find that I tend to be more productive later in the day.


Let’s see, other info… Oh yes, at one point I tried to set up my computer to have a separate login without any of the usual distractions, and that caused my productivity to drop down because it meant that when I needed to take some time off, instead of browsing around the internet and then going right back to working, I’d actually separated work and so it was harder to switch back and forth between them both, so that was something that seemed like it was a really good idea that ought to work in theory, setting aside this sort of separate space with no distractions to work, and that failed.


And right now I’m working sort of on the preliminaries for the book, The Art of Rationality being the working title. I haven’t started writing the book yet; I’m still sort of trying to understand what it is that I’ve previously written on Less Wrong and Overcoming Bias, and organize it using FreeMind, which is open-source mind-mapping software. It’s really something I wish I’d known existed and started using back when the whole Overcoming Bias/Less Wrong thing started; I think it might have been a help.


So right now I’m just still sort of trying to understand what did I actually say, what’s the point, how do the points relate to each other, and thereby organizing the skeleton of the book, rather than writing it just yet. And the reason I’m doing it that way is that when it comes to writing things like books, where I don’t push out a post every day, I tend to be very slow, unacceptably slow even; one method of solving that was ‘write a post every day’, and this time I’m seeing if I can, by planning everything out sufficiently thoroughly in advance and structuring it sufficiently thoroughly in advance, get it done at a reasonable clip.



4. Could you please tell us a little about your brain? For example, what is your IQ, at what age did you learn calculus, do you use cognitive enhancing drugs or brain fitness programs, are you Neurotypical and why didn’t you attend school?


So the question is ‘please tell us a little about your brain.’ What’s your IQ? Tested as 143; that would have been back when I was… 12? 13? Not really sure exactly. I tend to interpret that as ‘this is about as high as the IQ test measures’ rather than ‘you are three standard deviations above the mean.’ I’ve scored higher than that on(?) other standardized tests; the largest I’ve actually seen written down was 99.9998th percentile, but that was not really all that well standardized, because I was taking the test and being scored as though for the grade above mine, so it was being scored by grade rather than by age. So I don’t know whether that means that people who didn’t advance through grades tend to get the highest scores and so I was competing well against people who were older than me, or whether the really smart people all advanced further through the grades and so the proper competition doesn’t really get sorted out; but in any case that’s the highest percentile I’ve seen written down.


‘At what age did I learn calculus?’ Well, it would have been before 15; probably 13 would be my guess. I’ll also note just how stunned I am at how poorly calculus is taught.

Do I use cognitive enhancing drugs or brain fitness programs? No. I’ve always been very reluctant to try tampering with the neurochemistry of my brain because I just don’t seem to react to things typically; as a kid I was given Ritalin and Prozac, and neither of those seemed to help at all, and the Prozac in particular seemed to blur everything out, and you just instinctively(?)… eugh.


One of the questions over here is ‘are you neurotypical?’ And my sort of instinctive reaction to that is ‘Hah!’ And for that reason I’m reluctant to tamper with things. Similarly with the brain fitness programs: I don’t really know which of those work and which don’t. I’m sort of waiting for other people in the Less Wrong community to experiment with that sort of thing and come back and tell the rest of us what works, and if there’s any consensus among them, I might join the crowd.


‘Why didn’t you attend school?’ Well, I attended grade school, but when I got out of grade school it was pretty clear that I just couldn’t handle the system; I don’t really know how else to put it. Part of that might have been that at the same time that I hit puberty, my brain just sort of… I don’t really know how to describe it. Depression would be one word for it; sort of ‘spontaneous massive will failure’ might be another way to put it. It’s not that I was getting more pessimistic or anything, just that my will sort of failed and I couldn’t get stuff done. It was sort of a long process to drag myself out of that, and you could probably make a pretty good case that I’m still there, I just handle it a lot better? Not even really sure quite what I did right; as I said in an answer to a previous question, this is something I’ve been struggling with for a while, and part of having a poor grasp on something is that even when you do something right you don’t understand afterwards quite what it is that you did right.


So… ‘tell us about your brain.’ I get the impression that it’s got a different balance of abilities; like, some areas got some extra neurons and other areas got shortchanged. The hypothesis has occurred to me lately that my writing is attracting other people with similar problems, because of the extent to which I’ve noticed, among the people who like my stuff, a similar tendency to be very reflective, very analytic, and to have mysterious trouble executing and getting things done and working at sustained regular output for long periods of time.


On the whole though, I never actually got around to getting an MRI scan; it’s probably a good thing to do one of these days, but this isn’t Japan where that sort of thing only costs 100 dollars, and getting it analyzed, you know they’re not just looking for some particular thing but just sort of looking at it and saying ‘Hmm, well what is this about your brain?’, well I’d have to find someone to do that too.


So, I’m not neurotypical… asking sort of ‘what else can you tell me about your brain’ is sort of ‘what else can you tell me about who you are apart from your thoughts’, and that’s a bit of a large question. I don’t try and whack on my brain because it doesn’t seem to react typically and I’m afraid of being in a sort of narrow local optimum where anything I do is going to knock it off the tip of the local peak, just because it works better than average and so that’s sort of what you would expect to find there.



5. During a panel discussion at the most recent Singularity Summit, Eliezer speculated that he might have ended up as a science fiction author, but then quickly added:

I have to remind myself that it’s not what’s the most fun to do, it’s not even what you have talent to do, it’s what you need to do that you ought to be doing.

Shortly thereafter, Peter Thiel expressed a wish that all the people currently working on string theory would shift their attention to AI or aging; no disagreement was heard from anyone present.

I would therefore like to ask Eliezer whether he in fact believes that the only two legitimate occupations for an intelligent person in our current world are (1) working directly on Singularity-related issues, and (2) making as much money as possible on Wall Street in order to donate all but minimal living expenses to SIAI/Methuselah/whatever.

How much of existing art and science would he have been willing to sacrifice so that those who created it could instead have been working on Friendly AI? If it be replied that the work of, say, Newton or Darwin was essential in getting us to our current perspective wherein we have a hope of intelligently tackling this problem, might the same not hold true in yet unknown ways for string theorists? And what of Michelangelo, Beethoven, and indeed science fiction? Aren’t we allowed to have similar fun today? For a living, even?


So, first, why restrict it to intelligent people in today’s world? Why not everyone? And second… the reply to the essential intent of the question is yes, with a number of little details added. So for example, if you’re making money on Wall Street, I’m not sure you should be donating all but minimal living expenses, because that may or may not be sustainable for you. And in particular, if you’re, say, making 500,000 dollars a year and you’re keeping 50,000 dollars of that per year, which is totally not going to work in New York, probably, then it’s probably more effective to double your living expenses to 100,000 dollars per year and have the amount donated to the Singularity Institute go from 450,000 to 400,000, when you consider how much more likely that makes it that more people follow in your footsteps. That number is totally not realistic, and not even close to the percentage of income donated versus spent on living expenses for present people working on Wall Street who are donors to the Singularity Institute. So considering that at present no one seems willing to do that, I wouldn’t even be asking that, but I would be asking for more people to make as much money as possible, if they’re the sorts of people who can make a lot of money, and donate a substantial fraction, never mind all but minimal living expenses, to the Singularity Institute.


Comparative advantage is what money symbolizes; each of us able to specialize in doing what we do best, get a lot of experience doing it, and trade off with other people specialized at what they’re doing best with attendant economies of scale and large fixed capital installations as well, that’s what money symbolizes, sort of in idealistic reality, as it were; that’s what money would mean to someone who could look at human civilization and see what it was really doing. On the other hand, what money symbolizes emotionally in practice, is that it imposes market norms, instead of social norms. If you sort of look at how cooperative people are, they can actually get a lot less cooperative once you offer to pay them a dollar, because that means that instead of cooperating because it’s a social norm, they’re now accepting a dollar, and a dollar puts it in the realm of market norms, and they become much less altruistic.


So it’s sort of a sad fact about how things are set up that people look at the Singularity Institute and think ‘Isn’t there some way for me to donate something other than money?’ partially for the obvious reason and partially because their altruism isn’t really emotionally set up to integrate properly with their market norms. For me, money is reified time, reified labor. To me it seems that if you work for an hour on something and then donate the money, that’s more or less equivalent to donating the money (time?), or should be, logically. We have very large bodies of experimental literature showing that the difference between even a dollar bill versus a token that’s going to be exchanged for a dollar bill at the end of the experiment can be very large, just because that token isn’t money. So there’s nothing dirty about money, and there’s nothing dirty about trying to make money so that you can donate it to a charitable cause; the question is ‘can you get your emotions to line up with reality in this case?’


Part of the question was sort of like ‘What of Michelangelo, Beethoven, and indeed science fiction? Aren’t we allowed to have similar fun today? For a living even?’


This is crunch time. This is crunch time for the entire human species. This is the hour before the final exam; we are trying to get as much studying done as possible, and it may be that you can’t make yourself feel that, for a decade, or 30 years on end, or however long this crunch time lasts. But again, the reality is one thing, and the emotions are another. So it may be that you can’t make yourself feel that this is crunch time for more than an hour at a time, or something along those lines. But relative to the broad sweep of human history, this is crunch time; and it’s crunch time not just for us, it’s crunch time for the intergalactic civilization whose existence depends on us. I think that if you’re actually just going to sort of confront it, rationally, full-on, then you can’t really justify trading off any part of that intergalactic civilization for any intrinsic thing that you could get nowadays; and at the same time it’s also true that there are very few people who can live like that, and I’m not one of them myself, because trying to live like that would even rule out things like ordinary altruism. I hold open doors for little old ladies, because I find that I can’t live only as an altruist in theory; I need to commit sort of actual up-front deeds of altruism, or I stop working properly.


So having seen that intergalactic civilization depends on us, in one sense all you can really do is try not to think about that; and in another sense, though, if you spend your whole life creating art to inspire people to fight global warming, you’re taking that ‘forgetting about intergalactic civilization’ thing much too far. If you look over our present civilization, part of that sort of economic thinking that you’ve got to master as a rationalist is learning to think on the margins. On the margins, does our civilization need more art and less work on the Singularity? I don’t think so. I think that the amount of effort that our civilization invests in defending itself against existential risks, and, to be blunt, Friendly AI in particular, is ludicrously low. Now, if it became the sort of pop-fad cause and people were investing billions of dollars into it, all that money would go off a cliff and probably produce anti-science instead of science, because very few people are capable of working on a problem where they don’t find out immediately whether or not they were wrong; it would just instantaneously go wrong and generate a lot of noise from people of high prestige who would just drown out the voices of sanity. So it wouldn’t be a nice thing if our civilization started devoting billions of dollars to Friendly AI research, because our civilization is not set up to do that sanely. But at the same time, the Singularity Institute exists, and the Singularity Institute, now that Michael Vassar is running it, should be able to scale usefully; that includes actually being able to do interesting things with more money, now that Michael Vassar’s the president.


To say ‘No, on the margin, what human civilization, at this present time, needs is not to put more money into the Singularity Institute, but rather for me to do this thing that I happen to find fun’; not ‘I’m doing this and I’m going to professionally specialize in it and become good at it and sort of trade hours of doing this thing that I’m very good at for hours that go into the Singularity Institute via the medium of money,’ but rather ‘no, this thing that I happen to find fun and interesting is actually what our civilization needs most right now, not Friendly AI’; that’s not defensible. And, you know, these are all sort of dangerous things to think about, possibly, but I think if you sort of look at that face-on, up-front, take it and stare at it, there’s no possible way the numbers could work out that way.


It might be helpful to visualize a Friendly Singularity in which the kid who was one year old at the time is now 15 years old and still has something like a 15-year-old human psychology, and they’re asking you: ‘So here’s this grand, dramatic moment in history, not human history but history, on which the whole future of the intergalactic civilization that we now know we will build hinged; it hinged on this one moment, and you knew that was going to happen. What were you doing?’ And you say, ‘Well, I was creating art to inspire people to fight global warming.’ The kid says, ‘What’s global warming?’


That’s what you get for not even taking into account at all the whole ‘crunch time, fate of the world depends on it, squeaking through by a hair if we do it at all, already played into a very poor position in terms of how much work has been done and how much work we need to do, relative to the amount of work that needs to be done to destroy the world as opposed to saving it; how long we could have been working on this previously and how much trouble it’s been to still get started.’ When this is all over, it’s going to be difficult to explain to that kid what in the hell the human species was thinking. It’s not going to be a baroque tale. It’s going to be a tale of sheer insanity. And you don’t want to be explaining yourself to that kid afterward as part of the insanity, rather than as part of the sort of small core of ‘realizing what’s going on and actually doing something about it’ that got it done.



6. I know at one point you believed in staying celibate, and currently your main page mentions you are in a relationship. What is your current take on relationships, romance, and sex, how did your views develop, and how important are those things to you? (I’d love to know as much personal detail as you are comfortable sharing.)


This is not a topic on which I consider myself an expert, and so it shouldn’t be shocking to hear that I don’t have incredibly complicated and original theories about these issues. Let’s see, is there anything else to say about that… It’s asking, ‘at one point I believed in staying celibate and currently your main page mentions you are in a relationship.’ So, it’s not that I believed in staying celibate as a matter of principle, but that I didn’t know where I could find a girl who would put up with me and the life that I intended to lead, and said as much; and then one woman, Erin, read the page I’d put up to explain why I didn’t think any girl would put up with me and my life and said essentially ‘Pick me! Pick me!’, and it was getting pretty difficult to keep up with the celibate lifestyle by then, so I said ‘Ok!’ And that’s how we got together. And if that sounds a bit odd to you, or like, ‘What!? What do you mean...?’ then… that’s why you’re not my girlfriend.


I really do think that in the end I’m not an expert; that might be as much as there is to say.



7. What’s your advice for Less Wrong readers who want to help save the human race?


Find whatever you’re best at; if that thing that you’re best at is inventing new math[s] of artificial intelligence, then come work for the Singularity Institute. If the thing that you’re best at is investment banking, then work for Wall Street and transfer as much money as your mind and will permit to the Singularity Institute, where [it] will be used by other people. And for a number of sort of intermediate cases: if you’re familiar with all the issues of AI and all the issues of rationality, and you can write papers at a reasonable clip, and you’re willing to work for a not overwhelmingly high salary, then the Singularity Institute is, as I understand it, hoping to make a sort of push toward getting some things published in academia. I’m not going to be in charge of that; Michael Vassar and Anna Salamon would be in charge of that side of things. There’s an internship program whereby we provide you with room and board and you drop by for a month or whatever and see whether or not this is work you can do and how good you are at doing it.


Aside from that, though, I think that saving the human species eventually comes down to, metaphorically speaking, nine people and a brain in a box in a basement, and everything else feeds into that. Publishing papers in academia feeds into either attracting attention that gets funding, or attracting people who read about the topic, not necessarily reading the papers directly even, but just sort of raising the profile of the issues, so that intelligent people wondering what they can do with their lives think artificial intelligence instead of string theory. Hopefully not too many of them are thinking that, because that would just generate noise, but the very most intelligent people… string theory is a marginal waste of the most intelligent people. Artificial intelligence and Friendly Artificial Intelligence, sort of developing precise, precision-grade theories of artificial intelligence that you could actually use to build a Friendly AI instead of blowing up the world; the need for one more genius there is much greater than the need for one more genius in string theory. Most of us can’t work on that problem directly. I, in a sense, have been lucky enough not to have to confront a lot of the hard issues here, because of being lucky enough to be able to work on the problem directly, which simplifies my choice of careers.


For everyone else, I’ll just sort of repeat what I said in an earlier video about comparative advantage, professional specialization, doing what we’re best at and practicing a lot; everyone doing that and trading with each other is the essence of economics, and the symbol of this is money. It’s completely respectable to work hours doing what you’re best at, and then transfer the sort of expected utilons that society assigns to that to the Singularity Institute, where it can pay someone else to work on it, such that it’s an efficient trade, because the total amount of labor and effectiveness that they put into it, that you can purchase, is more than you could do by working an equivalent number of hours on the problem yourself. And as long as that’s the case, the economically rational thing to do is going to be to do what you’re best at and trade those hours to someone else, and let them do it. And there should probably be fewer people, one expects, working on the problem directly, full time; stuff just does not get done if you’re not working on it full time, that’s what I’ve discovered, anyway; I can’t even do more than one thing at a time. And that’s the way grown-ups do it, essentially; that’s the way a grown-up economy does it.



8. Autodidacticism

Eliezer, first congratulations for having the intelligence and courage to voluntarily drop out of school at age 12! Was it hard to convince your parents to let you do it? AFAIK you are mostly self-taught. How did you accomplish this? Who guided you, did you have any tutor/mentor? Or did you just read/learn what was interesting and kept going for more, one field of knowledge opening pathways to the next one, etc...?

EDIT: Of course I would be interested in the details, like what books did you read when, and what further interests did they spark, etc… Tell us a little story. ;)


Well, amazingly enough, I’ve discovered the true, secret, amazing formula for teaching yourself and… I lie, I just winged it. Yeah, I just read whatever interested me until age 15 or 16 thereabouts, which is when I started to discover the Singularity, as opposed to the background low-grade Transhumanism that I’d been engaged with up until that point; started thinking that cognitive technologies, creating smarter-than-human-level intelligence, was the place to be, and initially thought that neural engineering was going to be the sort of leading, critical path of that. Studied a bit of neuroscience, and didn’t get into that too far before I started thinking that artificial intelligence was going to be the route; studied computer programming, studied a bit of business-type stuff because at one point I thought I’d do a startup, something I’m very glad I didn’t end up doing, in order to get the money to do the AI thing, and I’m very glad that I didn’t go that route, and I won’t even say that the knowledge has served me all that well instead; it’s just not my comparative advantage.


At some point I sort of woke up and smelled the Bayesian coffee and started studying probability theory and decision theory and statistics and that sort of thing, but really I haven’t had an opportunity to study anywhere near as much as I need to know. And part of that I won’t apologize for, because a lot of sort of fact memorization is more showing off than because you’re going to use that fact every single day; part of that I will apologize for, because I feel that I don’t know enough to get the job done, and that when I’m done writing the book I’m just going to have to take some more time off and just study some of the sort of math and mathematical technique that I expect to need in order to get this done. I come across as very intelligent, but a surprisingly small amount of that relies on me knowing lots of facts, or at least that’s the way it feels to me. So I come across as very intelligent, but that’s because I’m good at winging it, might be one way to put it. The road of the autodidact, I feel… I used to think that anyone could just go ahead and do it, and that the only reason to go to college was for the reputational ‘now people can hire you’ aspect, which sadly is very important in today’s world. Since then I’ve come to realize both that college is less valuable and less important than I used to think, and also that autodidacticism might be a lot harder for the average person than I thought, because the average person is less similar to myself than my sort of intuitions would have it.


‘How do you become an autodidact?’ The question you would ask before that would be ‘what am I going to do, and is it something that’s going to rely on me having memorized lots of standard knowledge and worked out lots of standard homework problems, or is it going to be something else?’, because if you’re heading for a job where you’re going to want to memorize lots of the same standardized facts as people around you, then autodidacticism might not be the best way to go. If you’re going to be a computer programmer, on the other hand, then [you’re going] into a field where every day is a new adventure, and most jobs in computer programming will not require you to know the Nth detail of computer science, and even if they did, the fact that this is math means you might even have a better chance of learning it out of a book; and above all it’s a field where people have some notion that you’re allowed to teach yourself; if you’re good, other people can see it by looking at your code, and so there’s sort of a tradition of being willing to hire people who don’t have a Master’s.


So I guess I can’t really give all that much advice about how to be a successful autodidact in terms of… studying hard, doing the same sort of thing you’d be doing in college, only managing to do it on your own because you’re that self-disciplined, because that is completely not the route I took. I would rather advise you to think very hard about what it is you’re going to be doing, whether or not anyone will let you do it if you don’t have the official credential, and to what degree the road you’re going down is going to depend on the sort of learning that you have found that you can get done on your own.



9. Is your pursuit of a theory of FAI similar to, say, Hutter’s AIXI, which is intractable in practice but offers an interesting intuition pump for the implementers of AGI systems? Or do you intend on arriving at the actual blueprints for constructing such systems? I’m still not 100% certain of your goals at SIAI.


Definitely an actual blueprint; but on the way to an actual blueprint, you probably have to, as an intermediate step, construct intractable theories that tell you what you’re trying to do, and enable you to understand what’s going on when you’re trying to do something. If you want a precise, practical AI, you don’t get there by starting with an imprecise, impractical AI and going to a precise, practical AI. You start with a precise, impractical AI and go to a precise, practical AI. I probably should write that down somewhere else because it’s extremely important, and as(?) various people will try to dispute it, and at the same time it hopefully ought to be fairly obvious if you’re not motivated to arrive at a particular answer there. You don’t just run out and construct something imprecise because, yeah, sure, you’ll get some experimental observations out of that, but what are your experimental observations telling you? And one might say something along the lines of ‘well, I won’t know that until I see it,’ and I suppose that has been known to happen a certain number of times in history; just inventing the math has also happened a certain number of times in history.


We already have a very large body of experimental observations of various forms of imprecise AIs, both the domain specific types we have now, and the sort of imprecise AI constituted by human beings, and we already have a large body of experimental data, and eyeballing it… well, I’m not going to say it doesn’t help, but on the other hand, we already have this data and now there is this sort of math step in which we understand what exactly is going on; and then the further step of translating the math back into reality. It is the goal of the Singularity Institute to build a Friendly AI. That’s how the world gets saved, someone has to do it. A lot of people tend to think that this is going to require, like, a country’s worth of computing power or something like that, but that’s because the problem seems very difficult because they don’t understand it, so they imagine throwing something at it that seems very large and powerful and gives this big impression of force, which might be a country-size computing grid, or it might be a Manhattan Project where some computer scientists… but size matters not, as Yoda says.


What matters is understanding, and if the understanding is widespread enough, then someone is going to grab the understanding and use it to throw together the much simpler AI that does destroy the world, the one that’s built to much lower standards. So the model is: ‘yes, you need the understanding, the understanding has to be concentrated within a group of people small enough that there is not one defector in the group who goes off and destroys the world, and then those people have to build an AI.’ If you condition on the world having been saved, and look back within history, I expect that that is what happened in the majority of cases where a world anything like this one gets saved; and working back from there, they will have needed a precise theory, because otherwise they’re doomed. You can make mistakes and pull yourself up, even if you think you have a precise theory, but if you don’t have a precise theory then you’re completely doomed, or if you don’t think you have a precise theory then you’re completely doomed.


And working back from there, you probably find that there were people spending a lot of time doing math based on the experimental results that other people had sort of blundered out into the dark and gathered because it’s a lot easier to blunder out into the dark; more people can do it, lots more people have done it; it’s the math part that’s really difficult. So I expect that if you look further back in time, you see a small group of people who had honed their ability to understand things to a very high pitch, and then were working primarily on doing math and relying on either experimental data that other people had gathered by accident, or doing experiments where they have a very clear idea why they’re doing the experiment and what different results will tell them.



10. What was the story purpose and/or creative history behind the legalization and apparent general acceptance of non-consensual sex in the human society from Three Worlds Collide?


The notion that non-consensual sex is not illegal and appears to be socially accepted might seem a bit out of place in the story, as if it had been grafted on. This is correct. It was grafted on from a different story in which, for example, theft is not so much legal, because they don’t have what you would call a strong, centralized government, but rather something you pull off by being clever, rather than a horrible crime; but of course, you would never steal a book. I have yet to publish a really good story set in this world; most of them I haven’t finished, and the one I have finished has other story problems. But if you were to see the story set in this world, then you would see that it develops out of a much more organic thing than, say… dueling, theft, non-consensual sex; all of these things are governed by tradition rather than by law, and they certainly aren’t prohibited outright.


So why did I pick up that one aspect from that story and put it into Three Worlds Collide? Well, partially it was because I wanted that backpoint to introduce a culture clash between their future and our past, and that’s what came to mind, more or less; it was more something to test out to see what sort of reaction it got, to see if I could get away with putting it into this other story. Because one can’t use theft; Three Worlds Collide’s society actually does run on private property. One can’t use dueling; their medical technology isn’t advanced enough to make that trivial. But you can use non-consensual sex and try to explain sort of what happens in a society in which people are less afraid, and not afraid of the same things. They’re stronger than we are in some senses, they don’t need as much protection, the consequences aren’t the same consequences that we know, and the people there sort of generally have a higher grade of ethics and are less likely to abuse things. That’s what made that particular culture-clash feature a convenient thing to pick up from one story and graft onto another; but ultimately it was a graft, and any feelings of ‘why is that there?’ that you have might make a bit more sense if you saw the other story, if I can ever repair the flaws in it, or manage to successfully complete and publish a story set in that world that actually puts the world on display.



11. If you were to disappear (freak meteorite accident), what would the impact on FAI research be?

Do you know other people who could continue your research, or that are showing similar potential and working on the same problems? Or would you estimate that it would be a significant setback for the field (possibly because it is a very small field to begin with)?


Marcello Herreshoff is the main person whom I’ve worked with on this, and Marcello doesn’t yet seem to be at the point where he could replace me, although he’s young, so he could easily develop further in coming years and take over as the lead, or even say, ‘Aha! Now I’ve got it! No more need for Eliezer Yudkowsky.’ That sort of thing would be very nice if it happened, but it’s not the sort of thing I would rely on.


So if I got hit by a meteor right now, what would happen is that Michael Vassar would take over responsibility for seeing the planet through to safety, and say ‘Yeah I’m personally just going to get this done, not going to rely on anyone else to do it for me, this is my problem, I have to handle it.’ And Marcello Herreshoff would be the one who would be tasked with recognizing another Eliezer Yudkowsky if one showed up and could take over the project, but at present I don’t know of any other person who could do that, or I’d be working with them. There’s not really much of a motive in a project like this one to have the project split into pieces; whoever can do work on it is likely to work on it together.



12. Your approach to AI seems to involve solving every issue perfectly (or very close to perfection). Do you see any future for more approximate, rough and ready approaches, or are these dangerous?


More approximate, rough and ready approaches might produce interesting data that math-theorist types can learn something from, even though the people who did it didn’t have that in mind. The thing is, though, there’s already a lot of people running out and doing that, and real failures at AI, or even approximate successes at AI, result in many fewer sudden thunderbolts of enlightenment about the structure of intelligence than the people who are busily producing ad hoc AI programs would like to think, because that’s easier to do and you can get a paper out of it and you get respect out of it and prestige and so on. So it’s a lot harder for that sort of work to result in sudden thunderbolts of enlightenment about the structure of intelligence than the people doing it would like to think, because thinking otherwise gives them an additional justification for doing the work. The basic answer to the question is ‘no’, or at least I don’t see a future for Singularity Institute funding, going as marginal effort, into sort of rough and ready ‘forages’ like that. It’s been done already. If we had more computer power and our AIs were more sophisticated, then the level of exploration that we’re doing right now would not be a good thing; as it is, it’s probably not a very dangerous thing, because the AIs are weak, more or less. It’s not something you would ever do with an AI that was powerful enough to be dangerous. If you know what it is that you want to learn by running a program, you may go ahead and run it; if you’re just foraging out at random, well, other people are doing that, and even then they probably won’t understand what their answers mean until you on your end, the sort of math-structure-of-intelligence type people, understand what it means. And mostly the results of an awful lot of work in domain-specific AIs tell us that we don’t understand something, and this can often be surprisingly easy to figure out, simply by querying your brain without being overconfident.


So, I think that at this point, what’s needed is math-structure-of-intelligence type understanding, and not just any math, not just ‘Ooh, I’m going to make a bunch of Greek symbols and now I can publish a paper and everyone will be impressed by how hard it is to understand,’ but sort of very specific math, the sort that results in thunderbolts of enlightenment; the usual example I hold up is the Bayesian network causality insight as depicted in Judea Pearl’s Probabilistic Reasoning in Intelligent Systems and (his later book on causality?). So if you sort of look at the total number of papers that have been written with neat Greek symbols and things that are mathematically hard to understand, and compare that to those Judea Pearl books I mentioned, though one should always mention this is the culmination of a lot of work not just by Judea Pearl; that will give you a notion of just how specific the math has to be.
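(For readers who haven’t met the reference: the kind of ‘specific math’ being pointed at is, roughly, the Bayesian-network idea that a joint distribution factors into local conditional distributions over a directed graph. The formula below is standard textbook notation added here for context, not something stated in the video.)

$$P(x_1, \dots, x_n) = \prod_{i=1}^{n} P\left(x_i \mid \mathrm{pa}(x_i)\right)$$

where pa(x_i) denotes the parents of x_i in the graph.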


In terms of solving every issue perfectly or very close to perfection, there are different kinds of perfection. As long as I know that any proof is valid, I might not know how long it takes to do a proof: if there’s something that does proofs, then I may not know how long the algorithm takes to produce a proof, but I may know that anything it claims is a proof is definitely a proof; so there are different kinds of perfection and types of precision. But basically, yeah, if you want to build a recursively self-improving AI, have it go through a billion sequential self-modifications, become vastly smarter than you, and not die, you’ve got to work to a pretty precise standard.
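(A toy sketch of the asymmetry described above, added for illustration and not taken from the Q&A: integer factorization stands in for producing a proof, and multiplying the factors back together stands in for checking one. The check is cheap and its verdict is trustworthy even though the search may take an unknown amount of time.)

from math import isqrt, prod

def check_factorization(n, factors):
    # Cheap and reliable: if this returns True, the claimed factorization is valid.
    return all(f > 1 for f in factors) and prod(factors) == n

def find_factorization(n):
    # Trial division: always correct, but its runtime grows badly with the size of n.
    factors, d = [], 2
    while d <= isqrt(n):
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

n = 2**31 - 1                          # a prime, so trial division scans all the way to sqrt(n)
fs = find_factorization(n)             # effort not known in advance
print(fs, check_factorization(n, fs))  # the check itself is near-instant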



13. How young can children start being trained as rationalists? And what would the core syllabus / training regimen look like?


I am not an expert in the education of young children. One has these various ideas that one has written up on Less Wrong, and one could try to distill those ideas, popularize them, illustrate them through simpler and simpler stories, and so take these ideas and push them down to a lower level. But in terms of sort of training basic thought skills, training children to be self-aware, to be reflective, getting them into the habit of reading and storing up lots of pieces of information, trying to get them more interested in being fair to both sides of an argument, the virtues of honest curiosity over rationalization, not in the way that I do it, by sort of telling people and trying to lay out stories and parables that illustrate it and things like that, but if there’s some other way to do it with children, I’m not sure that my grasp of this concept of teaching rationality extends to before the young adult level. I believe that we had some sort of thread on Less Wrong about this, sort of recommended reading for young rationalists, I can’t quite remember.


Oh, but one thing that does strike me as being fairly important is that if this ever starts to happen on a larger scale, and individual parents are teaching individual children, the number one thing we want to do is test out different approaches and see which ones work, experimentally.



14. Could you elaborate a bit on your “infinite set atheism”? How do you feel about the set of natural numbers? What about its power set? What about that thing’s power set, etc?

From the other direction, why aren’t you an ultrafinitist?


The question is ‘can you elaborate on your infinite set atheism’, that’s where I say ‘I don’t believe in infinite sets because I’ve never seen one.’


So first of all, my infinite set atheism is a bit tongue-in-cheek. I mean, I’ve seen a whole lot of natural numbers, and I’ve seen that times tend to have successor times, and in my experience, at least, time doesn’t return to its starting point; as I understand current cosmology, the universe is due to keep on expanding, and not return to its starting point. So it’s entirely possible that I’m faced with certain elements that have successors, where if the successors of two elements are the same then the two elements are the same, and in which there’s no cycle. So in that sense I might be forced to recognize the empirical existence of every member of what certainly looks like an infinite set. As for the question of whether this collection of infinitely many finite things constitutes an infinite thing that exists, that’s an interesting metaphysical one; or it would be, if we didn’t have the fact that even though by looking at time we can see that it looks like infinite things ought to exist, nonetheless, we’ve never encountered an infinite thing for certain, in person. We’ve never encountered a physical process that performs a supertask. If you look more at physics, you find that actually matters are even worse than this. We’ve got real numbers down there, or at least if you postulate that it’s something other than real numbers underlying physics then you have to postulate something that looks continuous but isn’t continuous; and in this way, by Occam’s Razor, one might very easily suspect that the appearance of continuity arises from actual continuity. So that we have, say, an amplitude distribution, a neighborhood in configuration space, and the amplitude[s that] flow in configuration space are continuous; instead of having a discrete time with a discrete successor, we actually have a flow of time, so when you write the rules of causality, it’s not possible to write the rules of causality the way we write them for a Turing machine; you have to write the rules of causality as differential equations.
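(A schematic way to write the contrast being drawn, with symbols chosen purely for illustration rather than taken from the video: a Turing-machine-style rule updates a state in discrete ticks, while continuous physics specifies a rate of change at every instant.)

$$x_{t+1} = f(x_t) \qquad \textrm{versus} \qquad \frac{dx}{dt} = f(x(t))$$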


So these are the two main cases in which the universe is defying my infinite set atheism. The universe is handing me what looks like an infinite collection of things, namely times; the universe is handing me things that exist and are causes, and the simplest explanation would have them being described by continuous differential equations, not by discrete ticks. So that’s the main sense in which my infinite set atheism is challenged by the universe’s actual presentation to me of things that look infinite. Aside from this, however, if you start trying to hand me paradoxes that are being produced by just assuming that you have an infinite thing in hand as an accomplished fact, an infinite thing of the sort where you can’t just present to me a physical example of it, you’re just assuming that that infinity exists, and then you’re generating paradoxes from it; well, we do have these nice mathematical rules for reasoning about infinities, but, rather than putting the blame on the person for having violated these elaborate mathematical rules that we developed to reason about infinities, I’m even more likely to cluck my tongue and say ‘But what good is it?’ Now it may be a tongue-in-cheek tongue cluck… I’m trying to figure out how to put this into words… The map is supposed to correspond to the territory; if you can’t have infinities in your map, because your neurons fire discretely and you only have a finite number of neurons in your head, then what makes you think that you can make them correspond to infinities in the territory, especially if you’ve never actually seen that sort of infinity? And so the sort of math of the higher infinities, I tend to view as works of imaginative literature, like Lord of the Rings; they may be pretty, in the same way that Tolkien’s Middle-earth is pretty, but they don’t correspond to anything real until proven otherwise.



15. Why do you have a strong interest in anime, and how has it affected your thinking?


‘Well, as a matter of sheer, cold calculation I decided that...’


It’s anime! (laughs)


How has it affected my thinking? I suppose that you could view it as a continuation of reading dribs and drabs of westernized eastern philosophy from Gödel, Escher, Bach or Raymond Smullyan; concepts like ‘Tsuyoku Naritai’, ‘I want to become stronger’, are things that being exposed to the alternative eastern culture found in anime might have helped me to develop. But on the whole… it’s anime! There’s not some kind of elaborate calculation behind it, and I can’t quite say that when I’m encountering a daily problem, I think to myself ‘How would Light Yagami solve this?’ If the point of studying a programming language is to change the way you think, then I’m not sure that studying anime has changed the way I think all that much.



16. What are your current techniques for balancing thinking and meta-thinking?

For example, trying to solve your current problem, versus trying to improve your problem-solving capabilities.


I tend to focus on thinking, and it’s only when my thinking gets stuck or I run into a particular problem that I will resort to meta-thinking, unless it’s a particular meta skill that I already have, in which case I’ll just execute it. For example, the meta skill of trying to focus on the original problem. In one sense, a whole chunk of Less Wrong is more or less my meta-thinking skills.


So I guess on reflection (ironic look), I would say that there’s a lot of routine meta-thinking that I already know how to do, and that I do without really thinking of it as meta-thinking. On the other hand, original meta-thinking, which is the time-consuming part, is something I tend to resort to only when my current meta-thinking skills have broken down. And that’s probably a reasonably exceptional circumstance, even though it’s something of a comparative advantage of mine and so I expect to do a bit more of it than average. Even so, when I’m trying to work on an object-level problem at any given point, I’m probably not doing original meta-level questioning about how to execute these meta-level skills.


If I bog down in writing something, I may execute my sort of existing meta-level skill of ‘try to step back and look at this from a more abstract level,’ and if that fails, then I may have to sort of think about what kinds of abstract levels you can view this problem on, similar problems as opposed to tasks, and in that sense go into original meta-level thinking mode. But one of those meta-level skills, I would say, is the notion that your meta-level problem comes from an object-level problem, and you’re supposed to keep one eye on the object-level problem the whole time you’re working on the meta-level.



17. Could you give an up-to-date estimate of how soon non-Friendly general AI might be developed? With confidence intervals, and by type of originator (research, military, industry, unplanned evolution from non-general AI...)


We’re talking about this very odd sector of program space and programs that self-modify and wander around that space and sort of amble into a pot of gold that enables them to keep going and… I have no idea...


There are all sorts of different ways that it could happen. I don’t know which of them are plausible or implausible, or how hard or difficult they are relative to modern hardware or computer science. I have no idea what the odds are; I do know they aren’t getting any better as time goes on, that is, the probability of Unfriendly AI is increasing over time. So if you were actually to make some kind of graph, then you’d see the probability rising over time as the odds got worse, and then the graph would slope down again as you entered into regions where it was more likely than not that Unfriendly AI had actually occurred before that; the slope would actually fall off faster as you went forward in time because the amount of probability mass has been drained away by Unfriendly AI having already happened.
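

To make the shape of that graph concrete, here is a standard survival-analysis sketch; the notation is only illustrative and is not from the talk. Write h(t) for the hazard rate, the chance of Unfriendly AI arriving at time t given that it has not arrived yet, and S(t) for the probability that it has not arrived by time t. Then the probability density of it arriving at exactly time t is

$$ f(t) = h(t)\,S(t), \qquad S(t) = \exp\!\left(-\int_0^t h(u)\,du\right). $$

Even if the hazard h(t) keeps rising, f(t) rises at first and then falls, because the surviving probability mass S(t) is eventually drained away by the possibility of Unfriendly AI having already happened, which gives the rising-then-falling curve described above.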


‘By type of originator’ or something, I might have more luck answering. I would put academic research at the top of it, because academic research can actually try blue-sky things. Or… OK, first commercial, that wasn’t quite on the list, as in people doing startup-ish things, hedge funds, people trying to improve the internal AI systems that they’re using for something, or build weird new AIs to serve commercial needs; those are the people most likely to build AI ‘stews’(?) Then after that, academic research, because in academia you have a chance of trying blue-sky things. And then military, because they can hire smart people and give the smart people lots of computing power and have a sense of always trying to be on the edge of things. Then industry, if that’s supposed to mean car factories and so on, because… that actually strikes me as pretty unlikely; they’re just going to be trying to automate ordinary processes, that sort of thing, and it’s generally unwise to sort of push the bounds of theoretical limits while you’re trying to do that sort of thing; you can count Google as industry, but that’s the sort of thing I had in mind when I was talking about commercial. Unplanned evolution from non-general AI [is] not really all that likely to happen. These things aren’t magic. If something can happen by itself spontaneously, it’s going to happen before that because humans are pushing on it.


As for confidence intervals… doing that just feels like pulling numbers out of thin air. I’m kind of reluctant to do it because of the extent to which I feel that, even to the extent that my brain has a grasp on this sort of thing, by making up probabilities and making up times I’m not even translating the knowledge that I do have into reality, so much as pulling things out of thin air. And if you were to sort of ask ‘what sort of attitude do your revealed actions indicate?’ then I would say that my revealed actions don’t indicate that I expect to die tomorrow of Unfriendly AI, and my revealed actions don’t indicate that we can safely take until 2050. And that’s not even a probability estimate, that’s sort of looking at what I’m doing and trying to figure out what my brain thinks the probabilities are.



18. What progress have you made on FAI in the last five years and in the last year?


The last five years would take us back to the end of 2004, which is fairly close to the beginning of my Bayesian enlightenment, so the whole ‘coming to grips with the Bayesian structure of it all’, a lot of that would fall into the last five years. And if you were to ask me… the development of Timeless Decision Theory would be in the last five years. I’m trying to think if there’s anything else I can say about that. Getting a lot of clarification of what the problems were.


In the last year, I managed to get in a decent season of work with Marcello after I stopped regular posting to OBLW over the summer, before I started writing the book. That, there’s not much I can say about; there was something I suspected was going to be a problem and we tried to either solve the problem or at least nail down exactly what the problem was, and I think that we did a fairly good job of the latter; we now have a nice, precise, formal explanation of what it is we want to do and why we can’t do it in the obvious way; we came up with sort of one hack for getting around it that’s a hack and doesn’t have all the properties that we want a real solution to have.


So, step one, figure out what the problem is, step two, understand the problem, and step three, solve the problem. Some degree of progress on step two but not finished with it, and we didn’t get to step three, but that’s not overwhelmingly discouraging. Most of the real progress that has been made when we sit down and actually work on the problem [is] things I’d rather not talk about, and the main exception to that is Timeless Decision Theory, which has been posted to Less Wrong.



19. How do you characterize the success of your attempt to create rationalists?


It’s a bit of an ambiguous question, and certainly an ongoing project. Recently, for example, I was in a room with a group of people with a problem of what Robin Hanson called a far-type and what I would call the type where it’s difficult because you don’t get immediate feedback when you say something stupid, and it really was clear who in that room was an ‘X-rationalist’ or ‘neo-rationalist’, or ‘Lesswrongian’ or ‘Lessiath’ and who was not. The main distinction was that the sort of non-X-rationalists were charging straight off and were trying to propose complicated policy solutions right off the bat, and the rationalists were actually holding off, trying to understand the problem, break it down into pieces, analyze the pieces modularly, and just that one distinction was huge; it was the difference between ‘these are the people who can make progress on the problem’ and ‘these are the people who can’t make progress on the problem’. So in that sense, once you hand these deep Lesswrongian types a difficult problem, the distinction between them and someone who has merely had a bunch of successful life experiences and so on is really obvious.


There’s a number of other interpretations that can be attached to the question, but I don’t really know what it means aside from that, even though it was voted up by 17 people.



20. What is the probability that this is the ultimate base layer of reality?


I would answer by saying, hold on, this is going to take me a while to calculate… um.… uh… um… 42 percent! (sarcastic)



21. Who was the most interesting would-be FAI solver you encountered?


Most people do not spontaneously try to solve the FAI problem. If they’re spontaneously doing something, they try to solve the AI problem. If we’re talking about sort of ‘who’s made interesting progress on FAI problems without being a Singularity Institute Eliezer supervised person,’ then I would have to say: Wei Dai.



22. If Omega materialized and told you Robin was correct and you are wrong, what do you do for the next week? The next decade?


If Robin’s correct, then we’re on a more or less inevitable path to competing intelligences driving existence down to subsistence level, but this does not result in the loss of everything we regard as valuable, and there seem to be some values disputes here, or things that are cleverly disguised as values disputes while probably not being very much like values disputes at all.


I’m going to take the liberty of reinterpreting this question as ‘Omega materializes and tells you “You’re Wrong”’, rather than telling me Robin in particular is right; for one thing that’s a bit more probable. And, Omega materializes and tells me ‘Friendly AI is important but you can make no contribution to that problem, in fact everything you’ve done so far is worse than nothing.’ So, publish a retraction… Ordinarily I would say that the next most important thing after this is to go into talking about rationality, but then if Omega tells me that I’ve actually managed to do worse than nothing on Friendly AI, that of course has to change my opinion of how good I am at rationality or teaching others rationality, unless this is a sort of counterfactual surgery type of thing where it doesn’t affect my opinion of how useful I can be by teaching people rationality, and mostly the thing I’d be doing if Friendly AI weren’t an option would probably be pushing human rationality. And if that were blocked out of existence, I’d probably end up as a computer programmer whose hobby was writing science fiction.


I guess I have enough difficulty visualizing what it means for Robin to be correct or how the human species isn’t just plain screwed in that situation that I could wish that Omega had materialized and either told me someone else was correct or given me a bit more detail about what I was wrong about exactly; I mean I can’t be wrong about everything; I think that two plus two equals four.



23. In one of the discussions surrounding the AI-box experiments, you said that you would be unwilling to use a hypothetical fully general argument/​”mind hack” to cause people to support SIAI. You’ve also repeatedly said that the friendly AI problem is a “save the world” level issue. Can you explain the first statement in more depth? It seems to me that if anything really falls into “win by any means necessary” mode, saving the world is it.


Ethics are not pure personal disadvantages that you take on for others’ benefit. Ethics are not just penalties to the current problem you’re working on that have sort of side benefits for other things. When I first started working on the Singularity problem, I was making non-reductionist type mistakes about Friendly AI, even though I thought of myself as a rationalist at the time. And so I didn’t quite realize that Friendly AI was going to be a problem, and I wanted to sort of go all-out on any sort of AI, as quickly as possible; and actually, later on when I realized that Friendly AI was an issue, the sort of sneers that I now get about not writing code or being a Luddite were correctly anticipated by my past self, with the result that my past self sort of kept on advocating the kind of ‘rush ahead and write code’ strategy, rather than face the sneers, instead of going back and replanning everything from scratch once my past self realized that Friendly AI was going to be an issue, since all the plans before then had been made on the basis that it wasn’t.


So if I’d lied to get people to do what I had wanted them to do at that point, to just get AI done, to rush ahead and write code rather than doing theory, I would have been stuck with those lies; being honest as I actually was, I could just come back and say ‘OK, here’s what I said, I’m honestly mistaken, here’s the new information that I encountered that caused me to change my mind, here’s the new strategy that we need to use after taking this new information into account’. If you lie, there’s not necessarily any equally easy way to retract your lies. … So for example, one sort of lie that I used to hear advocated back in the old days was by other people working on AI projects and it was something along the lines of ‘AI is going to be safe and harmless and will inevitably cure cancer, but not really take over the world or anything’, and if you tell that lie in order to get people to work on your AI project, then it’s going to be a bit more difficult to explain to them why you suddenly have to back off and do math and work on Friendly AI. Now, if I were an expert liar, I’d probably be able to figure out some sort of way to reconfigure those lies as well; I mean, I don’t really know what an expert liar could accomplish by way of lying because I don’t have enough practice.


So I guess in that sense it’s not all that defensible… a defense of ethics, because I haven’t really tried it both ways, but it does seem to me, looking over my history, that my ethics have played a pretty large role in protecting me from myself. Another example is [that] the whole reason that I originally pursued the thought of Friendly AI long enough to realize that it was important was not so much out of a personal desire as out of a sense that this was something I owed to the other people who were funding the project, Brian Atkins in particular back then, and that if there’s a possibility from their perspective that you can do better by Friendly AI, or that a fully honest account would cause them to go off and fund someone who was more concerned about Friendly AI, then I owed it to them to make sure that they didn’t suffer by helping me. And so it was a sense of ethical responsibility for others at that time which caused me to focus in on this sort of small, discordant note, ‘Well, this minor possibility that doesn’t look all that important, follow it long enough to get somewhere’. So maybe there are people who could defend the Earth by any means necessary and recruit other people to defend the Earth by any means necessary, and nonetheless have that all end well and happily ever after, rather than bursting into flames and getting arrested for murder and robbing banks and being international outlaws, or more likely just arrested, and attracting the ‘wrong’ sort of people who are trying to go along with this, and people being corrupted by power and deciding that ‘no, the world really would be a better place with them in charge’ and etcetera etcetera etcetera.


I think if you sort of survey the Everett branches of the Many Worlds and look at the ones with successful Singularities, or pardon me, look at the conditional probability of successful Singularities, my guess is that the worlds that start out with programming teams who are trying to play it ethical, versus the worlds that start off with programming teams that figure ‘well no, this is a planetary-class problem, we should throw away all our ethics and do whatever is necessary to get it done’, that the former worlds will have a higher proportion of happy outcomes. I could be mistaken, but if it does take a sort of master ruthless type person to do it optimally, then I am not that person, and that is not my comparative advantage, and I am not really all that willing to work with them either; so I suppose if there was any way you could end up with two Friendly AI projects, then I suppose the possibility of there actually being sort of completely ruthless programmers versus ethical programmers, they might both have good intentions and separate into two groups that refuse to work with one another, but I’m sort of skeptical about these alleged completely ruthless altruists. Has there ever, in history, been a completely ruthless altruist with that turning out well? Knut Haukelid, if I’m pronouncing his name correctly, the guy who blew up a civilian ferry in order to sink the deuterium that the Nazis needed for their nuclear weapons program; you know you never see that in a Hollywood movie; so you killed civilians and did it to end the Nazi nuclear weapons program. So that’s about the best historical example I can think of of a ruthless altruist where it turned out well, and I’m not really sure that’s quite enough to persuade me to give up my ethics.



24. What criteria do you use to decide upon the class of algorithms /​ computations /​ chemicals /​ physical operations that you consider “conscious” in the sense of “having experiences” that matter morally? I assume it includes many non-human animals (including wild animals)? Might it include insects? Is it weighted by some correlate of brain /​ hardware size? Might it include digital computers? Lego Turing machines? China brains? Reinforcement-learning algorithms? Simple Python scripts that I could run on my desktop? Molecule movements in the wall behind John Searle’s back that can be interpreted as running computations corresponding to conscious suffering? Rocks? How does it distinguish interpretations of numbers as signed vs. unsigned, or ones complement vs. twos complement? What physical details of the computations matter? Does it regard carbon differently from silicon?


This is something that I don’t know, and would like to know. What you’re really being asked is ‘what do you consider as people? Who you consider as people is a value. How can you not know what your own values are?’ Well, for one, it’s very easy to not know what your own values are. And for another thing, my judgement of what is a person, I do want to rely, if I can, on the notion of ‘what has… (hesitant) subjective experience’. For example, one reason that I’m not very concerned about my laptop’s feelings is because I’m fairly sure that whatever else is going on in there, it’s not ‘feeling’ it. And this is really something I wish I knew more about.


And the number one reason I wish I knew more about it is because the most accurate possible model of a person is probably a person; not necessarily the same person, but if you had an Unfriendly AI and it was looking at a person and using huge amounts of computing power, or just very efficient computing power, to model that person and predict the next event as accurately and as precisely as it could, then its model of that person might not be the same person, but it would probably be a person in its own right. So this is one of the problems that I don’t even try talking to other AI researchers about, because it’s so much more difficult than what they signed up to handle that I just assume they don’t want to hear about it; I’ve confronted them with much less difficult-sounding problems like this and they just make stuff up or run away, and don’t say ‘Hmm, I better solve this problem before I go on with my plans to… destroy the world,’ or whatever it is they think they’re doing.


But in terms of danger points, here are three example danger points. First, if you have an AI with a pleasure-pain reinforcement architecture and any sort of reflectivity, the ability to sort of learn about its own thoughts and so on, then I might consider that a possible danger point, because then, who knows, it might be able to hurt and be aware that it was hurting; in particular because pleasure-pain reinforcement architecture is something that I think of as an evolutionary legacy architecture rather than an incredibly brilliant way to do things, so that region of scenario space is easy to steer clear of.


Second, if you had an AI with terminal values over how it was treated and its role in surrounding social networks; like, an AI that cared, not as a means to an end but just in its own right, about the fact that you are treating it as a non-person; even if you don’t know whether or not it was feeling anything about that, you might still be treading into territory where, just for the sake of safety, it might be worth steering out of, in terms of what we would consider as a person.


Oh, and the third consideration is that if your AI spontaneously starts talking about the mystery of subjective experience and/or the solved problem of subjective experience, and a sense of its own existence, and whether or not it seems mysterious to the AI; it could be lying, but you are now in probable trouble; you have wandered out of the safe zone. And conversely, as long as we go on building AIs that don’t have pleasure, pain, or internal reflectivity, or anything resembling social emotions or social terminal values, and that exhibit no signs at all of spontaneously talking about a sense of their own existence, we’re hopefully still safe. I mean ultimately, if you push these things far enough without knowing what you’re doing, sooner or later you’re going to open the black box that contains the black swan surprise from hell. But at least as long as you sort of steer clear of those three land mines, and things just haven’t gone further and further and further, it gives you a way of looking at a pocket calculator and saying that the pocket calculator is probably safe.



25. I admit to being curious about various biographical matters. So for example I might ask: What are your relations like with your parents and the rest of your family? Are you the only one to have given up religion?


As far as I know I’m the only one in my family to give up religion except for one grand-uncle. I still talk to my parents, still phone calls and so on, amicable relations and so on. They’re Modern Orthodox Jews, and mom’s a psychiatrist and dad’s a physicist, so… ‘Escher painting’ minds; thinking about some things but always avoiding the real weak points of their beliefs and developing more and more complicated rationalizations. I tried confronting them directly about it a couple of times, and each time I have been increasingly surprised at the sheer depth of tangledness in there.


I might go on trying to confront them about it a bit, and it would be interesting to see what happens to them if I finish my rationality book and they read it. But certainly among the many things to resent religion for is the fact that I feel that it prevents me from having the sort of family relations that I would like; that I can’t talk with my parents about a number of things that I would like to talk with them about. The kind of closeness that I have with my fellow friends and rationalists is a kind of closeness that I can never have with them; even though they’re smart enough to learn the skills, they’re blocked off by this boulder of religion squatting in their minds. That may not be much to lay against religion, it’s not like I’m being burned at the stake, or even having my clitoris cut off, but it is one more wound to add to the list. And yeah, I resent it.


I guess even when I do meet with my parents and talk with my parents, the fact of their religion is never very far from my mind. It’s always there as the block, as a problem to be solved that dominates my attention, as something that prevented me from saying the things I want to say, and as the thing that’s going to kill them when they don’t sign up for cryonics. My parents may make it without cryonics, but all four of my grandparents are probably going to die, because of their religion. So even though they didn’t cut off all contact with me when I turned Atheist, I still feel like their religion has put a lot of distance between us.



26. Is there any published work in AI (whether or not directed towards Friendliness) that you consider does not immediately, fundamentally fail due to the various issues and fallacies you’ve written on over the course of LW? (E.g. meaningfully named Lisp symbols, hiddenly complex wishes, magical categories, anthropomorphism, etc.)

ETA: By AI I meant AGI.


There’s lots of work that’s regarded as plain old AI that does not immediately fail. There’s lots of work in plain old AI that succeeds spectacularly, and Judea Pearl is sort of like my favorite poster child there. But one could also note that the whole Bayesian branch of statistical inference can be regarded with some equanimity as part of AI. There’s the sort of Bayesian methods that are used in robotics as well, which is sort of surprisingly… how do I put it, it’s not theoretically distinct because it’s all Bayesian at heart, but in terms of the algorithms, it looks to me like there’s quite a bit of work that’s done in robotics that’s a separate branch of Bayesianism from the work done in statistical learning type stuff. That’s all well and good.


But if we’re asking about works that are sort of billing themselves as ‘I am Artificial General Intelligence’, then I would say that most of that does indeed fail immediately, and indeed I cannot think of a counterexample which fails to fail immediately, but that’s a sort of extreme selection effect, and it’s because if you’ve got a good partial solution, or a solution to a piece of the problem, and you’re an academic working in AI, and you’re anything like sane, you’re just going to bill it as plain old AI, and not take the reputational hit from AGI. The people who are bannering themselves around as AGI tend to be people who think they’ve solved the whole problem, and of course they’re mistaken. So to me it really seems like saying that all the things I’ve read on AGI immediately, fundamentally fail is not so much a critique of AI as a comment on what sort of work tends to bill itself as Artificial General Intelligence.



27. Do you feel lonely often? How bad (or important) is it?

(Above questions are a corollary of:) Do you feel that — as you improve your understanding of the world more and more — there are fewer and fewer people who understand you and with whom you can genuinely relate on a personal level?


That’s a bit hard to say exactly. I often feel isolated to some degree, but the fact of isolation is a bit different from the emotional reaction of loneliness. I suspect and put some probability to the suspicion that I’ve actually just been isolated for so long that I don’t have a state of social fulfillment to contrast it to, whereby I could feel lonely, or as it were, lonelier, or that I’m too isolated relative to my baseline or something like that. There’s also the degree to which I, personality-wise, don’t hold with trying to save the world in an Emo fashion...? And as I improve my understanding of the world more and more, I actually would not say that I felt any more isolated as I’ve come to understand the world better.


There’s some degree to which hanging out with cynics like Robin Hanson has caused me to feel that the world is even more insane than I started out thinking it was, but that’s more a function of realizing that the rest of the world is crazier than I thought rather than myself improving.


Writing Less Wrong has, I think, helped a good deal. I now feel a great deal less like I’m walking around with all of this stuff inside my head that causes most of my thoughts to be completely incomprehensible to anyone. Now my thoughts are merely completely incomprehensible to the vast majority of people, but there’s a sizable group out there who can understand up to, oh, I don’t know, like one third of my thoughts without a year’s worth of explanation, because I actually put in the year’s worth of explanation. And I’ve even attracted a few people whom I feel like I can relate to on a personal level, and Michael Vassar would be the poster child there.



28. Previously, you endorsed this position:

Never try to deceive yourself, or offer a reason to believe other than probable truth; because even if you come up with an amazing clever reason, it’s more likely that you’ve made a mistake than that you have a reasonable expectation of this being a net benefit in the long run.

One counterexample has been proposed a few times: holding false beliefs about oneself in order to increase the appearance of confidence, given that it’s difficult to directly manipulate all the subtle signals that indicate confidence to others.

What do you think about this kind of self-deception?


So… Yeah, ’cuz y’know people are always criticizing me on the grounds that I come across as too hesitant and not self confident enough. (sarcastic)


But to just sort of answer the broad thrust of the question; four legs good, two legs bad, self-honesty good, self-deception bad. You can’t sort of say ‘OK, now I’m going to execute a 180 degree turn from the entire life I’ve led up until this point and now, for the first time, I’m going to throw away all the systematic training I’ve put into noticing when I’m deceiving myself, finding the truth, noticing thoughts that are hidden away in the corner of my mind, and taking reflectivity on a serious, gut level, so that if I know I have no legitimate reason to believe something I will actually stop believing it because, by golly, when you have no legitimate reason to believe something, it’s usually wrong. I’m now going to throw that out the window; I’m going to deceive myself about something and I’m not going to realize it’s hopeless and I’m going to forget the fact that I tried to deceive myself.’ I don’t see any way that you can turn away from self-honesty and towards self-deception, once you’ve gone far enough down the path of self-honesty, without ‘A’ relinquishing The Way and losing your powers, and ‘B’, it doesn’t work anyway.


Most of the time, deceiving yourself is much harder than people think. But, because they don’t realize this, they can easily deceive themselves into believing that they’ve deceived themselves, and since they’re expecting a placebo effect, they get most of the benefits of the placebo effect. However, at some point, you become sufficiently skilled in reflection that this sort of thing does not confuse you anymore, and you actually realize that that’s what’s going on, and at that point, you’re just stuck with the truth. How sad. I’ll take it.



29. In the spirit of considering semi-abyssal plans, what happens if, say, next week you discover a genuine reduction of consciousness and it turns out that… There’s simply no way to construct the type of optimization process you want without it being conscious, even if very different from us?

ie, what if it turned out that The Law turned out to have the consequence of “to create a general mind is to create a conscious mind. No way around that”? Obviously that shifts the ethics a bit, but my question is basically if so, well… “now what?” what would have to be done differently, in what ways, etc?


Now, this question actually comes in two flavors. The difficult flavor is, you build this Friendly AI, and you realize there’s no way for it to model other people at the level of resolution that you need without every imagination that it has of another person being conscious. And so the first obvious question is ‘why aren’t my imaginations of other people conscious?’ and of course the obvious answer would be ‘they are!’ The models in your mind that you have of your friends are not your friends; they’re not identical with your friends, they’re not as complicated as the people you’re trying to model, so the person that you have in your imagination does not much resemble the person that you’re imagining; it doesn’t even much resemble the referent… like, I think Michael Vassar is a complicated person, but my model of him is simple, and then the person who that model is is not as complicated as my model says Michael Vassar is, etcetera, etcetera. But nonetheless, every time that I’ve modeled a person, and when I write my stories, the characters that I create are real people. They may not hurt as intensely as the people do in my stories, but they nonetheless hurt when I make bad things happen to them, and as you scale up to superintelligence the problem just gets worse and worse and the people get realer and realer.


What do I do if this turns out to be the law? Now, come to think of it, I haven’t much considered what I would do in that case; and I can probably justify that to you by pointing out the fact that if I actually knew that this was the case I would know a great number of things I do not currently know. But mostly I guess I would have to start working on sort of different Friendly AI designs so that the AI could model other people less, and still get something good done.


And as for the other flavor of the question, ‘Well, the AI can go ahead and model other people but it has to be conscious itself, and then it might experience empathically what it imagines conscious beings experiencing, the same way that I experience some degree of pain and shock, although not a correspondingly large amount of pain and shock, when I imagine one of my characters watching their home planet be destroyed’: in this case, one is now faced with the question of creating an AI such that it can, in the future, become a good person; to the extent that you regard it as having human rights, it hasn’t been set onto a trajectory that would lock it out of being a good person. And this would entail a number of complicated issues, but it’s not like you have to make a true good person right off the bat, you just have to avoid putting it into horrible pain, or making it so that it doesn’t want to be what we would think of as a humane person later on. … You might have to give it goals beyond the sort of thing I talk about in Coherent Extrapolated Volition, and at the same time, perhaps a sort of common sense understanding that it will later be a full citizen in society, but for now it can sort of help the rest of us save the world.



30. What single technique do you think is most useful for a smart, motivated person to improve their own rationality in the decisions they encounter in everyday life?


It depends on where that person has a deficit; so, the first thought that came to mind for that answer is ‘hold off on proposing solutions until you’ve analyzed the problem for a bit’, but on the other hand, if dealing with someone who’s given to extensive, deliberate rationalization, then the first thing I tell them is ‘stop doing that’. If I’m dealing with someone who’s ended up stuck in a hole because they now have this immense library of flaws to accuse other people of, so that no matter what is presented to them, they can find a flaw in that, and yet they don’t turn that ability, at full force, upon themselves, then the number one technique that they need is ‘avoid motivated skepticism’. If I’m dealing with someone who tends to be immensely driven by cognitive dissonance and rationalizing mistakes that they already made, then I might advise them on Cialdini’s time machine technique; ask yourself ‘would you do it differently if you could go back in time, in your heart of hearts’, or pretend that you have now been teleported into your situation spontaneously; some technique like that, say.

But these are all matters of ‘here’s a single flaw that the person has that is stopping them’. So if you move aside from that a bit and ask ‘what sort of positive counter intuitive technique you might use’, I might say ‘hold off on proposing solutions until you understand the problem. Well, the question was about everyday life, so, in everyday life, I guess I would still say that people’s intelligence might probably still be improved a bit if they sort of paused and looked at more facets of the situation before jumping to a policy solution; or it might be rationalization, cognitive dissonance, the tendency to just sort of reweave their whole life stories just to make it sound better and to justify their past mistake, that doing something to help tone that down a bit might be the most important thing they could do in their everyday lives. Or if you got someone who’s giving away their entire income to their church then they could do with a bit more reductionism in their lives, but my guess is that, in terms of everyday life, then either one of ‘holding off on proposing solutions until thinking about the problem’ or ‘against rationalization, against cognitive dissonance, against sour grapes, not reweaving your whole life story to make sure that you didn’t make any mistakes, to make sure that you’re always in the right and everyone else is in the wrong, etcetera, etcetera’, that one of those two would be the most important thing.