The Filan Cabinet Podcast with Oliver Habryka—Transcript

This is a transcript of The Filan Cabinet podcast, episode 6, with host Daniel Filan and guest Oliver Habryka. The transcript is by MondSemmel. The original LessWrong post is here.

In addition to each “section” having a timestamp, phrases/​sentences that were difficult to understand (and where the transcript is therefore most likely to contain errors) will also be marked with a timestamp, if you would like to listen to those parts yourself.

Introduction [0:00:00]

Daniel Filan: Hello everybody. In this episode, I’ll be speaking with Oliver Habryka. Oli is the head of the Lightcone Infrastructure team, the organization behind the forum LessWrong. He’s also done grantmaking on the Long-Term Future Fund. Before we start, you should know that I’ve received funding from the Long-Term Future Fund. Oli was not directly involved in those decisions to fund me, basically, but he was also not recused from those decisions either. It’s also the case that the Lightcone Infrastructure team runs offices in Berkeley that I am allowed to use for free.

With that out of the way, let’s begin the interview. Well, Oli, welcome to the Filan Cabinet.

Oliver Habryka: Thank you.

Daniel Filan: So I understand that you lead the Lightcone Infrastructure team, which is… I think the project that people are probably most familiar with is running LessWrong, the website.

Oliver Habryka: Correct.

On LessWrong

The History of LessWrong [0:00:46]

Daniel Filan: So, do you want to tell me a bit about, like, there was an old version of LessWrong and then there was a new version of LessWrong and I guess you were involved in that. Can you tell me the story of the rejuvenation of the website?

Oliver Habryka: Yeah. I mean, very rough story: around 2008, Eliezer Yudkowsky started posting on Overcoming Bias and wrote a whole bunch of series of essays. After a while, he had written so many essays that Robin Hanson, who was co-running the site with him, was like, “I don’t know, man, how about you get your own apartment.”

So Eliezer moved out, invited a lot of friends along, and founded lesswrong.com. That site grew over a few years, first starting out when Eliezer was writing the Sequences, which is one of the most widely known series of essays that he’s written, and then [he] started writing Harry Potter and the Methods of Rationality, which drove a lot more activity and finished around 2015.

After 2015, [at some point], there was that whole fun Roko’s Basilisk drama, which then caused Eliezer to have a bunch of kerfuffle with the moderators of LessWrong and a bunch of readers on it. And he was like, man, I don’t know whether I want to spend the rest of my life moderating an internet forum, which is also widely known as one of the worst jobs in the world. So he [kind of] stopped moderating it and stopped leading it in important ways and started posting less.

And then, I think, [LessWrong] entered a period around 2016, 2017, where it declined relatively quickly in activity and quality, in substantial part because of technological problems. There was one specific guy who got extremely annoyed at all of LessWrong, who created hundreds of fake accounts a day, downvoting all the people he didn’t like. And because the old forum was a fork of the Reddit codebase, this meant that there actually just weren’t any moderator tools to handle that at all. And people would spend many hours a day trying to undo votes, trying to figure out how to ban these new users, but with terrible technological support.

Daniel Filan: I’m kind of surprised that… You’d think that this would be a problem for Reddit.

Oliver Habryka: So, *chuckles*, this was indeed a huge problem for Reddit. And they fixed it, I think, in their codebase around 2018, 2019, but this was a fork from 2013. No, actually, it would have been earlier, 2011. And I think the last open source version of the Reddit codebase was released in 2012 or 2013, I don’t remember. But yeah, it had been fixed in the modern codebase, but the [LessWrong] codebase was super old.

Yeah. And then, I think [the website] declined. At some point, downvoting was disabled. And just generally, things were pretty sad.

I had just quit my job at the Center for Effective Altruism, which I was working at during college. And [a] number of past efforts to revive LessWrong during like 2015, 2016, 2017, had kind of fizzled out. And via a series of steps, Matthew Graves (Vaniver, [a] LessWrong user) ended up being declared benevolent dictator for life. Mostly because, you know, he was around, he was broadly trusted in the community. And then I met with him in 2017.

And I remember a long walk on New Year’s Eve where I was trying to figure out what to do with my life. And I was like, man, maybe I should revive LessWrong. I really liked LessWrong. And he was like, “Oh, did you know I’m benevolent dictator for life for LessWrong?” And I was like, “No, I didn’t know that.” And then I was like, “Can I revive LessWrong?” And he was like, “Yeah, sure.” He didn’t have any plans for it.

And then I started coding. I wasn’t very good at coding. But my sense was just, without fixing the technical problems, there basically wasn’t any way to fix [the website]. Because you still had all of these spammers, these people who would try pretty aggressively to take the site down.

And then, yeah, [I] got a bit better at code over a year, wrote a lot of stuff, wrote some quite terrifying things (*chuckles*) that are haunting me up to this day in terms of making the performance of the site bad and various other things, but ultimately got a basically functioning web forum up and running. And then [I] invited a lot of people back, I mean, invested many years of my life trying to make that thing take off.

And yeah, what could I say about the modern LessWrong? I mean, now it’s an organization, I’ve taken a bit of a step back from running it every day. Now Ruben Bloom [has been running] it on a daily level for about one and a half years, since I started focusing more on in-person community building for rationality, EA, and AI alignment things.

I don’t know, [I] could tell lots of stories about concrete things that happen on LessWrong 2.0. Also quite unsure about what the future will hold, I might focus on it more, I’ve been considering it substantially, again.

Daniel Filan: Okay. [So one thing], in some sense this is a side issue, but I’m kind of curious. So you mentioned moderator tools as one thing that needed changing. Was there [anything else]? Because, like, the new website looks kind of different than the old website.

Oliver Habryka: Totally.

Daniel Filan: Is it just moderator tools, or what else do you think was important to code up?

Oliver Habryka: One of the core things that I was always thinking about with LessWrong, and that was my kind of primary analysis of what went wrong with previous LessWrong revivals, was [kind of] an iterated, [the term] “prisoner’s dilemma” is overused, but a bit of an iterated prisoner’s dilemma or something where, like, people needed to have the trust on an ongoing basis that the maintainers and the people who run it will actually stick with it. And there’s a large amount of trust that the people need to have that, if they invest in a site and start writing content on it, that the maintainers and the people who run it actually will put the effort into making that content be shepherded well. And the people who want to shepherd it only want to do that if the maintainers actually...

And so, one of the key things that I was thinking about was trying to figure out how to guarantee reliability. This meant that I made a promise to a lot of the core contributors of the site when I started it, which was basically: I’m going to be making sure that LessWrong is healthy and keeps running for five years from the time I started. Which was a huge commitment—five years is a hugely long time. But my sense at the time was that that type of commitment is exactly the most important thing. Because the most common thing that I hear when I do user interviews with authors and commenters is that they don’t want to contribute because they expect the thing to decline in the future. So reliability was a huge part of that.

And then I also think, signaling that there was real investment here was definitely a good chunk of it. I think UI is important, and readability of the site is important. And I think I made a lot of improvements there to the site that I’m quite happy with. But I think a lot of it was also just a costly signal that somebody cares.

I don’t know how I feel about that in retrospect. But I think that was a huge effect, where I think people looked at the site, and when [they] looked at LessWrong 2.0, there was just a very concrete sense that I could see in user interviews that they were like, “Oh, this is a site that is being taken care of. This is a thing that people are paying attention to and that is being kept up well.” In a similar [sense] to how, I don’t know, a clean house sends the same kind of signal. I don’t really know. I think a lot of it was, they were like, wow, a lot of stuff is changing. And given that a lot of work is being put into this, the work itself is doing a lot of valuable signaling.

Daniel Filan: Okay. And this, like, five-year promise, if I count from 2017, 2018, is that about over now?

Oliver Habryka: Correct. I would consider that, roughly around this time, I’m technically, based on my past commitments, like, free to not keep LessWrong running. I have no interest in not keeping LessWrong running. And I might renew the promise. I’m currently kind of trying to rethink my next few years.

The Value of LessWrong [0:08:49]

Daniel Filan: Alright. So that’s a lot of investment you’ve put into this LessWrong website. What’s good about it?

Oliver Habryka: [Regarding the thing] that most motivates me, I feel a bit reminded of CFAR’s terrible slogan that it had for a while, Center for Applied Rationality, that at the time I was like, it was so dense, I couldn’t understand it, and it felt hilarious. Anna Salamon wrote it and it was on the CFAR website, I think maybe [it’s] still up. It was “rationality for the sake of existential risk for the sake of rationality”. It’s not a good slogan.

Daniel Filan: Wasn’t it “rationality for its own sake for the sake of existential risk”?

Oliver Habryka: Oh, yeah, sorry, it was “rationality for its own sake for the sake of existential risk”. Yeah. Ultimately, I think, okay, let’s ignore the complicated semantic structure in that sentence, which is trying to do all kinds of fancy things. But it has two primary nouns in it. It’s rationality and existential risk. And those are the two things that I think when I’m thinking about LessWrong, are ultimately the things that motivate me to work on it.

I’m like, man, it would be really shitty for humanity to die. I think there’s a decent chance that humanity goes extinct sometime this century. And when I’m thinking about, what are the best ways to fix that? I’m thinking a lot about artificial intelligence. I’m also thinking a lot about, like, a number of other broad-picture [topics], history, models about how humanity has developed, the Industrial Revolution, similar questions. And for those questions, LessWrong still strikes me as by far the best place to think about those questions. And that historically has done the most work in terms of an online community or even really any website [in] the world, to kind of think about, like, under which actual conditions might humanity go extinct? What are the risks of artificial intelligence? What can we do about it?

I think one of the reasons why it has been a better place than the rest of the world to discuss these things is because of its focus on rationality.

As I said earlier, the founding documents of LessWrong were the Sequences. And, I mean, they’re just really good. I really, like, they continue to be probably my favorite piece of writing. They’re not perfect. In particular, in many of the pieces, there’s a good chunk of them that just, like, didn’t survive the replication crisis because they relied on specific experiments that did not age particularly well. The Robbers Cave experiment is one of the examples. The Stanford prison experiment [is another].

But overall, when I’m thinking about, what is the document or the book or the reading, that I can give someone that allows someone to think about these really hard big-picture questions about whether humanity will go extinct: like, “What is artificial intelligence? How can we think about it? How could we control it? What is morality? What determines what is a good thing to do versus a bad thing to do?” I would think the Sequences continue to be one of the best starting points for this.

I don’t know. Lots of other stuff. I’ve over the years read lots of other useful things that helped me orient here. And then I think, kind of downstream of the Sequences and being able to take those as a prerequisite for at least many of the core contributors of the site, is a community that has developed an art of rationality that has helped people think substantially more clearly and better about a lot of important questions that I would really like to answer.

And also, it’s a big social network that by now has the ability to achieve, like, quite substantial things in the world.

I don’t know, I had this conversation, I think it was like three or four years ago, [with] Paul Christiano and Katja [Grace]; I had [those conversations] around the same time. [I think we were talking about] some domain, some approach to AI alignment being underinvested in and how we could get people to work on it. And I think Paul’s response at the time was like, I don’t know, if I want something to happen, one of the first things I do is I just go and write a blog post about the arguments for why it should happen. And then, surprisingly frequently, the problem does indeed get solved.

And that’s just pretty great. Indeed, I have this feeling that, like, it’s not always, it’s not perfect. Indeed, most of the time, probably nothing happens. But really, to a quite substantial degree, if I want something to happen in the world, and I think that something is important, I can go and write a blog post or write a long comment in the right place on LessWrong. And this does indeed allow many other people to come to the same belief as I do, and then allow them to take action on a bunch of important things.

And that’s very rare. Most of the internet does not work this well. [In most] of the internet, I would have to, I don’t know, make a very funny meme. I don’t really know how I would get action to happen or [how to] change beliefs in most of the rest of the internet. Definitely not usually via, like, writing a blog post that’s just pretty reasoning-transparent about why I believe something.

Daniel Filan: What are examples of blog posts that have been on LessWrong and then somebody did a thing?

Oliver Habryka: [TIMESTAMP 0:13:47] I mean, like, different alignment approaches, my sense is Paul at the time was thinking about a bunch of debate-adjacent stuff that I think he was working on around that time, that I then think was probably upstream of some of the work by Geoffrey Irving and some other people who then, like, got more interested in it.

A lot of it is, like, LessWrong is not particularly full of calls to action. It is rarely the case that somebody is like, “Man, X should really happen.” But more something like, “X is really important.” And it’s pretty clear that hundreds, probably even thousands of people all, like, have changed their careers and life decisions on the basis of what things are important. This includes people being like, well, AI alignment is really important and AI existential risk is really important, but also includes things like...

[LessWrong], for example, had a huge effect on my career choices early on, even independently of existential risk and similar things, because it just had a bunch of people being like, “Oh, being a software engineer actually is just a really good deal in terms of the mixture of having freedom to think for yourself and making a lot of money.” And just people having good analyses about which kinds of things are useful. And that was in the context of effective giving stuff, like people were trying to figure out what career is probably best if you just want to make money and then donate it effectively. And in the context of that, people were arguing about different careers and which ones were most efficient here, and software engineering just kept coming out [on] top. Including software engineering particularly in the US, which was actually at the time one of the top reasons why I wanted to move to the US. Because I was like, well, I could make 50k or 70k as a software engineer in Germany, or 150k, 250k in the US, relatively quickly after graduating.

Daniel Filan: Okay. So, if I think about the value of LessWrong as like, basically, having some rationality skills. How good do you think LessWrong is at that?

Oliver Habryka: I mean, on a cosmic scale, it sucks. (*chuckles*) [If] I imagine a society that actually tried really hard to figure out how to train people to think well, they probably would not build a thing… Probably a thing like LessWrong would be part of the thing that exists. But man, it would not do most of the heavy lifting.

It doesn’t include exercises and [currently doesn’t particularly have] an active community of practice of people who are trying to figure out how to get better at this. Chat text is [a] medium that has some ability to provide you with feedback. But seeing someone, spoken word, a bunch of other dimensions, just allow more feedback to actually help people train and get better here.

But I mean on a world scale, I’m just like, well, (*sighs*), I have different cynical perspectives, depending on the day. [When] I look at most of the world, Zvi has written a lot about this, I definitely get a sense [that] man, most of the world is doing the opposite of trying to train your rationality. And LessWrong isn’t perfect on this either, but, like, a lot of things, as Zvi would say, are Out to Get You.

In the sense of, [if you’re doing a degree like literature or whatever at university], there are many components of that culture that, when I talk to the people who are going through that, feel to me like they’re actively trying to draw attention away from relatively crucial dimensions of people’s life decisions. And many things in the world feel to me like they’re actively trying to make your decisions worse. Sometimes just because that’s an emergent phenomenon of large social systems and [it’s] just kind of annoying [to] have to deal with lots of different people’s objections. Sometimes it’s because actually specific individuals or specific institutions benefit from you being misinformed.

So in some sense, I’m like, what is the thing that LessWrong does, and [I think] it does a bit better, not like at a cosmic scale, not much better, but a bit better, at not actively making your thinking worse, while still providing you with a lot of information. There’s just [lots on] a lot of topics, and a lot [on] the topic of thinking itself. It feels to me that a lot of writing on there is genuinely trying to understand what good thinking looks like, as opposed to trying to explain to you what good thinking looks like with the bottom line already written: therefore you should support me, or therefore my guild or my tribe is the highest-status one. Not saying LessWrong [is] perfect. It also has a lot of that.

On my more optimistic days when I’m a bit more optimistic about humanity, there’s definitely some stuff where I’m like, man, it’s so incredibly crucial to how I think about the world now. And I remember how hard it felt at the beginning, before I encountered LessWrong and before it became a natural part of my cognition.

A crucial example here is just being comfortable and confident in using probabilities at a daily level. It seems like an extremely valuable tool to me. And I just couldn’t do it. I mean, I remember. I remember reading LessWrong and, in the early Bayes posts or whatever, being asked quite simple probability theory questions where I was like, “Fuck, man, how am I supposed to do this?” Eliezer had this fun story of the Beisutsukai where, as part of your onboarding, you would be asked a relatively straightforward probability theory question [combined with] some Asch conformity. And definitely at the time I had a feeling that, man, that’s extremely hard, I would totally get that wrong.

And now when I look at that story, I’m like, wow that was trivially easy, how could anyone not get that probability theory question right. Because indeed, over many years of engaging with the space, my mastery over probability has indeed just drastically increased. And it’s just extremely useful. Probabilities, assigning probabilities to stuff, being able to make bets on lots of different things… just on an object level, it’s very useful.

The same goes for [a] strong culture of estimation. I think the other skill that [I can explain with the shortest inferential distance], that’s been very valuable to me is just the habit of estimating quantities even when [they seem] on the face of it quite hard to estimate.

And that’s also just a culture that seems to be present, for example, in people doing physics degrees. It’s not a thing that’s unique to LessWrong, but [it’s] one of the best parts of the culture of doing a physics degree, and it’s also present in LessWrong.

Where I ask myself, many times a week I’m sitting there being like, man, “How many carpenters are there actually in the US?” Or, “How many carpenters are there in Berkeley?” I have to ask myself this question because these days I’m coordinating a bunch of construction around a large hotel in Berkeley. And [trying] to be like, okay, “What is the likelihood that we would find someone who’s available, what is the likelihood that we could find someone who’s skilled?” And making estimates of that type, independently of needing to rely on people who have experience in the field, and just doing it from first principles, frequently saves me like dozens of hours, and causes my decisions to be substantially better.

[A] concrete example of where I use estimates almost every day is trying to work with contractors who give me estimates for how long something should take or how expensive something should be.

And more than 50% of the time, despite these being people who have a bunch of experience in the field where they’re giving me an estimate, if I do an independent estimate, I arrive at a number easily a factor of two or three away, and then I go back to them, to the person who made the estimate, and [am] like, “Hey, I did an estimate. That estimate was [a] factor of two or three off.” And half of the time, they are like, “Oh, sure, that’s because you don’t understand this part.” But the other half of the time they look at my estimate, then they go silent, and then they’re like, “Oh, shit, I was wrong.” And are like, “Oh, yeah, your estimate is totally more correct.” In a way that has saved me tons of planning headaches and tons of money.
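As a rough sketch of the kind of first-principles estimate described here (the inputs below are illustrative placeholder assumptions, not figures from the conversation), a back-of-the-envelope Fermi estimate in Python might look like this:

```python
# A minimal Fermi-estimate sketch in the spirit of "How many carpenters are there in the US?"
# All inputs are illustrative placeholder assumptions, not figures from the conversation.

us_population = 330e6          # assumed US population
working_fraction = 0.5         # assumed fraction of the population in the workforce
construction_fraction = 0.05   # assumed fraction of workers in construction trades
carpenter_fraction = 0.2       # assumed fraction of construction workers who are carpenters

carpenters_estimate = (
    us_population * working_fraction * construction_fraction * carpenter_fraction
)

print(f"Rough estimate of carpenters in the US: {carpenters_estimate:,.0f}")
# With these placeholder inputs, the estimate comes out to roughly 1.7 million.
# The point is not the exact number, but whether an independent estimate lands
# within a factor of two or three of what a contractor or an official figure says.
```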

And then there’s lots of more subtle stuff, that’s harder to talk about. [The] Center for Applied Rationality has been [teaching a lot more] applied rationality techniques [I use] when I’m thinking about [getting] good at introspection, having a better relationship to internal conflict and a bunch of other stuff that’s just been on an object level very concretely useful, that I [don’t] think you can currently learn anywhere else in the world, and [that] just seems very helpful. My guess is other schools of thought might have similar techniques and similar ways to helpfully orient to your own mind. But at least my sense is that it’s also often much harder to vet them [regarding] to what degree those techniques actually end up being traps or having negative consequences in a bunch of different ways. But I’m not sure. [I] guess there are also other places you can learn lots of useful rationality techniques from.

Reputation Management and Narratives [0:23:15]

Daniel Filan: So let’s go back to what you think on your bad days. So you mentioned that you had this sense that lots of things in the world were, I don’t know, trying to distract you from things that are true or important. And that LessWrong did that somewhat less.

Oliver Habryka: Yeah.

Daniel Filan: Can you kind of flesh that out? What kinds of things are you thinking of?

Oliver Habryka: I mean, the central dimension that I would often think about here is reputation management. As an example, the medical profession, which, you know, generally has the primary job of helping you with your medical problems and trying to heal you of diseases and various other things, also, at the same time, seems to have a very strong norm of mutual reputation protection. Where, if you try to run a study trying to figure out which doctors in the hospital are better or worse than other doctors in the hospital, quite quickly, the hospital will close its ranks and be like, “Sorry, we cannot gather data on [which doctors are better than the other doctors in this hospital].” Because that would, like, threaten the reputation arrangement we have. This would introduce additional data that might cause some of us to be judged and some others of us to not be judged.

And my sense is the way that usually looks from the inside is an actual intentional blinding to performance metrics, in order to maintain a sense of social peace, and often because… A very common pattern here [is] something like: you have a status hierarchy within a community or a local institution like a hospital. And generally, that status hierarchy, because of the way it works, has the leadership of the status hierarchy be opposed to all changes to the status hierarchy. Because the current leadership is at the top of the status hierarchy, and so almost anything that we introduce into the system that involves changes to that hierarchy is a threat, and there isn’t much to be gained, [at least in] the zero-sum status conflict that is present.

And so my sense is, when you try to run these studies about comparative doctor performance, what happens is more that there’s an existing status hierarchy, and lots of people feel a sense of uneasiness and a sense of wanting to protect the status quo, and therefore they push back on gathering relevant data here. And from the inside this often looks like an aversion to trying to understand what are actually the things that cause different doctors to be better than other doctors. Which is crazy, because if you ask, like, what the primary job of a good medical institution and a good medical profession is, it would be figuring out what makes people better doctors or worse doctors. But [there are] all of the social dynamics that tend to be present in lots of different institutions that make it so that looking at relative performance [metrics] becomes a quite taboo topic and a topic that is quite scary.

So that’s one way [in which] I think many places try to actively… Many groups of people, when they try to orient and gather around a certain purpose, actually [have a harder time] or get blinded or in some sense get integrated into a hierarchy that makes it harder for them to look at a thing that they were originally interested in when joining the institution.

Yeah, I mean, that’s a good chunk of it. Like these kinds of dynamics, there’s lots of small pieces of dynamics that I could talk about that are present.

I do also think there’s more bigger-scale stuff. Ultimately, most of the world, when they say things, are playing a dance that has usually as its primary objective, like, increasing the reputation of themselves, or the reputation of their allies. And as a secondary objective, saying true things.

I think this is most obvious when you look at situations like what’s going on [at] major newspapers. We know, from like a bunch of internal documents, that the New York Times has been operating for the last two or three years on a, like, grand [narrative structure], where there’s a number of head editors who are like, “Over this quarter, over this current period, we want to write lots of articles, that, like, make this point. That have this vibe. That arrive at roughly this conclusion.” And then lots of editors are being given the task of, like, finding lots of articles to write that support this conclusion. And clearly that conclusion, [in some sense is what the head editors believe], but it’s also something that, when you look at the structure of it, tends to be [of a] very political nature. It’s like, “This quarter or this year, we want to, like, destroy the influence of Silicon Valley on, like, the modern media ecosystem” or something like this.

And then you try to write lots of articles that help with that political goal. And that’s just a very straightforward… There are people who are operating with a goal of trying to tell a very specific narrative. And that’s quite explicit in what I’ve heard in the New York Times, while of course not saying technically false things. But not saying technically false things really is a very small impediment to your ability to tell stories that support arbitrary conclusions.

Daniel Filan: For what [it’s worth], I do have this impression that the New York Times is more like this than other major newspapers.

Oliver Habryka: Yes. Yeah, which is particularly sad because I think it used to be less like this than other major newspapers. But yeah.

Changing Your Mind on the Internet [0:28:34]

Daniel Filan: And so, you think in the wider world, there’s a lot of talking for reputation management rather than truth. And there [are] a lot of, like, self-protecting status hierarchies. And you think this is less present, on LessWrong?

Oliver Habryka: A bit. Um, yeah. I think in some sense, it’s a bit nicer. [You] can still attack people, but there’s definitely a sense that I have, on LessWrong, that, like, most of the time… not most of the time, but a lot of the time at least, when somebody is writing a post, they are acting with a genuine intent to inform.

I mean, I remember there was this, I have this one specific memory. Uh, I think this was in 2014. I was just getting into this stuff. I was reading HPMOR. I was talking to my friends excitedly about a lot of the ideas that I encountered. [And one evening], I was reading LessWrong and I was in the comment section of some Sequences post. And I think somebody was talking about, like, something relevant to Baumol’s cost disease, like why are things getting more expensive or less expensive? Somebody was proposing a hypothesis. And then somebody was like, “I think clearly Baumol’s cost disease is because of Y.” Somebody was responding with, “That clearly is not true because of all of this evidence.” The other person was like, “No, that doesn’t really check out. I still think my original hypothesis [is correct].” Then the [second] person was like, “I don’t know, man, here’s some additional evidence.” And then… the first commenter responded and was just like, “Shit, you’re right. I was wrong.”

And I remember reading this. I wasn’t that old, like 15 or 16 or something. But I remember reading this and just writing to one of my friends on Skype at the time in ALL CAPS, being like, “What the fuck man, I just saw someone change their mind on the internet!” And just being like, the thing that I had observed in that comment thread was drastically different than the thing that I had observed in all other online forums [up to that] time. It’s just a playbook that I had expected and I’d seen play out already by the time, like hundreds of times, maybe even thousands of times: [if] you have two commenters on the internet, getting more heated and angry at each other, both trying to defend, [some position] that they have some reason to hold, it is never the case that it ends up with one person being like, “Oh, I was wrong. You’re right.”

And that was huge. It’s not the case that there aren’t other places [like that] on the internet, the internet is huge these days. And there’s lots of subreddits that also are capable of people saying that they changed their mind. But it continues to be a very, very small fraction of the internet. And I think it, like, does demonstrate to this day that something substantially different is going on.

Trust Relationships [0:31:15]

Oliver Habryka: And I think there’s still, there’s a lot of that. There’s a lot of people genuinely responding to arguments, taking time, having mutual respect about reasoning ability. I think another thing that makes LessWrong very interesting is the [combination] of being a top-level blog where people can contribute top-level posts and make substantial intellectual contributions, while also having conversations with each other. Which means that, I think among many of the core users of the site, this results over a long period of time in a lot of mutual respect. Where people are like, oh, “The person responding to my post is not Random Internet Commenter #255. It is the person who has written a post that I quite respect and [which] I’ve referenced multiple times about how neural networks work, or [which] I’ve referenced multiple times about how social dynamics work or referenced multiple times about how the brain works.”

Such that when they object to [a post] or comment on it, I feel a genuine sense of curiosity about what they have to say. And the trust relationship, I think, also has made the discussion substantially better. And that’s for example why… Over the years, many times people have suggested that I should make LessWrong submissions substantially more objective by removing [identity markers from people’s comments], like, “Why don’t you just hide the username for the first 24 hours after a comment is written so that, you know, it can be judged on its own merits”. And I’m like, “Nope, there’s a bunch of trust relationships here that are actually quite important to making it so that when you read someone’s comment, it gives you very substantial evidence about whether the person [is] objecting to you in a way that, like, is kind of obnoxious and will probably result in an unproductive discussion if you respond to them, or [if they’re] someone who probably has some genuinely useful things to contribute.”

On Forecasting and the Blogosphere [0:32:59]

Daniel Filan: Yeah, so I wonder on these fronts, [how] would you compare to other online communities? [One] community or maybe set of communities that is particularly salient to me is the forecasting community where people are estimating things, they’re doing probabilistic reasoning, [in some ways] it seems like a community of practice, kind of the probabilistic thinking [that’s] kind of championed in the Sequences. How do you think LessWrong compares to that?

Oliver Habryka: I mean, I find [this comparison] particularly hilarious because as I said, when I was talking with Vaniver in 2017, late 2016, about whether I should revive LessWrong, I had two plans. Number one was [to] revive LessWrong. Number two was [to] build, like, a community of practice around the forecasting community. So in some sense, I really like the forecasting community. Indeed, at least in 2017, it was the second entry on my list with a quite narrow margin, like it was a really hard decision to figure out which of the two to work on. And so I quite like it. It has a lot of the same attributes.

It’s also, of course, culturally not that far away. Like, [it’s a bit unclear], there’s the more Tetlock-adjacent forecasting community, there’s a bit more policy forecasting, but when you think of platforms like Metaculus, when you think of platforms like Manifold Markets, I mean, it’s kind of the same people. [There’s very substantial overlap], a lot of the core users are the same.

At some point, when I think we hired Jim Babcock to LessWrong, I just discovered that he was, like, at the time number three on Metaculus. He then stopped. I guess I kept him too busy with programming tasks and he fell off the leaderboard. But it is really quite substantially the same people. So in some sense, talking separately about [the] LessWrong platform, and forecasting platforms, I think doesn’t really make that much sense. And indeed, when I think of… [At some point] Tetlock wrote his long list of, “Here are the norms and principles that you should have in order to be a good forecaster.” And I was like, I know this list of norms and principles. This is basically a description of [how] LessWrong reasoning norms work.

So in some sense, I’m like, yep, it’s great. I like both of them. [Though I can’t really see a] forecasting community [standing] on its own. It’s a good place to have conversations, but ultimately it’s a place to just have conversations about forecasts.

[A] thing that I’ve been thinking [about], an abstraction that I’ve always been thinking a lot about when designing features for LessWrong, is the idea of making sure that the community has content that is, as I’ve been calling it, essay-complete. [Whereas] people over the years have tried many different ways of [trying to put] structure on the discussion on an online forum. The most classical example here is argument maps, where people are like, “Oh man, we’re just gonna make argument maps for everything.” But [it] turns out that actually, [many] points cannot be expressed in argument maps. It just doesn’t make any sense to put them in [the] format of an argument map. And the format that ultimately is the only thing that I know of that seems to, again, it can’t express everything, but seems to have a wide enough range of expression that you can generally make a point about almost anything if you have to, is the essay. It actually has almost all the power of spoken word [TIMESTAMP 0:36:28] [unintelligible] for conveying complicated points.

And so one of the things that I feel about forecasting communities is this, I don’t really know, like, if I’m part of this forecasting community, how do I change its norms? How do I make points about stuff that isn’t very immediately directly relevant to a forecast? How do I create a meta-level art of forecasting? With whom do I collaborate in order to be, “Hey, let’s figure out what actually makes better forecasters or makes worse forecasters.” For that, you kind of need a platform that has more generality in the kind of things that it is capable of expressing. And I think indeed for many of the forecasters that platform is stuff like LessWrong.

Daniel Filan: I mean, for some of them, I guess it’s like their personal blogs and there is a blogosphere with connections and stuff.

Oliver Habryka: Exactly. The blogosphere exists. I think that one also really helps. Yep. I can definitely imagine a world where LessWrong wasn’t necessary, or didn’t provide marginal value above a blogosphere. But yeah, [there are] problems with it, you know, voting is really nice. Upvoting and downvoting is really nice. There just [don’t really exist] good software tools to make, like, a great glorious diaspora of private blogs, just because they lack a bunch of basic tools. And making those tools is not easy. I’ve spent four years of my life doing software development every day, trying to build a lot of these basic features, which really seem like they should each only take a week to build, but it turns out they don’t.

On Communication and Essays [0:37:46]

Daniel Filan: Sure. This is a bit of a tangent, but, when you were talking about the brilliance of essays: so, I don’t know how education works in Germany. In Australia, I had a class, it was called English.

Oliver Habryka: Yep.

Daniel Filan: And in high school, like the main thing we did was write essays of, I guess, rudimentary literary criticism. I wonder, what do you think the connection is between essays [as] the ideal form of communication, and also the thing everyone gets trained in, and yet somehow in a way that doesn’t feel… I don’t know, I don’t think I was a great essayist by the end of my high school education.

Oliver Habryka: [I had the same], my class was called German, unsurprisingly. I mean, my sense is, my general experience has been, the writing part of writing essays is overrated. It is not impossible, but quite rare, that I find someone who is a good communicator in spoken word, and a terrible communicator in written word. Most of the time, when someone is trying to communicate a concept to someone else, they talk about [it]. If I just take the words that they would have said right in that explanation, when they’re trying to explain something to someone else, and put it [in] writing, some clarity gets lost, because maybe they were gesticulating with their hands, maybe they were trying to draw something on a whiteboard at the same time, maybe there was affect in the tone of their voice that was communicating important information. But roughly, I don’t know, man, [a] pretty good essay [comes] out of it. So when I’m thinking about training and where people get good at essay writing, I would think most of that, like 90% of the training data, comes from conversations.

And then I do think different social environments have just hugely different incentives here. When I’m thinking about which environments train you to be a good essayist, it’s the environments where you’re encouraged to give explanations to your friends. And when [you have] frequent situations where one of your friends is like, “Hey, Oli, how does X work?” And then you have to give an explanation [that] is coherent. And there are other social environments where a lot of what’s going on is much more, either there’s mostly phatic communication, or [the] communication is very short, or potentially there are lots of language barriers, where it doesn’t really make sense to try to convey points across long inferential distances.

[So] I wouldn’t really pay that much attention to English class, I would pay much more attention to, like, what happens during lunch break, what happens during dinner break.

Dialogues on LessWrong [0:40:09]

Daniel Filan: Okay. So, getting back to LessWrong, I’m wondering, if you could see a change to LessWrong now, like [some] marginal change to make it significantly better, what seems most juicy to you?

Oliver Habryka: Well, I kind of set myself a task for a feature that I really want to write right now, and I’ve been wanting for years, I just haven’t gotten around to it, which is a debate feature. I think [two] key inspirational pieces here are the Hanson-Yudkowsky Foom Dialogues from around a decade ago, where there was [a] kind of series of essays back and forth between Eliezer and Robin Hanson about AI takeoff speed and various other things. And then more recently, the second example, it kind of has a similar flavor, is the MIRI AI dialogues, where Nate Soares and Eliezer have a lot of conversations with people, originally in Discord.

And I’ve just been really wanting to add first-class support for that, where right now the overhead for doing that is really high, and kind of the [affordances] that people have for doing that are really low. If you want to set up… [Imagine a world where], instead of having this podcast, you were like, Oli, let’s have a debate that other people can read afterwards, [that shows up] on LessWrong about our opinions on, like, how LessWrong content should be structured or moderation should be structured. If you want to do the same thing as the MIRI dialogues, you would be like, okay, I will [set up] a Discord server, I will get someone to moderate, I will set a time, I will invite everyone, I will explain to a bunch of people how Discord works. Then we do it, then we edit the relevant transcript and do a bunch of hacky [conversions] to get it into tables, which would probably take you two to three hours to get into the editor.

And my goal is [to] kind of just make it so you can have those dialogues: you just press a button, you invite a collaborator, they accept, it goes up on the front page. You set a window of how much time there is between writing a response to someone else and it going live, so you still have time to edit it before it goes live. So you have a bit of ability to control confidentiality stuff and similar things. But [the ability] to have these kinds of dialogues in public just goes up a lot.

And then I’m just currently really interested in *having* a number of those dialogues, both seeing more AI-related dialogues between people who disagree in the field of AI alignment. But also a thing that I’ve been really interested in is trying to make [the whole] conversation around FTX in the EA community and also to some degree the AI alignment community go better. And trying to be like, what are the lessons we should take away from FTX, and both me debating with other people on this, and other [kinds] of leaders in the EA and AI alignment community and rationality community, debating with each other [what] we should take away from it.

Which I think currently doesn’t really have a good vehicle to happen at all. Because I think you can kind of write this top-level post and shout into the void. And then random, like, not random internet commenters, but just, I don’t know, kind of the whole internet will scream back at you in a comment section in a way that, in particular, in controversial topics like the FTX situation, is just a *very* stressful situation to be in. And in particular, right now, with increased traffic to both LessWrong and the EA Forum with lots of people who are kind of angry [about] the FTX situation, [this] doesn’t result in a great experience, because everyone is kind of a bit too triggered.

But if you instead can be like, I get to talk to Carl Shulman about a topic, and we have some mutual respect about our ability to be charitable and well-meaning and have good faith in each other in a conversation. That’s a very different expectation about what will happen if I make a point.

Daniel Filan: How do you think this differs from [how you can currently] engage with people in comment sections? [If] I look at LessWrong comment sections, often there will be, like, exchanges between two people that will go back and forth for a while hashing out some sort of dispute.

Oliver Habryka: So one of the things that I think would be the key difference, at least in my current model of this, is that the individual responses cannot be voted on. Because a thing that is the case, that is both really valuable, but I think also makes certain types of conversation much harder on LessWrong, is that frequently, you are not really responding to the person who you are engaging in a debate within the comments, you are responding to the *internet*. And it’s like you’re having a debate that’s more like a presidential debate or something, where, when two presidential candidates give talks in front of a large audience, they’re not interested in changing each other’s minds. That’s not what’s going on. They are trying to pick apart the other side’s point in order to, like, prove that they are correct.

Daniel Filan: Yeah, I think this is also—a hobby I have is watching basically Christian apologetics debates. And I think there [you] get up on a stage and [you] do try to engage each other, but I think the standard advice is, you’re not going to try to change the other person’s mind.

Oliver Habryka: [One of] the things that I think is different is that it’s actually really hard to have a conversation currently in the LessWrong comment section or the EA Forum comment section that [is just] directed at trying to have a long conversation with a specific other person. And it’s much more that you have a conversation with the public with potentially another primary person who’s debating with you and the public. But it’s hard to kind of have this [conversation] that’s clearly focused on the primary debate participants.

Lightcone Infrastructure [0:45:26]

Daniel Filan: Alright. I now want to move out a little bit. So, I don’t know exactly when this happened, but at some point, this organization that ran LessWrong changed its [name to] the Lightcone Infrastructure team and started getting into real estate development, I guess.

Oliver Habryka: *Laughs*. I mean, first of all, yeah, why did we change our name? Number one, we just started doing more things than just LessWrong. Like, we’re already running the AI alignment forum, we were also writing a lot of the code for the EA Forum. We were also running a bunch of retreats and trying to just, you know, [take] broader responsibility for kind of the rationality community, EA, AI alignment ecosystem in a bunch of different ways.

And so, definitely one of the goals which we completely failed at was to somehow make it so that I’m capable of referring to LessWrong, the LessWrong team… Okay, here [are] a few different things. Number one, I think the thing that just caused us to [think] we need to change our name was, we were sitting there and I had like four levels of LessWrong, where there was… Like, we had LessWrong the organization. *In* LessWrong, I now had the LessWrong team, working on LessWrong.

Daniel Filan: Yep.

Oliver Habryka: And then, I think the chart was like… At some point I was trying to make a diagram of where different priorities lie. And I think I said a sentence like, “LessWrong, the LessWrong team at LessWrong should make sure that next week we’re building this LessWrong feature.” And then I was just looking at that sentence and being like, that’s… Every instance of LessWrong here refers to a different thing that LessWrong means. And then we were like, “Okay, we have to fix it.” So we decided to rename ourselves Lightcone Infrastructure.

Of course, the first thing [we did] in order to, like, be maximum idiots, right after we founded Lightcone Infrastructure, was to create Lightcone offices, and everyone of course immediately started abbreviating both of them as Lightcone.

Daniel Filan: Yeah.

Oliver Habryka: [TIMESTAMP 0:47:30] And then we sure completely fucked that one up.

Daniel Filan: Is that why you closed Lightcone offices?

Oliver Habryka: Exactly, yeah.

Daniel Filan: Alright, we can talk more about that later. [But why] the shift to… I perceive the Lightcone team as having made some kind of shift from primarily focusing on LessWrong, to a greater curation of basically in-person spaces in Berkeley, California. What’s the deal with that?

Oliver Habryka: So, [there’s a few different things] that happened at the same time. I think one of the major things was that we kept talking to AI alignment researchers and people in the rationality community and the EA community and were like, “Hey, what would you like us to improve at LessWrong?” in user interviews, and the answers just less and less turned out to be about LessWrong. Like the answers then turned out to be more like, “Well, LessWrong could help me find better collaborators.” [or] “LessWrong could help me find other people to make an office with.” In ways that [technically], we were still asking them what they wanted for LessWrong, but [which] just clearly indicated to me that the bottlenecks to intellectual progress for those people [weren’t] really anything feature-related on LessWrong anymore.

And then I was like, well, ultimately, what I care about is intellectual progress on these key questions that I want to make progress on, AI alignment, rationality. And so I was like, well, okay, then I guess let’s [tackle] them right at [their] root. Like, “What is the thing we hear the most about?”

And one other source of information that I had, was in my role in the Long-Term Future Fund, where I’ve been a grantmaker for now also three to four years or something. And one of the facts there is, man, being an independent AI alignment researcher is one of the [jobs] with, like, the worst depression rate in the world. [Some crazy proportion,] like 50% of PhD students, [during] their time as a PhD student, get diagnosed with either severe anxiety or depression. My guess is, independent alignment researcher is even worse, like closer to 60-70%. It’s a completely miserable job.

I think there’s a bunch of different reasons for it. Number one, it’s [this thing that] feels like it comes with huge responsibility, [like], “Hey, how about you figure out this problem in a completely pre-paradigmatic field with relatively little guidance, almost no mentorship or anyone who can pay attention to you, and usually doing it at home alone in your basement?”

And so I knew that, for people who are trying to do AI alignment research in particular, not having a place to be to work with other people to get feedback and to have peers and to work together with them, was a huge bottleneck on just, like, not even making progress; on just, like, not getting depressed and inevitably stopping this job after, like, six months to a year.

Yeah. And that, combined with a bunch of other stuff, other evidence from user interviews that we gathered, I think made us more interested in a bunch of in-person stuff.

I think other major pieces of evidence that were relevant came from the fact that we kind of made this pivot during the pandemic. Definitely, the pandemic was a situation where I really felt like I wanted to have a stronger community, and a community that is more capable of orienting in the physical world. Because, I don’t know, the world just kind of stopped. The expected thing to do, for many people, was to just be at home alone for a year or something, maybe sometimes seeing people while going on walks outside, but it was pretty crazy. And I was like, man, [just] having a better in-person community that could have… There were so many things that we could have done to navigate the pandemic substantially better.

[TIMESTAMP 0:51:25] And it felt to me that LessWrong, I think, actually did a quite good job during the pandemic. It was one of the best places to, like, understand how dangerous Covid actually was, where it was going, forecasts and various other things. But it just didn’t feel to me that there was a small change to LessWrong that would have [made] me go, “Oh, of course!”, [such that] the rationality or the EA community just, you know, bought themselves, [rented] themselves a bunch of nearby houses and just made themselves a good bubble and facilitated their own continuous testing schedule. Or maybe they even just went and got themselves some of the early vaccines months before the FDA had approved them, because we know that [early individual researchers were already] capable of creating mRNA vaccines in [May, June, 2020], they just weren’t approved and they couldn’t create them in 10,000- or 100,000-scale batches. But individual researchers—and not world-class researchers, just normal individual researchers—were totally capable of producing enough mRNA vaccine to vaccinate themselves and their friends. And so making enough vaccines for just a small community of 200 to 500 people was totally feasible in the middle of 2020.

And so, doing these things… I just had this feeling that, man, [if] the world will be crazy like this, and I also can’t really rely on the rest of the world to just take care of my basic needs, I would really like to make a community and an ecosystem that is capable of just orienting substantially better here.

Some of this reasoning was also routing through kind of my thoughts on AI and the next century, where my sense is something like, in many worlds, the way AI risk will happen is that things will happen very quickly. Mostly the world will look normal, and then AI will be really dangerous, and everyone will die. But I also think many of the worlds, in particular many of the worlds where I think humanity has a decent shot at survival, will look more like the world will gradually get crazier and crazier. And the explicit estimate that I made at the time is that, [if, like, there’s going to be] something like AGI in 20 years, then there are going to be something like 10 years, or something between five and 10 years, between now and then that will be at least as crazy as 2020.

And then I was like, fuck, (*chuckles*), I want to be ready for that. I don’t want to be randomly buffeted around by the winds of the world. I want to somehow create an ecosystem and an institution and an engine such that, when the world goes crazy and new technologies propagate and, like, major nation-states go to war and the whole world ramps up into high gear… that somehow I’m capable of, like, surviving that and getting, like… And importantly, also getting out ahead. One of the things that was [a key, a really interesting thing] about the pandemic is just that it was a huge opportunity, if you just had a group of people who were better coordinated than the rest of the world, to provide a ton of value, get a lot of relative resources, and just kind of, like, actually have a huge effect on where humanity is going at large, by taking the right actions at the right time.

Existential Risk [0:54:43]

Daniel Filan: So when you say [there’s] this community that you want to support somehow or make more resilient… So [with] the LessWrong project, [you] mentioned that it was somewhat in the spirit of rationality for its own sake and somewhat for the sake of existential risk.

Oliver Habryka: Yeah.

Daniel Filan: Is the in-person presence, do you think, primarily focused on rationality, or primarily focused on, like...

Oliver Habryka: [TIMESTAMP 0:55:08] Definitely the existential risk part. I mean I was talking about this with Eliezer, like, [at various times over the years], and I haven’t talked to him that many times, but at various times I talked about what we should do on LessWrong. And he was definitely like, “Well, look, in 2012, like, actually in 2008 or 2009, I wrote this post of ‘[Rationality: Common Interest of Many Causes]’. And the community that I imagined building on LessWrong was one dedicated to a 50-year plan or something, around trying to create a sane art of rationality, and a community that’s primarily oriented around sane thinking, that is, like, a wide diaspora of different approaches and people having different priorities, and on a daily level, people of all kinds of different professions.” And he was like, “Well, I don’t know, man, but AGI timelines are short and [it] sure looks like we just don’t have time for that right now.” And indeed, I think on LessWrong we should just let people know that, like, look, we still are here and deeply care about rationality, but we can’t really build, and we’re not really building, a platform that’s making a long, multi-decade plan about building a glorious and rich rationality community that’s going to raise the sanity waterline. We are going to be here and we’re going to be learning rationality for the sake of existential risk. And also some for its own sake but, like, really a lot for the sake of existential risk.

And I think in the in-person community that [sense is] even stronger. I think there’s much less of a sense of… and in some sense I deeply regret this. If I’m currently thinking about, like, [what are my deepest concerns] and ways I think things are going wrong: a lot of it is a lot of consequentialist thinking about how to navigate, like, the next century in a relatively utilitarian consequentialist way, and not a lot of thinking about how to just broadly be more sane and maintain your sanity in different environments, and I think that is probably the wrong trade-off at the margin. But I think the in-person community definitely has much more of a focus on: we’re here to get something done, and that thing that we’re going to get done is [to] somehow help navigate, help humanity navigate this current century.

Online and In-Person Communities [0:57:14]

Daniel Filan: Okay. Another question I have about the kind of in-person [turn] is, if I think about LessWrong, it seems sort of… I think this word is a bit overused, but it seems kind of democratic in the sense that, my understanding is, anybody can sign up, if you have internet access, [you can make an account], you can comment, you can post. But you can’t necessarily come to Berkeley and chill in the Lightcone Infrastructure team’s buildings.

Oliver Habryka: Yeah.

Daniel Filan: How do you think about how much of that spirit of democracy you want in the projects you run?

Oliver Habryka: Definitely a really nice thing about the internet is that, you know, a single bad commenter can be annoying, but he can’t be *that* annoying. A single bad coworker can easily tank a whole organization. [If] you read [Paul Graham’s] essays or Y Combinator stuff here, they’re just, like, “Look, man, the downside of a hire, the downside of someone you work with every day in the same building, is extremely high. [Particularly] among your first 20 to 30 hires or whatever, you should think about each one of them as a pretty decent risk that could tank your organization. And if things aren’t working out, you probably want to fire them really quickly.”

And so there’s just an inherent sense of the amount of disruption, the amount of ability to put up barriers and engage with people at their own speed, and engage with people with different modalities, different cultures, in an in-person format is inherently lower than in an online format. But the in-person communication and coordination is really crucial. And so I think there’s just an inherent thing that if you want to build something that’s in-person, it will probably be less pluralistic and less democratic than things you build online.

One of the biggest regrets that I have is that, I think the correct [response], when you have a broad relatively pluralistic community like LessWrong, is to create an in-person ecosystem that itself has a lot more niches than what we currently have. Where I think the thing that happened is that things really centralized pretty hard in Berkeley, and in Berkeley things centralized relatively hard in a few places, in particular after the Lightcone offices and Constellation, two big office spaces, were built. In a way that really made it feel like, if you wanted to be part of the in-person LessWrong community, those were the only places you could be. There are sometimes occasional socials in different parts of the Bay Area, but those offices were *the* place to be. And I do sure look at that and be like, man, that sucks.

Because of the much narrower scope that a single in-person community can have, in order to get an accurate representation of [all the perspectives] that are important on LessWrong, and of the actual different types of people and different perspectives and different cultures that are present on LessWrong, I just don’t think it makes sense to think of there being a single place, for there to be a single in-person community with a relatively shared culture. And I wish there were just many more. I wish instead Berkeley were a much richer diaspora of lots of different niches of, like, 10 to 20 people, each one of which is really quite opinionated about a specific type of culture or specific type of perspective. In a way that between these niches there’s a lot of trade, a lot of negotiation, a lot of people having their own money to stand on, and their own resources to stand on, as opposed to things feeling quite centralized and quite hierarchical.

And I think that’s just a mismatch between exactly the kind of cultural expectations that were set on LessWrong and the EA Forum and the online communities, that doesn’t translate very well to the in-person community.

Daniel Filan: I wonder how inevitable this kind of thing is. [If] I think about, like, [places where I can go out and buy food]. And in the real world I’m a vegan, but let’s say I’m not, right? [There] are some small restaurants, but a lot of restaurants are big and they’re franchises. If I think about consultancy firms, there’s famously the Big Four, right? [I guess] on the other hand, there aren’t, like, three major families that everyone belongs to. So the [pull] of this logic only goes so far, but...

Oliver Habryka: I mean, in some sense, in-person communities have generally been… When you think of [social structures in space], you get cities, you get the big cities, but man, within those cities, there usually tends to be quite a rich tapestry of different subcultures. In big cities, you have Chinatown, you have different, like, ethnicities congregating in different parts of the city, in particular in the US, and creating very much their own cultures and perspectives.

So I don’t think that, I mean in some sense there’s a physical limit of how many people you can have in one space and that just enforces [that] you can’t have that much of a centralization thing.

So I do think there’s that, but also on the other hand, the thing that I think is kind of a much more *inherent* thing is, your [headquarters need] to be somewhere. If you have something like [the United Nations], then [there’s] a huge question about, where do you put the headquarters of the United Nations? Because clearly, the headquarters of this thing that’s even just trying to coordinate in-body between many things that are clearly independent [has] a pretty huge effect on where the center of mass will be located and what a culture around the center of mass will be.

And I think the most common reason for why you get strong agglomeration effects in both companies and, I think, in-person communities, is because you often have a shared pool of resources for which governance needs to be somehow decided. And then, wherever the governance of those shared resources takes place, will be the place where everyone wants to be. Because I think they correctly and justifiably expect that if they’re not present wherever these negotiations are taking place, that they will be left out, or their subculture or their part of the community will be left out, in the allocation of those resources.

And I think in Berkeley, that’s a lot of funding stuff. A lot of things are downstream of the Open Philanthropy project’s funding, and a bit of the FTX Future Fund’s funding. And so ultimately people are like, “Man, if I’m not in Berkeley, I’m really worried that I can’t have, I can’t really be part [of it], I don’t have the resources to build this thing.”

And because generally we are building things with a philanthropic benefit and from a charity perspective, this just means that the ability for people to self-sustainably build things in other places is very limited without having access to these giant funding bodies that are basically funding almost everything in the space.

Real Estate [1:04:01]

Daniel Filan: Okay. So, [getting] a bit more specific. [If] I think about what Lightcone has done, or things that are visible to me in the real estate business, there’s, uh, Lightcone offices.

Oliver Habryka: Yep.

Daniel Filan: There’s, um, basically an office space where people can work. There’s Rose Garden Inn which is a hotel slash events space.

Oliver Habryka: I mean, we don’t know yet what we’re going to do with it. The original plan for it was something like a relatively straightforward continuation of the Lightcone offices, with maybe some different boundaries, some different things, but, you know, roughly the same thing. Combined with more space for visitors to stay overnight and just a bit more of an intense culture that now can also extend more to people sleeping on-site. And an events space dedicated also pretty straightforwardly for just lots of AI alignment, rationality, EA adjacent events.

There are various things that have made this harder. In some sense, [the day the] Rose Garden Inn purchase agreement was signed was November 8, which one might remember as another quite significant date in history, which is the date when Sam Bankman-Fried went on Twitter and said, “We definitely have the money, please believe us.” And lots of people were like, “Wait, I’m confused, in bank-speak, saying ‘We definitely have the money’ means ‘we definitely don’t have the money’.”

[TIMESTAMP 1:05:25] And [unintelligible] I’m looking at the Rose Garden Inn [as an interesting case] where I’m like, yep, that was a decision made in the funding ecosystem and within [the kind of integrity of the trust network and ecosystem] that I think has quite strongly been shaken, both in terms of money available, but also in terms of my interest in just, like, pouring random gas into this ecosystem since the FTX situation. And so I think we’re just dealing with huge uncertainty about what that thing will actually be. I don’t know.

The LW /​ Rationality, AI Alignment, EA Ecosystem [1:05:54]

Daniel Filan: Okay. Yeah, can you say a little bit more about… So when you [talk about] the ecosystem, what ecosystem do you mean, first of all?

Oliver Habryka: I mean, I generally refer to it as the LessWrong slash rationality, AI alignment, EA ecosystem. We can go piece by piece. What are these three pieces?

There’s the whole LessWrong community, the community that was founded on LessWrong. [Key] organizations that kind of came out of that over the years [were] MIRI, [the] Machine Intelligence Research Institute, and the Center for Applied Rationality. I also think kind of in a similar intellectual legacy is the Future of Humanity Institute in Oxford; [Nick] Bostrom was also kind of part of [an] early community on the SL4 mailing list where a bunch of stuff got started. Then, kind of relatedly to that, kind of [a] lot of people thinking about the future of humanity were like, “Oh man, probably artificial intelligence is going to be the big thing to pay most attention to.” And then a research field started around the idea of AI alignment: “How can we get AI to do the right things?” And that then sparked lots of organizations. [TIMESTAMP 1:07:03] CHAI, where you still [unintelligible]?

Daniel Filan: Yes.

Oliver Habryka: I mean, tons of [organizations], so many [organizations], there’s like 20 organizations working on something [related to] AI alignment, like CAIS and CLTR and CLR and CLBR and CLAR. *Chuckles*.

Daniel Filan: The last two aren’t real, right?

Oliver Habryka: Correct. But they all start… There’s at least one other organization that has a C and an L and an R in it somewhere. In any case, they really like naming themselves Center for Long Something. *Sighs*. And then, AI alignment has become this big thing, also primarily kind of being funded, like all of these efforts, by the Open Philanthropy project, and, before FTX turned out to be a giant fraud, also the FTX Future Fund.

And then kind of the EA community, which started around the same time, like, 2009, 2010, or something, always [has] been socially very close to the rationality community. Early EA Summits were organized around that time by people who are pretty core also to the rationality community and [adjacent to the AI alignment community].

And that is a pretty coherent social network. Generally there [are] lots of labels that people throw around, lots of feelings about who identifies as what, but at the end of the day, there’s a social network here and pretty frequently you will find that, like, someone doing work in a space has tons of friends, or lives in a group house with other people, or dates tons of people in the space, or gets funding from one of the few funding sources in the space.

And it’s pretty clear that there’s a lot of infrastructure and a lot of shared resources that this large group of people relies on in order to get their job done. Their different jobs done.

Immoral Behavior [1:08:56]

Daniel Filan: And why… So when you say you became less excited about, um, fueling this space. How much [of it] was FTX? You know the CEO of FTX and the CEO of Alameda sort of came from this space. ’cause, I mean, you wouldn’t… On some level I think, I mean it’s not great, but it doesn’t seem like the most relevant thing for evaluating it. Or I don’t know.

Oliver Habryka: I don’t know. I feel really confused about it. I think my perspective is more that I’ve for many years had a number of things that felt really off about the space. I used to work at the Center for Effective Altruism. And I left after basically being depressed for one and a half years for stuff that, in retrospect, feels like indeed in meaningful ways the culture there was a precursor to what happened at FTX. Like, a lot of situations where just people were taking *really quite extreme deceptive action*, and the reaction of the rest of the organization, and [both people on the board and people I was working with], was often a *surprising* amount of apathy to people doing things that, at least from my perspective, in terms of honesty, were very immoral.

Daniel Filan: Like what?

Oliver Habryka: I mean people lying, like, a lot. A situation that I’m pretty sure happened was, like, there was a decision about a major merger [between] CEA and Giving What We Can, uh, where basically it was announced and executed almost fully in the week of EAG. Indeed, my best guess of what happened behind the scenes here was people being, like, man, when are a lot of the people who would have opinions on this going to be most busy, so they could least object to this, but in a way that would still allow us to technically be able to claim that we listened to people.

Daniel Filan: That week being the week of EAG, which is this major Effective Altruism conference.

Oliver Habryka: Yeah.

Daniel Filan: G for global.

Oliver Habryka: Other situations are, there were, like, a lot of people who were of very unclear employment, where there was Leverage Research, an organization that has now kind of somewhat interestingly imploded slash exploded, that employed a lot of people who were working at the Center for Effective Altruism, where employment means something like: they were living together with everyone else at Leverage Research. They were getting some amount of housing and food support from it. They weren’t technically being paid by it but they were present and they had a manager assigned to them under Leverage and would be held accountable according to their performance in various ways. And, like, that thing was just completely kept secret and hidden from the rest of the organization, where [there] were people working for Leverage Research and then people would just lie about whether they were indeed in that relationship with the other organization. These people were often also straightforwardly and internally referred to as “spies”.

And this did not spark anyone to go, like, “Oh my god, what the fuck crazy thing is going on?” It was kind of like par for the course for the kind of adversarial epistemic environment [the way things were happening].

At various points there were situations in leadership where, like, uh, Will MacAskill was supposed to be CEO of the organization, but he didn’t really, wasn’t really present, mostly, like, was busy promoting various books and so on. Which meant that the COO had actually taken over *most* of the CEO responsibilities, uh, for over a year. In that case it was Tara Mac Aulay, one of the co-founders of Alameda. But she just, like, I just had conversations with her [where] she [very clearly was] like, “Yes, and I intentionally did not tell the rest of the organization the fact that I had taken on CEO responsibilities and was making major strategic decisions, because I expect that this would have meant that people would, like, interfere with my plans and, like, what did I have to gain from informing you?” That was a specific conversation between me and her, and she was like, “What did I have to gain from informing you about, like, that change in the power balance? It seems like the primary thing that you could have done was to, like, interfere with my plans.” A terrible way to run an organization (*chuckles*).

Daniel Filan: Alright. I should, I want to add that my understanding is that she left Alameda well before, like, alleged improprieties happened. Is my read on that situation.

Oliver Habryka: I will take bets against this fact.

Daniel Filan: Okay.

Oliver Habryka: Like, my guess is there was fraud at Alameda, like, [while Tara was present]. My guess is *also* that her other crypto fund, uh, is also not particularly legitimate.

Daniel Filan: Okay. So… so what I’m hearing is, you have this sense that, like, [in] a bunch of these [professional organizations], [or] maybe just [at] CEA, you felt like there was just a lot of strategic deception going on.

Oliver Habryka: Yes.

Daniel Filan: Okay.

Oliver Habryka: Yeah.

Daniel Filan: And, like, do you think that’s still common?

Oliver Habryka: I mean, clearly yes (*chuckles*). We just had FTX explode.

Daniel Filan: I mean I wouldn’t really call it an EA organization.

Oliver Habryka: Why not?

Daniel Filan: Because… because its primary purpose was letting people trade. Well… I think the real reason I wouldn’t call it an EA organization, [I mean the actual thing it did was to] let people trade cryptocurrency futures and derivatives and stuff.

Oliver Habryka: Yeah. It also did a lot of political advocacy. It also ran the FTX Future Fund.

Daniel Filan: That’s true. Yeah. I think… I guess I don’t know, like, if it’s the case that FTX Future Fund was closely integrated to the rest of FTX the organization, that might change my mind. I kind of had the impression that they got money and, like, [maybe] Sam had some opinions about where it should go. But I would have thought of… I don’t know, I, a not super well-informed person would have thought that, uh, FTX and FTX Future Fund were somewhat organizationally distinct. I should here add that I got a nice trip to the Bahamas paid for by this organization. But, um.

Oliver Habryka: Yeah. I mean, like, probably the FTX Future Fund was not capable of making grants that Sam did not want to happen. And, if Sam wanted to make a grant happen, it would have happened through the FTX Future Fund. This… institutionally, also, the FTX Future Fund spent about half of their time at the Bahamas offices with the rest of FTX. And like, I agree. I mean, like, there was some separation. My sense is the separation wasn’t huge.

Daniel Filan: Okay.

Oliver Habryka: But yeah, some substantial one. [I would probably, I would feel comfortable calling them] two branches of the same organization. I think that’s roughly the way I’d put it. Like, there was a clear reporting hierarchy. There was clear accountability.

Daniel Filan: Okay. So you think, like, FTX is an example of an organization in the EA ecosystem, that...

Oliver Habryka: Correct.

Daniel Filan: … that perhaps until recently, um, has engaged in deceptive behavior.

Oliver Habryka: Definitely. And then, I mean, my guess is there’s more. I [just don’t] see any particular reason for why we would have caught all of them.

AI Alignment Memes [1:16:27]

Oliver Habryka: I mean, there’s like another, like, I’m really not particularly excited about the way OpenAI relates to a lot of stuff. Um, I think they also engage in a lot of deceptive practices.

Daniel Filan: Such as?

Oliver Habryka: [TIMESTAMP 1:16:40] I mean, there’s less stuff here that I would straightforwardly describe as fraud, but a lot of stuff that I would describe as something like memetic warfare.

A key component was, like… I think, like, the terms “AI safety” and “AI alignment” have very drastically shifted in what people think they mean over the last two years. And I think a lot of that was straightforwardly the result of a decision *by* OpenAI to be, like, we are going to talk about our models being aligned, we’re going to talk [about] them being, like, clear examples of AI safety progress, in a way that I think bypassed, like, any sane epistemic process for checking those claims. Because they [involved], like, [OpenAI’s] PR machine to actually propagate them primarily. And I think that sucks. I think it’s quite deceptive, in the sense that [now] many past documents that were talking about AI alignment and AI safety are going to be read [by] people as supporting something quite different from what those terms currently mean on the margin for substantial chunks of the field. And it’s really sad, and I think it makes people’s maps of the world worse.

I also have other complaints. [We] probably shouldn’t have settled on the terms “AI safety” or “AI alignment”, and so there’s a two-sided error here, where I also think, I remember the conversations around 2015, where people were trying to find a term for, you know, AI not killing everyone, that felt very palatable, felt very non-weird, felt like something that you could explain to people without them freaking out. And then indeed, the choice settled on something that in some sense was inherently, like, *chosen* to be ambiguous. So that when you hear it you can first imagine something pretty normal and relatively boring. And then kind of as you hear more about it, there’s a bit of a, like, bait and switch that actually, it’s about AI not killing literally everyone who’s alive.

Daniel Filan: That term being...

Oliver Habryka: “AI alignment.”

Daniel Filan: Okay. And “AI safety” particularly comes to mind here.

Oliver Habryka: Correct. Yes. Both “AI safety” and “AI alignment” have this property.

Real Estate, Part 2 [1:18:42]

Daniel Filan: So as you mentioned, perhaps partly for this reason, you decided to close Lightcone offices. I wonder, so there’s these offices, there’s also this Rose Garden Inn.

Oliver Habryka: Yep.

Daniel Filan: [What’s] the difference between the two? Like, you want to shut down the offices, but you don’t want to shut down the Inn?

Oliver Habryka: Well, number one, shutting down the Inn costs me, what do I know, like $2 million in transaction fees plus expected losses. So that’s a very different action.

Daniel Filan: How so?

Oliver Habryka: I mean, if you buy… Like, we bought a thing, we probably bought it slightly above market price, because we think we can get value out of it. And then you have to resell it. Reselling it would probably itself be an operation that would take staff time [and might] take on the order of one staff year or something, just in order to route, cycle through all the different potential buyers. It would involve substantial additional construction work in order to kind of make it feasible as a normal hotel operation again, because we substantially changed it away from normal hotel operation.

Daniel Filan: Okay.

Oliver Habryka: Yeah. And then, it is a very deeply irreversible decision, much more irreversible than the Lightcone offices. If I were to decide that something like Lightcone offices should exist again, I can do that in three months or whatever.

Daniel Filan: Okay. That office essentially being rented WeWork space.

Oliver Habryka: Correct. So I think that’s one of the big ones. But also, having a space that is your own gives you a lot more freedom. The *marginal* cost of holding the Rose Garden Inn for a bit is much lower. Because it’s an owned building, generally property prices will appreciate to some degree. There’s some interest payments we have on the loan that we took out, but that interest payment is actually substantially lower than the rent we were paying on WeWork at a comparable capacity.

And so actually having it, thinking about what to do with it, and experimenting with it, actually, I think, is one of the [things] that I’m most excited about doing in order to kind of find out what kind of culture to build, what kind of community to build here in the Bay Area, *if* we decide that that’s the right plan at all.

So indeed, one of the things that we are thinking about doing next is, we kind of now have the Rose Garden Inn [running enough] that we can use two of the buildings as a retreat center. And so kind of one of the things that I’m most excited about doing in the near future is indeed to just run lots of events, encourage lots of other people to run lots of events there, and use this as a way to sample lots of different cultures, and get [kind] of a sense of where [these] different groups of people from [different] perspectives are gathering here. And kind of almost every weekend, [we] get to observe, like, what kind of people they are, does this have a snippet of the culture that I would really want to have in something that I’m building here long-term, and then kind of use that as kind of a petri dish to iterate on a culture that I then slowly grow [in] the rest of the space, that is going to be more permanent office and living space.

And I feel pretty excited about that. But I feel very non-rushed. I really don’t want to do a Lightcone-offices-like thing where we invited 200 people in the first week and really tried to just create critical mass from day one. And I’m thinking of something that’s much closer to, a new member, a new person who spends most of their day there, like, every month.

Daniel Filan: Okay. I mean, I guess it’s under experimentation, and you don’t know what it will look like. But if you had to guess, I’m wondering what kind of thing you might be imagining.

Oliver Habryka: For God’s sake, I really don’t know. I mean, at the end of the day, [if] I really had to make a guess of, “What is the most likely outcome?” I’m just like, I think indeed we’ll keep running lots of events there. And those events will be focused on the specific aspects of AI alignment and rationality that I’m most excited about. And in that sense, I will keep investing in some fraction of some subset of the existing ecosystem. I expect there will be some space for visitors who we would like to get started thinking about AI alignment and a bunch of adjacent questions, as well as rationality, who we would like to invite for a few months, like for one to two months at a time, who can stay at the Inn and kind of have there be like… Yep, just, like, interface with the ecosystem and interface with the ideas in a space where they really have a lot of time and space to engage with things relatively seriously. And then have something that’s probably relatively similar in structure to the Lightcone offices, [where] there’s a number of people whose work we want to support long-term, who come here every day, who build a culture of people who give each other feedback and make it so they don’t inevitably become depressed.

Brainstorming Topics [1:23:18]

Daniel Filan: Okay. Um. So, we’re about at the end of the list of things I kind of wanted to ask. Is there anything that you wish I’d asked, or wish you could’ve had an opportunity to talk about?

Oliver Habryka: I don’t know. There’s lots of stuff that maybe better belongs in AXRP so you don’t, you know, force AI alignment discussion on the people in your other podcast. [Stuff] that I’m excited about talking marginally more about would [be], like, current approaches to AI alignment and talking a bit about prosaic, non-prosaic approaches. I’m currently very curious about just trying to understand to what degree, like, reward learning from human feedback is making positive or negative contributions to the world.

I would be interested in talking about, I mean, there’s [a whole conversation] around something that’s more of a first-principles approach. I feel like we talked a lot about [what are these features] of the extended community and LessWrong and this ecosystem that we are part of. But [I’m] pretty interested in a conversation that’s more, okay, from first principles, if you were trying to figure out how to develop an art of rationality, if you’re trying to figure out how to make humanity relate sensibly to, like, its long-term future and to this rapid technological change it’s in the middle of, how would you go about steering that.

I guess another thing is, man, I feel like I don’t get to have many conversations about all my libertarian intuitions. So there were definitely some, like, fights about libertarianism that I’m interested in.

Daniel Filan: Okay, sure. I think… I think I am most interested in the first-principles rationality thing. If people want to hear about AI alignment that can be on the other podcast. If people want to hear more about libertarianism, that can be in approximately every other episode of this podcast.

Oliver Habryka: *Laughs*

Rationality from First Principles [1:25:17]

Daniel Filan: [If] we were starting projects to kind of improve humanity’s relation to rationality and to the long-term future, what would we do?

Oliver Habryka: I mean, here’s one thing that I’ve been playing around with. So there’s this, you know, pretty crazy group of people I have all kinds of complicated feelings about. Man, I guess I’m talking about people again, but it’s an easy reference point to get started here, but I would prefer to not keep talking about people. But there’s this group around Michael Vassar, that are often called the Vassarites. Michael Vassar for a short while was CEO of the Singularity Institute, and kind of has been a thinker [in] the EA, rationality, AI alignment ecosystem.

And he has this core hypothesis, that I have found a decent amount of value engaging with, which is something like: one of the things that you most want to get right, that the world is most insane about, is credit allocation. Well, I find that hypothesis pretty interesting, where the [core argument] here is something like… [Or maybe] another perspective that [turns out] to be, I think, something quite similar, but has a very different framing to it, is being able to talk about deceptive behavior at all.

This is taking a relatively adversarial stance on what destroys people’s sanity. But I think… Like, here’s a conversational move that I find extremely hard to make. But if I knew how to make it, I think the world would be a lot better. And I would have a [much] better impact on the world.

Sometimes, I have a conversation with someone. And they hold a belief that I think is wrong, and I provide a number of counter-arguments, and then they’re like, “Okay, I can see how that belief is wrong.” Then they hold another belief that I think is wrong, and then I provide a number of counter-arguments, and then they’re like, “Okay, I’m also convinced of this.” And [you look at someone], there’s a number of errors that they make. And in each error, they [genuinely] believe a number of things or something, or [at least it looks like] they genuinely believe a number of things, you provide a number of counter-arguments, and they’re like, “Yep, those counter-arguments do indeed seem right.”

And then a question that I would often like to be able to ask is, “Hey, dude. Look at the last three conversations that we had. In the last three conversations, you said that you believed X and I provided a number of counter-arguments, each case relatively compelling. But it looks to me that in these last three conversations that we had, there was a systematic error in your reasoning that looked like you weren’t actually trying [to] arrive at the truth. It looks like there was some kind of bias, you were trying to, like, tell some kind of story, you were trying to hold some kind of thing constant that you didn’t actually feel comfortable bringing up in a conversation, that is… I wouldn’t call it your true reason for believing it. But that was something that was clearly biasing or shifting your underlying [knowledge generation process] or hypothesis generation process.”

That statement always results in absolutely terrible conversations afterwards, unless you’re in an extremely high-trust context. [On] the internet, it approximately always gets read as, “You’re an idiot.” I think it [also] almost always gets seen as, *sighs*, I mean, it is in some sense frequently an accusation of lying or something. Like it does mean that, you look at when someone’s beliefs are systematically biased in a certain direction or wanting to support a certain conclusion, and they [can keep pushing away] at various edges of it. But there’s a core generator of it that feels like it is trying to generate arguments for a wrong conclusion in a way that is trying to convince other people of a falsehood. That doesn’t necessarily need to be corresponding to an internal knowledge that you’re lying to people, but it does ultimately result in a world where [you are making] people’s maps of the world worse for some reason or another.

And talking about this stuff [is] extremely hard. I don’t know how to do it. I only know how to do [it] in private conversations. And even here I feel a bit like I have to couch it in very abstract terms in order to even really think about it. In order for the conversation to not descend into, well, yes, of course, you shouldn’t randomly accuse people of lying in various circumstances.

Daniel Filan: [In] some sense, if I think about the LessWrong Sequences, kind of an entry point into that is this idea of systematic biases in thinking, which in some sense is very similar to the kind of thing you’re describing.

Oliver Habryka: I think for me usually (*chuckles*), like, this is a very weird fact. I still don’t really understand what happened. When I read the Sequences, I underwent a massive, a quite large shift in personality, where I used to get very angry (before I read the Sequences) at people [when they] did things that seemed dumb according to me. I think often feeling like, man, people are doing things in bad faith, or people… [In] some sense, I wanted to call people out on doing dumb stuff, or clearly they weren’t doing stuff for the common good or whatever.

And then after I read the Sequences, I chilled out a shitton, because my usual sense was yes, [I have grieved], I’ve come to accept the fact that humans are, like, crazy, [TIMESTAMP 1:30:30][unintelligible] monkeys barely capable of reasoning. And my relationship to people saying crazy things, or saying false things, in some important sense has mellowed out into being like, “Yep, humans are just kind of crazy. I understand that humans are biased.”

My relationship to it now is less of a need to violently assert that people have to be reasonable, but more of a sense of curiosity about trying to understand about where people are insane. And that has made, I think, a good chunk of these conversations more possible, where now I do feel like I can have a conversation and [be] like, “Yeah, man, maybe I am doing all the things that I’m doing primarily for signaling reasons, or various stuff like this.” In a way that I don’t think I would have been capable of holding that conversation before, myself.

And I’m *still* not capable of having that conversation with, like, my family or whatever. If I’m imagining having a conversation with my family being like, “Man, maybe you’re doing most of these things because you’re attached to an identity of being X.” I don’t really (*chuckles*) [expect] that to go particularly well.

Legible Signals [1:31:30]

Daniel Filan: So yeah, if I wanted to build a world where it was possible to talk about these kinds of systematic mistakes, not within a conversation, but [as] a pattern of thinking or something: where do you think you would start?

Oliver Habryka: I mean, [the obvious answer] is something like, well, start small, try to find a small group of people where you somehow are capable of talking about mistakes and systematic biases and stuff. An instance of this… I don’t look at that instance and think that it’s obviously going to work at a larger scale, or even that it’s a good idea in the first place. But I kind of quite like Bridgewater in this respect here.

Daniel Filan: What is Bridgewater?

Oliver Habryka: Bridgewater is a trading firm [which] has a relatively long history of making pretty successful trades in a bunch of different contexts. It is led by a guy called Ray Dalio. He’s also written a bunch of books that I like. And one of the, I think, most important aspects here is that it has an internal culture of very drastic transparency and honest feedback. Where, when you look at, both in cultural onboarding, when you join the organization, and also if you just look at the internal processes of Bridgewater, this kind of feedback is very strongly encouraged, where you are supposed to, like, give negative feedback frequently, commonly.

And [the latest version of this] is the one that I’m looking at with the most feeling of, “Man, are you sure that’s a good idea?” There’s a whole systematized, gamified system where basically all Bridgewater employees have a phone app in which, at any given point, they can register positive or negative data points on the behavior of anyone else in the organization, on about 80 different dimensions, like I think it’s between 40 and 80 different dimensions, of virtues that according to the organization are the basis of good performance. And so it’s pretty common that in the middle of a talk, a bunch of people will be there on their laptops or their phones, and will be like, man, I think I just observed a data point that [was a 20th-percentile or 10th-percentile data point] about this person at bullshitting. But [this] other thing that they said was an 80th-percentile data point at the skill of holding practicality and idealism in mind at the same time.

And [the] idea is that you get a lot of feedback, you get a lot of data, partially so that each individual data point doesn’t spiral into a huge discussion of whether someone is worthy or someone should be kicked out or whatever.

Daniel Filan: Yeah.

Oliver Habryka: And also there’s a bunch of algorithms on the backend that then probably do a tiny bit of PageRank-style stuff, where the people who have high judgment on metric X, according to many people in the organization, [will have their judgment on that metric X] matter more for other people, and so on.
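To make that “PageRank-style” weighting concrete, here is a minimal sketch of what rater-weighted aggregation could look like. This is a hypothetical illustration only, not Bridgewater’s actual algorithm; the array shapes, the “judgment” dimension, and the constants are all made up for the example.

```python
import numpy as np

# Hypothetical sketch of rater-weighted score aggregation, loosely in the spirit
# of the "PageRank-style" weighting described above. Not Bridgewater's actual
# algorithm; shapes, dimension names, and constants are invented.

def weighted_scores(ratings, judgment_dim, n_iters=20):
    """ratings: array of shape (n_people, n_people, n_dims), where
    ratings[rater, ratee, dim] is a 0-100 "percentile" data point, or NaN if
    that rater never rated that person on that dimension.
    judgment_dim: the dimension whose consensus score is used to weight raters.
    Returns an (n_people, n_dims) array of weighted average scores."""
    n_people = ratings.shape[0]
    weights = np.ones(n_people)  # start by trusting every rater equally
    for _ in range(n_iters):
        rated = ~np.isnan(ratings)
        # Weighted average over raters, ignoring missing ratings.
        num = np.nansum(ratings * weights[:, None, None], axis=0)
        den = np.clip((weights[:, None, None] * rated).sum(axis=0), 1e-9, None)
        scores = num / den
        # A rater's weight becomes their own consensus score on the "judgment" dimension.
        weights = scores[:, judgment_dim] / 100.0 + 1e-3
    return scores

# Toy usage: 3 people rating each other on 2 dimensions (dimension 0 = "judgment").
rng = np.random.default_rng(0)
ratings = rng.uniform(0, 100, size=(3, 3, 2))
print(weighted_scores(ratings, judgment_dim=0))
```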

Daniel Filan: Yeah. And there I wonder: [the thing about having] a culture of rating people is [that] you need good raters, right?

Oliver Habryka: Yeah.

Daniel Filan: And [in] a conversational setting, I don’t know, people can just say why they think something and then you can pick it apart. I wonder [how good] people are at doing this rating, basically.

Oliver Habryka: I mean this is kind of always, like… There’s a quote making a reference to, like, the Vassarites. There’s one of my favorite quotes by Michael, I think he at some point put it on Twitter, that was trying to be [an] analogy to that thing about, “First they came for the Jews, but I didn’t do anything, and then they came for the...” I don’t remember the exact...

Daniel Filan: I think one of them is unions.

Oliver Habryka: Yeah. “Then they came for the unionists and I wasn’t a unionist so I didn’t do anything.” And [his version] was just tweet-length and it was, “First they came for the epistemology. Then we didn’t know what happened next.” And there’s the sense of, the first thing that tends to happen when this kind of stuff happens is, you have a status hierarchy. Going back to the earlier topic that we had: you have a status hierarchy, and then you introduce a process to have people say things about how good other people are, and then I think that process about how good other people are is under a shitton of adversarial pressure.

Like in the context of the FTX situation, a proposal that I’ve discussed with a number of people is, “Fuck, man, why did people trust Sam? I didn’t trust Sam.” “What we should have done is, we should have just created a number of cards, like, there’s 10 cards and these are the ‘actually trustworthy people’ and ‘high-integrity people’ cards, and we should have given them to 10 people in the EA community who we actually think are [highly] trustworthy and high-integrity people, so that it actually makes sense for you to trust them. And we should just have [high-legibility] signals of trust and judgment that are clearly legible to the world.”

To which my response was, “Lol, that would have made this whole situation much worse.” I can guarantee you that if you [had] handed a number of people—in this ecosystem or any other ecosystem—the “This person has definitely good judgment and you should trust what they say” card, then the moment somebody has that card and has that official role in the ecosystem, of course they will be [under a] shitton of adversarial pressure [to] now endorse people who really care about getting additional resources, who really care about stuff.

Daniel Filan: The word “we” also seems funny there. I don’t know. [Sometimes] people use the word “we” in a way where it’s clear what the “we” is and sometimes it isn’t and, yeah, that tripped my alarm, maybe on purpose.

Oliver Habryka: Yeah, I think it was part of it, of something funny going on. In this case the specific proposal was, I was talking about this with [TIMESTAMP 1:37:17][unintelligible, a person name] and [unintelligible] was like, “I should hand out my 10 high-integrity cards, these people are actually trustworthy.” And then I’m like, “Well, there are two worlds. Either nobody cares about who you think is high-integrity and trustworthy, or people *do* care and now you’ve made the lives of everyone who you gave a high-integrity / trustworthy card a lot worse. Because now they’re just an obvious giant target, in that if you successfully get one of the people with the high-integrity, high-trustworthy cards to endorse you, you have free rein, and now challenging you becomes equivalent to challenging the ‘high-integrity, [high-trust] people’ institution.” Which sure seems like one of the hardest institutions to object to.

And I think we’ve seen this in a number of other places. I think there was one specific instance [in the] local Berkeley rationality community, with the ACDC board.

Daniel Filan: What’s that, is that a rock band?

Oliver Habryka: Correct, that’s a rock band, which we recruited to handle community disputes in the Bay Area… Nah, that’s not true. There was a specific board set up by the Center for Applied Rationality, [where] CFAR kept being asked to navigate various community disputes, and they were like, “Look, man, we would like to run workshops, can we please do anything else?” And then they set up a board to be like, “Look, if you have community disputes in the Bay Area, go to this board. They will maybe do some investigative stuff, and then they will try to figure out what should happen, like, do mediation, maybe [speak about] who was actually in the right, who was in the wrong.”

And approximately the first thing that happened is that, like, one of the people who I consider most abusive in the EA community basically just captured that board, and [got] all the board members to endorse him quite strongly. And then when a bunch of people who were hurt by him came out, the board was like, “Oh, we definitely don’t think these [people who were abused] are saying anything correct. We trust the guy who abused everyone.”

Which is a specific example of: if you have an institution that is being given the power to [blame] and pass judgment on people, and to try to create common knowledge about [who] is trustworthy and who is non-trustworthy, that institution is under a lot of pressure...

[We] can see similar things happening with HR departments all around the world. Where the official [purpose] of the HR department is to, you know, somehow make your staff happy and give them [ways to] escalate to management if their manager is bad. But most HR departments around the world are actually, like, a trap, where if you go to the HR department, probably the person you complained about is one of the first people to find out, and then you can be kicked out of the organization before your complaint can travel anywhere else.

It’s not true of all HR departments, but it’s a common enough occurrence that if you look at Hacker News and are like, “Should I complain to HR about my problems?”, like half of the commenters will be like, “Never talk to HR. HR is the single most corrupt part of any organization.” And I think this is the case because it also tends to be the place where hiring and firing [decisions get] made and therefore is under a lot of pressure.

Credit Allocation [1:40:15]

Daniel Filan: So yeah if we wanted to get credit allocation right, or somehow help ourselves break out of weird motivated cognition...

Oliver Habryka: Yeah, I mean, *sighs*… I have updated over the years that basic-needs security actually really helps a lot, but I feel confused about it. Definitely, if I were thinking about just, what are the biggest things that generally make people more sane here, my sense would be something like: they all have good BATNAs, and those BATNAs allow their basic needs to roughly be fulfilled.

Daniel Filan: BATNAs being best alternatives to negotiated agreement?

Oliver Habryka: Correct.

Daniel Filan: So basically things [like] they can go home and not have to play ball and still get...

Oliver Habryka: Correct. And this reduces a lot of messy power dynamics. It reduces a lot of the threat that comes from someone attacking you. But usually, even if everyone has pretty good BATNAs… When people care about achieving a shared goal, like, you’re building a startup together to change the world, your BATNA now will always suck.

I don’t know how to create a good BATNA if you want to change the world. Like, I might be able to give you a good BATNA that you won’t starve when you go home and you won’t be out of a job, or you will have a family and friends. But I can’t give you a good BATNA about the question of, are you going to transform how personal finance works. The answer is, if this thing goes under, probably you won’t be able to transform how personal finance works. And the more you care about that, the more, probably, you’re going to relate to it with anxiety, and all of these dynamics are going to pop up as you’re making decisions within your organization.

And given that you mostly want to staff organizations with people who very deeply care about [their missions], I think it’s going to be really hard to avoid these dynamics. But [satisfying] people’s basic needs still feels [like] it helps a lot.

Lessons from Economics [1:42:06]

Daniel Filan: If I think about the style of thought that is worried about people being kind of Machiavellian, and being willing to sort of deceive themselves, or whatever—the field that comes to mind is economics. [Or] if I think about mechanism design, [a lot of thought has been put into coming up] with ways where people can just be, like, base and selfish, and good things still happen. I wonder how much of that… Do you think the LessWrong rationality community [has learned] enough of the lessons of the economics field?

Oliver Habryka: [The] economics field feels to me like it is decently good: it has a lot of theory about how to get people to *do* something and set up the incentives, assuming that you have an accurate map of the world. But the economics field, even though it has some [works], for example Robin Hanson’s work on signaling, that are more about what the incentives are on various pieces of information flowing around [the world]…

But most of economics is just not that helpful for understanding [what] pieces of information will flow around, or how pieces of information can be made accurate. I do think some of the mechanism design space here is helpful. There’s some work in forecasting about proper scoring rules and similar stuff.
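As a concrete illustration of what a proper scoring rule looks like, here is a minimal sketch of the Brier score; this is just a standard textbook example, not anything specific to the work being referenced, and the numbers in it are made up.

```python
# A minimal sketch of a proper scoring rule (the Brier score), of the kind used
# in forecasting work. "Proper" means a forecaster minimizes their expected
# penalty by reporting their true probability rather than shading it.

def brier_score(forecast_prob: float, outcome: bool) -> float:
    """Squared error between a probability forecast and the 0/1 outcome (lower is better)."""
    return (forecast_prob - (1.0 if outcome else 0.0)) ** 2

# Example: a 0.8 forecast is penalized 0.04 if the event happens, 0.64 if it doesn't.
print(brier_score(0.8, True), brier_score(0.8, False))
```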

Daniel Filan: I mean that’s Sturgeon’s law, right, 95% of everything is not good. We’ve got prediction markets...

Oliver Habryka: [Definitely], “Man, can we just solve all of this with prediction markets?” is frequently an intuition of mine. Definitely I’m [like, just create some kind of buck-passing system]… [I mean] it’s just really hard. Grounding things out is actually quite hard in prediction markets and in forecasting tournaments.

But I definitely *have* been kind of… One of the things that I [have found] most interesting to think about since the FTX situation has been scandal markets. Where one person, I think it was Nathan Young, just after the FTX situation was like, “You know what I’m going to do? I’m just going to create a market for, like, every person who seems reasonably powerful in EA, and just have people forecast how likely they are to turn out to be frauds, or how likely they are to explode in some relatively large, violent way.”

Somewhat sadly, indeed, relatively immediately, a bunch of people were like, “I don’t know, man, that sounds really stressful and seems, like, it really just kind of cost me a bunch of reputation without gaining me anything. Can you please take all of them down?” And then he was like, “Okay, fine, I guess.” Which is, I guess, [another related thing], evidence about how you can’t have nice things.

But I have been thinking about [this and maybe] scandal markets and similar things actually would help a lot here. But I don’t know, it’s not like prediction markets are very robust to external attackers right now. Even financial markets have short squeezes and similar things. Even in domains where you would really expect that, like, you have these efficiency arguments in markets, almost all of them are in the long run. And in the short run, market dynamics are still often so that it’s *really* hard to interpret the current stock price of a thing as evidence about a straightforward proposition about the world.

Daniel Filan: I mean, well, for stock prices, [they’re not quite] supposed to be just straightforward facts about the world.

Oliver Habryka: Correct. I mean, well, in some sense...

Daniel Filan: Well, they’re sort of expected revenue streams, right?

Oliver Habryka: Correct. Like it should really be like, “Is this thing proportional to expected revenue streams?”

Daniel Filan: Yeah. But most propositions aren’t about [revenue streams from Microsoft], right?

Oliver Habryka: Yeah. But that’s kind of the thing. Like when you set up any financial instrument, figuring out what proposition in the real world it corresponds to, what [actual] piece of information in the real world it’s tracking, is extremely hard. And frequently I think of prediction markets as, we are going to create a financial instrument and then we’re going to slap the *label* on the financial instrument being, like, “This measures how likely this proposition is true.” But no, it’s just a label you slapped onto it. The financial instrument will… actually, [the exact resolution criterion] will really matter.

Whether the website that this thing is being hosted on will still be up will really matter. The relative value of money under different worlds will really matter. And so, like, [TIMESTAMP 1:46:35][unintelligible] are creating these financial instruments and then just slapping the label on: “This financial instrument corresponds to the likelihood that Trump gets elected.” But it turns out no, indeed all the markets on Trump seem somewhat [unintelligible]. Like in many cases over the last, like, 10 years, [prediction markets were] somewhat obviously not particularly well-calibrated, because you just saw that dumb money was flowing in and skewing the market price and the margins were not large enough to really correct the market price in a bunch of ways.
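To make the gap between a market price and the labeled probability concrete, here is a toy calculation; the specific numbers (platform survival odds, fees) are invented purely for illustration and are not from the conversation.

```python
# Toy illustration of why a prediction-market price is not simply "the
# probability on the label". Assume a YES share pays $1 if the event happens,
# and that traders also price in the chance the platform pays out at all and a
# fee on winnings. All numbers below are made up.

price = 0.60                 # observed market price of a YES share
p_platform_pays = 0.95       # hypothetical chance the site survives and pays out
fee = 0.05                   # hypothetical fee taken out of winnings

# At a rough equilibrium (ignoring time value and risk aversion):
#   price = p_event * p_platform_pays * (1 - fee)
p_event = price / (p_platform_pays * (1 - fee))
print(round(p_event, 3))     # ~0.665 -- noticeably higher than the 0.60 "label"
```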

Daniel Filan: In a world where it’s very hard to bet large amounts of money on prediction markets.

Oliver Habryka: Correct.

Daniel Filan: And hard to make them.

Oliver Habryka: Yeah. And maybe, I agree, by changing the financial instrument that the thing de facto is; by trying to make it more robust to various alternative interpretations, you can make it more accurate. But sure, there’s a game, and it’s a very hard game.

And I think one of the things that I did update on, [over] the pandemic in particular, was my ability to be confident about what the performance of a financial instrument means in the world. Where just, like, many stock prices, many major macroeconomic global indicators, that I had labeled as, “I understand under what conditions this would go up and down” just definitely (*chuckles*) didn’t remotely do the things that I had expected.

And in retrospect, I have good explanations for why the things that happened did happen. But I just made a large update on my personal humility, being like: if you give me a financial instrument and then try to make inferences about the world based on the state of that financial instrument, I, a lot of the time, am going to be like, “I have no idea what this means. There [are] like 15 different ways in which this thing could somehow resolve one way or another, or pay out to people on one side or the other, in ways that I didn’t at all foresee.”

Daniel Filan: What’s an example of an instrument doing something weird during the pandemic?

Oliver Habryka: I mean, definitely, like, [GDP, average stock returns] were really… I mean, that one was the obvious one. But man, it still took me by surprise...

Daniel Filan: What about average stock returns?

Oliver Habryka: I think basically, the markets correctly forecast, like… [When the] pandemic started, it seemed pretty obvious that the market would be really hurt. What happened is, the market went down a bit, and then it went up, a lot, even before, and then I was like, (*whispering*) “Why is the, I don’t know, why the fuck is the market going up?” And I was sitting there, over here in our living room, with Connor being like, “Why is the market, I just like, this doesn’t make sense. Is everyone expecting the pandemic to be over in two to three months? This seems pretty crazy.”

Like, we were trying to forecast the length of the pandemic on the basis of the performance of the stock market, of various bonds, and of various other things from which it seems you should be able to somehow extract [that state of the world]. And in the end, it turns out that the market was pricing in (probably, is my guess of what happened) large government intervention that involved just a shitton of stimulus. And the market was much better at forecasting stimulus. And I had so much noise in my model of stimulus that, it turns out, the [stock market performance] was, for about one to two years, substantially tracking the [places] where government stimulus would happen and the places where, you know, large financial redistribution would happen, in a way that made extracting information [from it] about the object-level territory extremely hard.

Daniel Filan: Why would… So on some level, it’s a bit weird that redistributing a bunch of cash would increase companies’ revenue flows, right? Because cash comes from somewhere, right?

Oliver Habryka: I mean, it comes from inflation. You can just print money. Like what we did is, we just printed a ton of money, approximately, and then we inflated everything, and then the organizations that got all the inflated money, their prices went up, and indeed the prices of approximately everything went up.

Daniel Filan: Presumably by inflation, you would kind of think.

Oliver Habryka: Yeah. And it *still* was the case that, [if] you were trying to be like, “Oh man, [I really expect] this pandemic to be longer. I gotta go short the market.” You were so fucked.

Daniel Filan: Unless you looked at, like, TIPS spreads or something.

Oliver Habryka: Yeah. Unless you do the stuff that good financial analysts can probably, somehow, with many years of experience, figure out: how to interpret the market, read the bones of the market and, like, [be better] at forecasting the future. But *I* personally realized that I did not have a deep enough understanding.

And my sense is also, even talking to other people that I do *know*, reading a bit of Zvi’s stuff, reading a bit of other people’s stuff who have financial trading experience, that no, even the people who have spent years of their [lives] trying to actually understand how the world will change on the basis of markets, have a relationship to [it] like, “I really don’t know how to interpret how markets relate to reality.” If I’m a successful trader, I’m usually a successful trader because I build a very specific edge and then somehow construct a thing that allows me to trade on just that specific edge. But most of that construction [doesn’t] have anything to do with the world. It’s just fancy financial instruments that somehow just insure me against all the other bad things that could happen.
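
[For reference on the TIPS spreads Daniel mentions above: the gap between a nominal Treasury yield and the real yield on an inflation-protected Treasury of the same maturity is commonly read as the market’s expected average inflation over that horizon, ignoring the risk and liquidity premia that also sit inside the spread. The yields below are made up.]

```python
# Rough sketch of a TIPS "breakeven inflation" calculation, with made-up yields.
# Breakeven = nominal yield minus TIPS real yield at the same maturity; it is a
# common (if noisy) proxy for the market's expected average inflation.

def breakeven_inflation(nominal_yield: float, tips_yield: float) -> float:
    return nominal_yield - tips_yield

# Hypothetical 10-year yields, in percent per year:
nominal_10y = 3.9   # conventional Treasury
tips_10y = 1.6      # inflation-protected Treasury (real yield)

print(f"Implied average inflation: {breakeven_inflation(nominal_10y, tips_10y):.1f}%/yr")
```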

Operationalizing Forecasting Questions [1:51:34]

Daniel Filan: It kind of reminds me, one of the famous things in the forecasting community is the difficulty of writing questions.

Oliver Habryka: Yeah.

Daniel Filan: Recently I had one… So I think this isn’t well-known in the wider world, but in the German Catholic Church, there’s currently this thing called the ‘Synodaler Weg’, the Synodal Path. Where there’s this thing called a Synod, where a bunch of bishops meet. They’re not doing a Synod. They’re doing things that are sort of [like a Synod], but not exactly a Synod. They have a Synodal Council. And, I don’t know, a bunch of bishops and a bunch of members of the Central Committee of German Catholics are coming up with resolutions. Like, “We think the catechism of the Catholic Church is wrong about homosexuality.” Like, “We think women should be able to be priests.” I’m not sure if they highlight how they’re saying something different from the catechism, but they are. And so, at least within [the sub-portion] of the Catholic world that is very plugged into Catholic news, there’s some sense of, “Oh, is there going to be a schism? Are we going to have, like, Protestantism 2.0? Germany again.”

Oliver Habryka: *Laughs*

Daniel Filan: Like, “What’s wrong with these guys?” But, yeah. And so, [as an] observer passing by, I was kind of interested in writing a Metaculus question about this. And then you’re like, oh, how do you [decide what would count] as a thing being in schism, where you have to be robust to, you know, various people being polite about it [etc.]. I don’t know, I got a [forecasting] question that, I think, is an okay question, but...

Oliver Habryka: Yeah, [I made that update], kind of from Jacob Lagerros, who now works with me on Lightcone. [He] ran the AI Metaculus sub-forum thing for a while, together with Ben Goldhaber. And he would just work from my office, from the LessWrong office at the time. (*laughs*) So I would just see him every day hanging out with Ben Goldhaber for hours, being like, “Fuck. We just want to figure out whether GPUs are going to get cheaper, but it’s just *so hard* to figure out how to define what it means for GPUs to get cheaper. Because we don’t know *how* they will get cheaper. You can try to talk about average FLOPS, but this is just really hard, because the FLOPS stay the same, but then [the throughput] on the matrix operations increases, [from], like, four operations at the same time to eight operations at the same time. And this technically doubles all of your FLOPS, but almost no software can take advantage of it, because you just don’t usually have things in eight-wide vectors; that’s relatively rare. But for some applications you might.”

And then you just have all of this stuff that’s extremely hard to define. And so they took a number of different approaches, but I just, for almost a full year, saw people in my office every day being like, “How the fuck do we define this?”

Daniel Filan: Yeah.

Oliver Habryka: Overall, giving up. *Laughs* Jacob eventually was like, “This is too hard. I don’t know how to do this. I will go off and do other things, because I think this operationalization question [of] forecasting is just too hard to actually get useful labor out of.”
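
[A toy version of the metric problem being described, with entirely hypothetical numbers: widening the vector units from four to eight operations per cycle doubles the theoretical peak FLOPS, but the realized speedup depends on how much of a workload can actually use the wider units.]

```python
# Toy illustration of why "cheaper FLOPS" is hard to operationalize; all
# numbers are hypothetical. Doubling the vector width doubles the theoretical
# peak, but the realized speedup depends on how much of the workload can be
# packed into the wider vectors.

def peak_flops(cores: int, clock_hz: float, ops_per_cycle: int) -> float:
    return cores * clock_hz * ops_per_cycle

def realized_speedup(vectorizable_fraction: float) -> float:
    # Amdahl-style estimate: only the vectorizable part runs 2x faster on the
    # wider units; everything else runs at the old speed.
    return 1 / ((1 - vectorizable_fraction) + vectorizable_fraction / 2)

old_peak = peak_flops(cores=1000, clock_hz=1.5e9, ops_per_cycle=4)
new_peak = peak_flops(cores=1000, clock_hz=1.5e9, ops_per_cycle=8)
print(f"Theoretical peak: {new_peak / old_peak:.1f}x the old chip")

for frac in (0.2, 0.5, 0.9):
    print(f"Realized speedup if {frac:.0%} of the work vectorizes: "
          f"{realized_speedup(frac):.2f}x")
```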

Mechanism Design and UI [1:54:45]

Daniel Filan: So stepping back a bit. If I think about, I don’t know, [vaguely economic ways of thinking] about how to [get] people to relate to common projects better, one kind of community that I think of is the RadicalxChange people. This somewhat blossomed out of a book called Radical Markets, with different ideas for how to have a generalized tax on owning stuff, and different ways of voting, and some hope that this could kind of align everyone’s incentives to create a better world. I’m wondering, what do you think of that style of thinking?

Oliver Habryka: Yeah. [This] might be coming from [somewhat] of a perspective of my narrow world, but I talked to Jacob about this as well, who [was collaborating] with a number of people there during his Metaculus times. And his experience was definitely [like], they have *so* many good ideas in mechanism design, and *nobody* who understands UI design. And that is the central bottleneck of mechanism design.

He was just like, you have these people walking around [and] then you have this complicated five-step mechanism, and we just need people to input the data into this format. We just need people to assign probabilities to all of these hypotheses. And then you’re like, “But how are you going to do that?” And then you look at their UIs and [you] scream in horror. Like you would [try to use it] and you would obviously put in the wrong data because the format is far too complicated and you get no help [with] eliciting your actual beliefs and your actual preferences in this context.

Daniel Filan: Yeah.

Oliver Habryka: And generally, using almost anything in this space was super clunky. So in this space, I feel quite excited about stuff that’s taking a bunch of the ideas from mechanism design and just making it so that it doesn’t give you eye cancer when you try to look at it and use it.

I think that’s one of the places where [there’s] a bunch of interesting low-hanging fruit. And we’ve done this a bit on LessWrong. For example, currently our review runs on quadratic voting. And, like, all the other quadratic voting UIs kind of suck. And [I] think we’ve now simplified it in a way that isn’t *that* overwhelming, but actually has pretty good incentives and pretty good aggregation properties. But that was a lot of work. It was a lot more work than, like, understanding the blog post about quadratic voting.
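
[For reference, the textbook quadratic voting mechanism underneath that UI problem, not a description of LessWrong’s actual review implementation: casting n votes on an item costs n^2 credits out of a fixed budget, so expressing a strong preference on one item quickly crowds out votes on everything else.]

```python
# Textbook quadratic voting, not LessWrong's actual review code: n votes on an
# item cost n^2 credits from a fixed budget, and scores are summed across ballots.

def qv_cost(votes: dict[str, int]) -> int:
    return sum(v * v for v in votes.values())

def validate_ballot(votes: dict[str, int], budget: int) -> bool:
    return qv_cost(votes) <= budget

def tally(ballots: list[dict[str, int]]) -> dict[str, int]:
    totals: dict[str, int] = {}
    for ballot in ballots:
        for item, v in ballot.items():
            totals[item] = totals.get(item, 0) + v
    return totals

# Hypothetical ballot: 3 votes on post A cost 9 credits, 2 on post B cost 4,
# 1 against post C costs 1 -- 14 credits out of a 16-credit budget.
ballot = {"post_A": 3, "post_B": 2, "post_C": -1}
assert validate_ballot(ballot, budget=16)
print(tally([ballot, {"post_A": 1, "post_C": 2}]))
```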

Wrapping Up [1:57:05]

Daniel Filan: So, we’ve been talking for quite a while. So I want to wrap up around now.

Oliver Habryka: Seems good.

Daniel Filan: [If] people were interested in this conversation and they want to hear more of your thoughts about stuff, how can they do that?

Oliver Habryka: Oh, [you] can look at my profile on LessWrong and EA Forum where I tend to write quite a bit.

Daniel Filan: How can they, what’s the name of your profile?

Oliver Habryka: Habryka. Just my last name.

I also sometimes write on Twitter. My guess is, just because Twitter is terrible, my Twitter thoughts are the worst ones. Um, but they’re also the most edgy ones. So, you know, that’s the place if you want weird random edgy thoughts, because you kind of have to be edgy if you want to fit into 280 characters.

And then, [I mean, I guess a] good chunk of the people who listen to this thing are people who can find me in person somehow, at some point, at many of the places I will be, whether that’s EAG, or workshops I organize, or the Lightcone offices that people are visiting over the next two months, or whatever else we’re building over the next few months.

You can *also* message me on the LessWrong intercom, which historically has been quite successful. (*lowers voice*) At various points, um, my partner started messaging me on intercom because I would respond substantially faster there than I would on Messenger. Yeah, don’t abuse your power.

Also, these days it will mostly be [TIMESTAMP 1:58:32][unintelligible, person name]’s intercom. But if you want to chat with me about something, I’ve had fun conversations in my intercom chat support system. You can also message me via LessWrong DMs.

Daniel Filan: Alright. Well, thanks for talking today. And listeners, I hope you found this a useful episode.

Oliver Habryka: Thank you, Daniel.