Running Lightcone Infrastructure, which runs LessWrong. You can reach me at habryka@lesswrong.com
habryka (Oliver Habryka)
LessWrong’s (first) album: I Have Been A Good Bing
Shutting Down the Lightcone Offices
AI Timelines
Express interest in an “FHI of the West”
The LessWrong Team is now Lightcone Infrastructure, come work with us!
Launching Lightspeed Grants (Apply by July 6th)
Lightcone Infrastructure/LessWrong is looking for funding
The Lighthaven Campus is open for bookings
My tentative best guess on how EAs and Rationalists sometimes turn crazy
The post explicitly calls for thinking about how this situation is similar to what is happening/happened at Leverage, and I think that’s a good thing to do. That said, I do have specific evidence that makes me think that what happened at Leverage was pretty different from my experiences with CFAR/MIRI.
Like, in the last few days I’ve talked to a lot of people about stuff that happened at Leverage, and I do think that overall, the level of secrecy and paranoia about information leaks at Leverage seemed drastically higher than anywhere else in the community that I’ve seen. I feel like the post is trying to draw some parallel here that fails to land for me (though it’s also plausible it is pointing out a higher level of information control than I thought was present at MIRI/CFAR).
I have also had my disagreements with MIRI being more secretive, and think that secrecy comes with a high cost that has been underestimated by at least some of the leadership. But I haven’t heard of people being “quarantined from their friends” because they attracted some “set of demons/bad objects that might infect others when they come into contact with them”, which feels to me like a different level of social isolation, and was part of what happened at Leverage near the end; I’ve never heard of anything even remotely like this happening at MIRI or CFAR.
To be clear, I think this kind of purity dynamic is also present in other contexts, like high-class/low-class dynamics and various other common problematic social dynamics, but I haven’t seen anything else that resulted in as much social isolation and alienation. It seemed straightforwardly very harmful to me, and more harmful than anything comparable I’ve seen in the rest of the community (though not more harmful than what I have heard from some people about e.g. working at Apple or the U.S. military, which seem to have very similarly strict procedures and a number of quite bad associated pathologies).
The other big thing that feels important in distinguishing what happened at Leverage from the rest of the community is the degree of institutional and conscious optimization that went into PR control.
Like, I think Ben Hoffman’s point about “Blatant lies are the best kind!” is pretty valid, and I do think that other parts of the community (including organizations like CEA and to some degree CFAR) have engaged in PR control in various harmful but less legible ways. But there is something additionally mindkilly and gaslighty about straightforwardly lying, or directly threatening adversarial action to prevent people from speaking ill of someone, in the way Leverage has. I always felt that the rest of the rationality community had a very large and substantial dedication to being very clear about when they denotatively vs. connotatively disagree with something, and a very deep and almost religious respect for the literal truth (see e.g. a lot of Eliezer’s stuff around the wizard’s code and meta-honesty), and I think the lack of that made a lot of the dynamics around Leverage quite a bit worse.
I also think it makes understanding the extent of the harm, and ways to improve it, a lot more difficult. I think the number of people who have been hurt by various things Leverage has done is vastly larger than the number of people who have spoken out so far, in a ratio that I believe is very different from the rest of the community. As a concrete example, I have a large number of negative Leverage experiences from between 2015 and 2017 that I never wrote up, due to various complicated adversarial dynamics surrounding Leverage and CEA (as well as various NDAs and legal threats, made by both Leverage and CEA, not leveled at me, but leveled at enough people around me that I thought I might cause someone serious legal trouble if I repeated in a more public setting a thing I heard somewhere), and I feel pretty confident that I would feel very differently if I had had similarly bad experiences with CFAR or MIRI, based on my interactions with both of those organizations.
This kind of information control is what ultimately flips things into the negative for me in this situation with Leverage. Like, I am overall pretty in favor of people gathering together and working on a really intense project, investing really hard into some hypothesis that they have some special sauce that allows them to do something really hard and important that nobody else can do. I am also quite in favor of people doing a lot of introspection and weird psychology experiments on themselves, and trying their best to handle the vulnerability that comes with doing that near other people, even though there is a chance things will go badly and people will get hurt.
But the thing that feels really crucial in all of this is that people can stay well-informed, can get the space they need to disengage, can get an external perspective when necessary, and can somehow stay grounded throughout this process. That is much harder to do in an environment where people are directly lying to you, or where people are making quite explicit plots to discredit you or otherwise harm you if you leave the group or leak information.
I do notice that in the above I make various accusations of lying or deception by Leverage without really backing them up with specific evidence, which I apologize for, and I think people reading this should overall not take comments like mine at face value before having heard something pretty specific that backs up the accusations in them. I have various concrete examples I could give, but doing so would violate various implicit and explicit confidentiality agreements I made, which I wish I had not made. I am still figuring out whether I can somehow extract and share the relevant details without violating those agreements in any substantial way, or whether it might be better to break the implicit ones (which seem less costly to break, given that I felt like I didn’t really fully consent to them), given the ongoing pretty high cost.
The Death of Behavioral Economics
Integrity and accountability are core parts of rationality
I feel like one really major component that is missing from the story above, in particular from a number of the psychotic breaks, is Michael Vassar and a bunch of the people he tends to hang out with. I don’t have a ton of detail on exactly what happened in each of the cases where someone seemed to have a really bad time, but having looked into each case for a few hours, I think all three of them happened in pretty close proximity to the person having spent a bunch of time with Michael (in some of the cases after taking psychedelic drugs).
I think this is important because Michael has a very large psychological effect on people, has some bad tendencies to severely outgroup people who are not part of his very local social group, and has some history of very viciously attacking outsiders who behave in ways he doesn’t like, including making quite a lot of very concrete threats (things like “I hope you will be guillotined, and the social justice community will find you and track you down and destroy your life, after I do everything I can to send them onto you”). I personally have found those threats to very drastically increase the stress I experience from interfacing with Michael (and some others in his social group), and my models of how these kinds of things happen also have a lot to do with dynamics where this kind of punishment is expected if you deviate from the group norm.
I am not totally confident that Michael has played a big role in all of the bad psychotic experiences listed above, but my current best guess is that he has, and I do indeed pretty directly encourage people to not spend a lot of time with Michael (though I do think talking to him occasionally is actually great, and I have learned a lot of useful things from talking to him, and also think he has helped me see various forms of corruption and bad behavior in my environment that I am genuinely grateful to have noticed; but I very strongly predict that I would have an intensely bad experience if I were to spend more time around Michael, in a way I would not endorse in the long run).
Integrity in AI Governance and Advocacy
How to (hopefully ethically) make money off of AGI
RLHF is just not that important to the bottom line right now. Imitation learning works nearly as well, other hacky techniques can do quite a lot to fix obvious problems, and the whole issue is mostly second order for the current bottom line.
I am very confused why you think this, right after the success of ChatGPT, where approximately the only difference from GPT-3 was the presence of RLHF.
My current best guess is that ChatGPT alone, via sparking an arms race between Google and Microsoft, and by increasing OpenAI’s valuation, should be modeled as the equivalent of something on the order of $10B of investment into AI capabilities research, completely in addition to the gains from GPT-3.
And my guess is most of that success is attributable to the work on RLHF, since that was really the only substantial difference between ChatGPT and GPT-3. We also should not think this was overdetermined: 1.5 years passed between the release of GPT-3 and the release of ChatGPT (with some updates to GPT-3 in the meantime, but my guess is no major ones), and no other research lab focused on capabilities had set up its own RLHF pipeline (except Anthropic, which I don’t think makes sense to use as a datapoint here, since it consists in substantial part of the same employees).
I have been trying to engage with the actual details here. Indeed, I have had a bunch of arguments with people over the last two years where, based on those details, I was explicitly saying that RLHF is pushing on commercialization bottlenecks, and people believing this was not the case was the primary crux in those conversations on whether RLHF was good or bad.
The crux was importantly not that other people would do the same work anyway, since people at the same time also argued that their work on RLHF was counterfactually relevant and that it was pretty plausible or likely that the work would otherwise not happen. I’ve had a few of these conversations with you as well (though in aggregate not a lot), and your take at the time was (IIRC) that it seemed quite unlikely that RLHF would have as big of an effect as it did have in the case of ChatGPT (mostly via an efficiency argument that if that were the case, more capabilities-oriented people would work on it, and since they weren’t, it likely wasn’t a commercialization bottleneck). So I do feel a bit like I want to call you out on that, though I might also be misremembering the details (some of this was online, so it might be worth going back through our comment histories).
I think this comment is overstating the case for policymakers and the electorate actually believing that investing in AI is good for the world. I think the answer currently is “we don’t know what policymakers and the electorate actually want in relation to AI”, as well as “the relationship of policymakers and the electorate to AI is in the middle of shifting quite rapidly, so past actions are not that predictive of future actions”.
I really only have anecdata to go on (though I don’t think anyone has much better), but my sense from doing informal polls of e.g. Uber drivers and people on Twitter, and from perusing a bunch of subreddits (which, to be clear, is a terrible sample), is that a pretty substantial fraction of the world is now quite afraid of the consequences of AI, both in a “this change is happening far too quickly and we would like it to slow down” sense and in a “yeah, I am actually worried about killer robots killing everyone” sense. I think both of these positions are quite compatible with pushing for a broad slowdown. There is also a broad and growing “anti-tech” movement interested in giving fewer resources to the tech sector, whose aims are, at least for a long while, compatible with slowing down AGI progress.
My current guess is that policies primarily aimed at slowing down and/or heavily regulating AI research are actually pretty popular among the electorate, and I also expect them to be reasonably popular among policymakers, though I expect policymakers’ preferences to lag behind the electorate for a while. But again, I really think we don’t know, and nobody has run even basic surveys on the topic yet.
Edit: Inspired by this topic/discussion, I ended up doing some quick google searches for AI opinion polls. I didn’t find anything great, but this Pew report has some stuff that’s pretty congruent with potential widespread support for AI regulation: https://www.pewresearch.org/internet/2022/03/17/how-americans-think-about-artificial-intelligence/
I don’t have all the context of Ben’s investigation here, but as someone who has done investigations like this in the past, here are some thoughts on why I don’t feel super sympathetic to requests to delay publication:
In this case, it seems to me that there is a large and substantial threat of retaliation. My guess is Ben’s sources were worried about Emerson hiring stalkers, calling their families, trying to get them fired from their jobs, or threatening legal action. Having things be out in public can provide a defense, because it is much easier to ask for help if the conflict happens in the open.
As a concrete example, Emerson has just sent me an email saying:
For the record, the threat of a libel suit and the use of statements like “maximum damages permitted by law” seem to me to be attempts at intimidation. Also, as someone who has looked quite a lot into libel law (having been threatened with libel suits many times over the years), describing the legal case as “unambiguous” seems inaccurate and a further attempt at intimidation.
My guess is Ben’s sources have also received dozens of calls (I have received many in the last few hours), and I wouldn’t be surprised to hear that Emerson has called up my board, or that he would try to find some other piece of leverage against Lightcone, Ben, or Ben’s sources if he had more time. While I am not that worried about Emerson, I think many other people are in a much more vulnerable position, and I can really resonate with not wanting to give someone an opportunity to gather their forces (in which case I think it’s reasonable to force the conflict out in the open, which is far from an ideal arena, but does provide protection against many types of threats and adversarial action).
Separately, the time investment for things like this is really quite enormous, and I have found it extremely hard to do work of this type in parallel with other kinds of work, especially towards the end of a project like this, when the information is ready for sharing and lots of people have strong opinions and try to pressure you in various ways. Delaying by “just a week” probably translates into roughly 40 hours of productive time lost, even if there isn’t much to do, because it’s so hard to focus on other things. That’s just a lot of additional time, and so it’s not actually a very cheap ask.
Lastly, I have also found that the standard way abuse in the extended EA community has successfully been kept from being discovered is by forcing everyone who wants to publicize or share any information about it to jump through a large number of hoops. Calls to “just wait a week” and “just run your posts by the party you are criticizing” might sound reasonable in isolation, but they very quickly multiply the cost of any information sharing, and have huge chilling effects that prevent the publishing of most information and accusations. Asking the other party to just keep doing more due diligence is easy, it usually works, and it keeps most people away from doing investigations like this.
As I have written about before, I myself ended up being intimidated by this for the case of FTX and chose not to share my concerns about FTX more widely, which I continue to consider one of the worst mistakes of my career.
My current guess is that if Emerson and Kat do have clear proof that a lot of the information in this post is false, then they should share it publicly, maybe on their own blog, or here on LessWrong or the EA Forum. Rumors about people having had very bad experiences working with Nonlinear are already circulating around the community and are already having a large effect on Nonlinear, so having clear, concrete accusations to respond to should help them clear their name, if the accusations are indeed false.
I agree that this kind of post can be costly, and I don’t want to ignore the potential costs of false accusations, but at least to me it seems like I want an equilibrium with substantially more information sharing, more trust in people’s ability to update their models of what is going on, and less of the paternalistic “people are incapable of updating even if we present proof that the accusations are false” attitude, especially given what happened with FTX and the costs we have observed from failing to share observations like this.
A final point that feels a bit harder to communicate is that, in my experience, some people are just really good at manipulation, throwing you off-balance, and distorting your view of reality, and this is a strong reason not to commit to running everything by the people you are sharing information about. A common theme I remember hearing from people who had concerns about SBF is that they intended to warn other people or share information, then talked to SBF, and somehow during that conversation he disarmed them without really responding to the essence of their concerns. This can take the form of threats and intimidation, or of just being really charismatic and making you forget what your concerns were, or of more deeply ripping away your grounding and making you think that your concerns aren’t real, that actually everyone is doing the thing that seems wrong to you, and that you are going to out yourself as naive and gullible by sharing your perspective.
[Edit: The closest post we have to setting norms on when to share information with orgs you are criticizing is Jeff Kaufman’s post on the matter. While I don’t fully agree with the reasoning within it, in there he says:
This case seems to me to be fairly clearly covered by the second paragraph. Also, Nonlinear’s response to “I am happy to discuss your concerns publicly in the comments” was “I will sue you if you publish these concerns”, to which IMO the reasonable response is to just go ahead and publish before things escalate further. Separately, my sense is Ben’s sources really didn’t want any further interaction and really preferred having this over with, which I resonate with, and which is also explicitly covered by Jeff’s post.
So inasmuch as you are trying to enforce some kind of existing norm that demands running posts like this by the org, I don’t think that norm currently has widespread buy-in, as the most popular and widely-quoted post on the topic does not demand that standard (I separately think that post is still slightly too much in favor of running posts by the organizations being criticized, but that’s a different debate).]