Someone on the EA forum asked why I’ve updated away from public outreach as a valuable strategy. My response:
I used to not actually believe in heavy-tailed impact. On some gut level I thought that early rationalists (and to a lesser extent EAs) had “gotten lucky” in being way more right than academic consensus about AI progress. I also implicitly believed that e.g. Thiel and Musk and so on kept getting lucky, because I didn’t want to picture a world in which they were actually just skillful enough to keep succeeding (due to various psychological blockers).
Now, thanks to dealing with a bunch of those blockers, I have internalized to a much greater extent that you can actually be good not just lucky. This means that I’m no longer interested in strategies that involve recruiting a whole bunch of people and hoping something good comes out of it. Instead I am trying to target outreach precisely to the very best people, without compromising much.
Relatedly, I’ve updated that the very best thinkers in this space are still disproportionately the people who were around very early. The people you need to soften/moderate your message to reach (or who need social proof in order to get involved) are seldom going to be the ones who can think clearly about this stuff. And we are very bottlenecked on high-quality thinking.
(My past self needed a lot of social proof to get involved in AI safety in the first place, but I also “got lucky” in the sense of being exposed to enough world-class people that I was able to update my mental models a lot—e.g. watching the OpenAI board coup close up, various conversations with OpenAI cofounders, etc. This doesn’t seem very replicable—though I’m trying to convey a bunch of the models I’ve gained on my blog, e.g. in this post.)
The people you need to soften/moderate your message to reach (or who need social proof in order to get involved) are seldom going to be the ones who can think clearly about this stuff.
I strongly agree with this. (I wrote a post about it years ago.[1])
Even among the people who were not “in early”, none of the ones I most respect, and who seem to me to be doing the most impressive work that I’m most grateful to have in the world, needed hand-holding or “outreach” to get on board.
Writing the Sequences was an amazing, high-quality intervention that continues to pay dividends to this day. I think writing on the internet about the things you think are important is a fantastic strategy, at least if your intellectual taste is good.
The payoff of most of the “movement building” and “community building” seems much murkier to me. At least some of it was clearly positive, but I don’t know if it was positive on net (I think a smaller and more intense EA than the one we have in practice probably would have been better).
There’s selection bias in the kinds of community building I observed, but it seems to me that community building was more effective to the extent that it was “just get together and try to do the thing” rather than “try to do outreach to get other people on board”.
E.g.: the best MIRIx groups > CFAR workshops[2] > EAGs > EA onboarding programs at universities.
The HP:MoR wrap parties seem to have been pretty notably impactful, though, and those were closer to outreach.
I keep thinking that I should crosspost this to LessWrong and the EA forum, but haven’t yet, since I need to rename it well.
If you, dear reader, think that I really should do that, bugging me about it seems likely to make it more likely to happen.
To be clear, CFAR workshops were always community building interventions, and fell far short of the standard that I would expect of a group working to seriously develop a science of human rationality, but they were still much more “contentful” and about making progress than most community building interventions are.
Your post is great, I encourage you to repost it.
I agree with the main generator of this post (a small number of people produce a wildly disproportionate amount of the intellectual progress on hard problems) and one of the conclusions (don’t water down your messages at all; people who need watered-down messages are unlikely to be helpful), but I think there’s significant value in trying to communicate the hard problem of alignment broadly anyway, because:
Filtering for the best people is expensive and error-prone, so if you don’t put the correct models into general circulation, even pretty great people might simply never become aware of them.
People who are highly competent but not highly confident often seem to run into people who have been misinformed, and become less sure of their own positions; having the main threat models in wider circulation would help those people get less distracted.
Planting lots of seeds can be relatively cheap.
Also, a related anecdote: I ran ~8 retreats at my house covering around 60 people in 2022/23. I got a decent read on how much of the core stack of alignment concepts at least half of them had, and how often they made hopeful mistakes which were transparently going to fail because they hadn’t picked up the core ideas from Arbital or understood the top ~10 alignment-related concepts clearly. Only two of them cleared this bar.
Also, relatedly, the people you left BlueDot to seem not to be reliably teaching people the core things they need to learn. They are friendly and receptive each time I get on calls with them and ask them to fix their courses, and often do fix some of the stuff, but some of the core generators look to me like they’re just missing from the people picking the course materials, and lots of people are getting watered-down versions of alignment as a result. Consider skimming through their courses and advising them on learning objectives etc.; you’re probably the best-placed person to do this.
Being able to take future AI seriously as a risk seems to be highly correlated with being able to take COVID seriously as a risk in February 2020.
The key skill here may be as simple as being able to selectively turn off normalcy bias in the face of highly unusual news.
A closely related “skill” may be a certain general pessimism about future events, the sort of thing economists jokingly describe as “correctly predicting 12 out of the last 6 recessions.”
That said, mass public action can be valuable. It’s a notoriously blunt tool, though. As one person put it, “if you want to coordinate more than 5,000 people, your message can be about 5 words long.” And the public will act anyway, in some direction. So if there’s something you want the public to do, it can be worth organizing and working on communication strategies.
My political platform is, if you boil it down far enough, about 3 words long: “Don’t build SkyNet.” (As early as the late 90s, I joked about having a personal 11th commandment: “Thou shalt not build SkyNet.” One of my career options at that point was to potentially work on early semi-autonomous robotic weapon platform prototypes, so this was actually relevant moral advice.)
But I strongly suspect that once the public believes that companies might truly build SkyNet, the reaction will be “What the actual fuck?”, followed by widespread public backlash. I expect lesser but still serious backlash if AI agents ever advance beyond the current “clever white-collar intern” level of competence and start taking over jobs en masse.
The main limits of public action are that (1) public action is a blunt tool, and (2) the public needs to actually believe in an imminent risk. Right now AI risk mostly gets filed under “I hate AI slop” and “it’s a fun hypothetical bull session, with little impact on my life.” Once people actually start to take AI seriously, you will often see strong negative attitudes even from non-technical people.
Of course, majorities of 60-80% of the population want lots of things that the US political system doesn’t give them. So organizing the public isn’t sufficient by itself, especially if your timelines are short. But if you assume a significant chance that timelines are closer to (say) 2035 than 2027, then some kinds of public outreach might be valuable, especially if the public starts to believe. This can create significant pressure on legislative coalitions and executive leadership. But it’s all pretty hit-or-miss; luck would play a major role.
Part of the subtext here is that the very best people (on the relevant dimensions) will naturally run into LessWrong, x-risk, etc., such that “outreach” (in the sense of university organizing, advertising, etc.) isn’t that valuable on the current margin.
For “the very best”, doing high-quality research is often the best “outreach”.
What about persuading politicians that AI safety is a cause that will win them votes? That requires very broad-spectrum outreach to get as many ordinary people on board as possible.
This seems to assume that the quality of labor from a small, highly selected number of researchers can be more important than a much larger amount of somewhat lower-quality labor from a much larger number of participants. That seems like a pretty dubious assumption, especially given that other strategies seem possible: e.g. using a larger pool of participants to produce more easily verifiable, more prosaic AI safety research now, even at the risk of lower quality, so as to allow for better alignment and control of the kinds of AI models which will, in the future, for the first time be able to automate the higher-quality and maybe less verifiable (e.g. conceptual) research that fewer people might be able to produce today. Put more briefly: quantity can have a quality of its own, especially in more verifiable research domains.
Some of the claims around the quality of early rationalist / EA work also seem pretty dubious. E.g. a lot of the Yudkowsky-and-friends worldview is looking wildly overconfident and likely wrong.