Hey Rob, thanks for writing this, and sorry for the slow response. In brief, I think you do misunderstand my views, in ways that Buck, Ryan and Habryka point out. I’ll clarify a little more.
Some areas where the criticism seems reasonable:
I think it’s fair to say that I worded the compute governance sentence poorly, in ways Habryka clarified.
I’m somewhat sympathetic to the criticism that there was a “missing mood” (cf e.g. here and here), given that a lot of people won’t know my broader views. I’m very happy to say: “I definitely think it will be extremely valuable to have the option to slow down AI development in the future,” as well as “the current situation is f-ing crazy”. (Though there was also a further vibe on Twitter of “we should be uniting rather than disagreeing”, which I think is a bad road to go down.)
Now, clarifying my position:
Here’s what I take IABI to be arguing (written by GPT5-Pro, on the basis of a pdf, in an attempt not to infuse my biases):
The book argues that building a superhuman AI would be predictably fatal for humanity and therefore urges an immediate, globally enforced halt to AI escalation—consolidating and monitoring compute under treaty, outlawing capability‑enabling research, and, if necessary, neutralizing rogue datacenters—while mobilizing journalists and ordinary citizens to press leaders to act.
And what readers will think the book is about (again written by GPT5-Pro):
A “shut‑it‑all‑down‑now” manifesto warning that any superintelligent AI will wipe us out unless governments ban frontier AI and are prepared to sabotage or bomb rogue datacenters—so the public and the press must demand it.
The core message of the book is not merely “AI x-risk is worryingly high” or “stopping or slowing AI development would be one good strategy among many.” I wouldn’t disagree with the former at all, and my disagreement with the latter would be more about the details.
Here’s a different perspective:
AI takeover x-risk is high, but not extremely high (e.g. 1%-40%). The right response is an “everything and the kitchen sink” approach — there are loads of things we can do that all help a bit in expectation (both technical and governance, including mechanisms to slow the intelligence explosion), many of which are easy wins, and right now we should be pushing on most of them.
This is my overall strategic picture. If the book had argued for that (or even just the “kitchen sink” approach part) then I might have disagreed with the arguments, but I wouldn’t feel, “man, people will come away from this with a bad strategic picture”.
(I think the whole strategic picture would include:
There are a lot of other existential-level challenges, too (including human coups / concentration of power), and ideally the best strategies for reducing AI takeover risk shouldn’t aggravate these other risks.
But I think that’s fine not to discuss in a book focused on AI takeover risk.)
This is also the broad strategic picture, as I understand it, of e.g. Carl, Paul, Ryan, Buck. It’s true that I’m more optimistic than they are (on the 80k podcast I say 1-10% range for AI x-risk, though it depends on what exactly you mean by that) but I don’t feel deep worldview disagreement with them.
With that in mind, some reasons why I think the promotion of the Y&S view could be meaningfully bad:
If it means more people don’t pursue the better strategy of focusing on the easier wins.
Or they end up making the wrong tradeoffs (e.g. intense centralisation of AI development in a way that makes misaligned human takeover risk more likely).
Or people might lapse into defeatism: “Ok we’re doomed, then: a decades-long international ban will never happen, so it’s pointless to work on AI x-risk.” (We already see this reaction to climate change, given doomerist messaging there. To be clear, I don’t think that sort of effect should be a reason for being misleading about one’s views.)
Overall, I feel pretty agnostic on whether Y&S shouting their message is on net good for the world.
I think I’m particularly triggered by all this because of a conversation I had last year with someone who takes AI takeover risk very seriously and could double AI safety philanthropy if they wanted to. I was arguing they should start funding AI safety, but the conversation was a total misfire because they conflated “AI safety” with “stop AI development”: their view was that that will never happen, and they were actively annoyed that they were hearing what they considered to be such a dumb idea. My guess was that EY’s TIME article was a big factor there.
Then, just to be clear, here are some cases where you misunderstand me, focusing on the most severe misunderstandings:
he’s more or less calling on governments to sit back and let it happen
I really don’t think that!
He thinks feedback loops like “AIs do AI capabilities research” won’t accelerate us too much first.
I vibeswise disagree, because I expect massive acceleration and I think that’s *the* key challenge: See e.g. PrepIE, 80k podcast.
But there is a grain of truth in that my best guess is a more muted software-only intelligence explosion than some others predict. E.g. a best guess where, once AI fully automates AI R&D, we get 3-5 years of progress in 1 year (at current rates), rather than 10+ years’ worth, or rather than godlike superintelligence. This is the best analysis I know of on the topic. This might well be the cause of much of the difference in optimism between me and e.g. Carl.
(Note I still take the much larger software explosions very seriously (e.g. 10%-20% probability). And I could totally change my mind on this — the issue feels very live and open to me.)
Will thinks government compute monitoring is a bad idea
Definitely disagree with this one! In general, society having more options and levers just seems great to me.
he’s sufficiently optimistic that the people who build superintelligence will wield that enormous power wisely and well, and won’t fall into any traps that fuck up the future
Definitely disagree!
Like, my whole bag is that I expect us to fuck up the future even if alignment is fine!! (e.g. Better Futures)
He’s proposing that humanity put all of its eggs in this one basket
Definitely disagree! From my POV, it’s the IABI perspective that is closer to putting all the eggs in one basket, rather than advocating for the kitchen sink approach.
It seems hard to be more than 90% confident in the whole conjunction, in which case there’s a double-digit chance that the everyone-races-to-build-superintelligence plan brings the world to ruin.
But “10% chance of ruin” is not what EY&NS, or the book, is arguing for, and isn’t what I was arguing against. (You could logically have the view of “10% chance of ruin and the only viable way to bring that down is a global moratorium”, but I don’t know anyone who has that view.)
a conclusion like “things will be totally fine as long as AI capabilities trendlines don’t change.”
Also not true, though I am more optimistic than many on the takeover side of things.
to advocate that we race to build it as fast as possible
Also not true—e.g. I write here about the need to slow the intelligence explosion.
There’s a grain of truth in that I’m pretty agnostic on whether speeding up or slowing down AI development right now is good or bad. I flip-flop on it, but I currently lean towards thinking that speeding up at the moment is mildly good, for a few reasons: it stretches out the IE by bringing it forwards; it means there’s more of a compute constraint, so the software-only IE doesn’t go as far; and it means society wakes up earlier, giving more time to invest in alignment of more-powerful AI.
(I think if we’d gotten to human-level algorithmic efficiency at the Dartmouth conference, that would have been good, as compute build-out is intrinsically slower and more controllable than software progress (until we get nanotech). And if we’d scaled up compute + AI to 10% of the global economy decades ago, and maintained it at that level, that also would have been good, as then the frontier pace would be at the rate of compute-constrained algorithmic progress, rather than the rate we’re getting at the moment from both algorithmic progress AND compute scale-up.)
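(To make the compounding point concrete, here is a minimal sketch with purely made-up annual multipliers; these are illustrative assumptions to show the structure of the claim, not numbers I’d defend:)

```python
# Purely illustrative, made-up multipliers; not estimates from the discussion above.
software_gain = 3.0   # hypothetical effective-compute multiplier per year from algorithmic progress
hardware_gain = 2.5   # hypothetical effective-compute multiplier per year from compute scale-up

current_frontier_pace = software_gain * hardware_gain  # both sources compound: 7.5x per year
compute_constrained_pace = software_gain               # compute held fixed: 3.0x per year

print(f"current pace: {current_frontier_pace:.1f}x effective compute per year")
print(f"compute-constrained pace: {compute_constrained_pace:.1f}x effective compute per year")
```

With compute already saturated, only the software term would compound, which is the sense in which that counterfactual frontier pace would be slower and more controllable.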
In general, I think that how the IE happens and is governed is a much bigger deal than when it happens.
like, I still associate Will to some degree with the past version of himself who was mostly unconcerned about near-term catastrophes and thought EA’s mission should be to slowly nudge long-term social trends.
Which Will-version are you thinking of? Even in DGB I wrote about preventing near-term catastrophes as a top cause area.
I think Will was being unvirtuously cagey or spin-y about his views
Again really not intended! I think I’ve been clear about my views elsewhere (see previous links).
Ok, that’s all just spelling out my views. Going back, briefly, to the review. I said I was “disappointed” in the book — that was mainly because I thought that this was E&Y’s chance to give the strongest version of their arguments (though I understood they’d be simplified or streamlined), and the arguments I read were worse than I expected (even though I didn’t expect to find them terribly convincing).
Regarding your object-level responses to my arguments — I don’t think any of them really support the idea that alignment is so hard that AI takeover x-risk is overwhelmingly likely, or that the only viable response is to delay AI development by decades. E.g.
As Joe Collman notes, a common straw version of the If Anyone Builds It, Everyone Dies thesis is that “existing AIs are so dissimilar” to a superintelligence that “any work we do now is irrelevant,” when the actual view is that it’s insufficient, not irrelevant.
But if it’s a matter of “insufficiency”, the question is how one can be so confident that any work we do now (including with ~AGI assistance, including if we’ve bought extra time via control measures and/or deals with misaligned ~AGIs) is insufficient, such that the only thing that makes a meaningful difference to x-risk, even in expectation, is a global moratorium. And I’m still not seeing the case for that.
(I think I’m unlikely to respond further, but thanks again for the engagement.)
In general, I think that how the IE happens and is governed is a much bigger deal than when it happens.
(I don’t have much hope in trying to actually litigate any of this, but:)
Bro. It’s not governed, and if it happens any time soon it won’t be aligned. That’s the whole point.
The right response is an “everything and the kitchen sink” approach — there are loads of things we can do that all help a bit in expectation (both technical and governance, including mechanisms to slow the intelligence explosion), many of which are easy wins, and right now we should be pushing on most of them.
How do these small kitchen sinks add up to pushing back AGI by, say, several decades? Or add up to making an AGI that doesn’t kill everyone? My super-gloss of the convo is:
IABIED: We’re plummeting toward AGI at an unknown rate and distance; we should stop that; to stop that we’d have to do this really big hard thing; so we should do that.
You: Instead, we should do smaller things. And you’re distracting people from doing smaller things.
Is that right? Why isn’t “propose to the public a plan that would actually work” one of your small things?
Hi Will, one of your core arguments against IABIED was that we can test the models in a wide variety of environments or distributions. I wrote some thoughts on why I think we can’t test them in the environments that matter:
https://www.lesswrong.com/posts/ke24kxhSzfX2ycy57/simon-lermen-s-shortform?commentId=hJnqec5AFjKDmrtsG
I think I’m particularly triggered by all this because of a conversation I had last year with someone who takes AI takeover risk very seriously and could double AI safety philanthropy if they wanted to. I was arguing they should start funding AI safety, but the conversation was a total misfire because they conflated “AI safety” with “stop AI development”: their view was that that will never happen, and they were actively annoyed that they were hearing what they considered to be such a dumb idea. My guess was that EY’s TIME article was a big factor there.
Mandatory check: was this billionaire a sociopath who made their money unethically or illegally (perhaps through crypto), like the last time you persuaded someone in this position to put tons of their philanthropy into AI safety?
(Perhaps you can show that they weren’t, but given your atrocious track record these datapoints shouldn’t really be taken seriously without double-checking.)
(As a suggestion, you could DM the name to me or anyone in this thread and have them report back their impression of whether the person is a crook or obviously unethical, without releasing the identity widely.)