Zach Robinson, relevant because he's on the Anthropic LTBT and for other reasons, tweets:
"If Anyone Builds It, Everyone Dies" by @ESYudkowsky and @So8res is getting a lot of attention this week. As someone who leads an org working to reduce existential risks, I'm grateful they're pushing AI safety mainstream. But I think they're wrong about doom being inevitable. 🧵
Don't get me wrong: I take AI existential risk seriously. But presenting doom as a foregone conclusion isn't helpful for solving the problem.
In 2022, superforecasters and AI researchers estimated the probability of existential catastrophic risk from AI by 2100 at around 0.4%-3%. A recent study found no correlation between near-term accuracy and long-term forecasts. TL;DR: predicting the future is really hard.
That doesn't mean we should throw the existential risk baby out with the "Everyone Dies" bathwater. Most of us wouldn't be willing to risk a 3% chance (or even a 0.3% chance!) of the people we love dying.
But accepting uncertainty matters for navigating this complex challenge thoughtfully.
Accepting uncertainty matters for two big reasons.
First, it leaves room for AI's transformative benefits. Tech has doubled life expectancy, slashed extreme poverty, and eliminated diseases over the past two centuries. AI could accelerate these trends dramatically.
But Yudkowsky and Soares dismiss these possibilities as "beautiful dreams." If we're certain AI will kill us all, then all potential benefits get rounded to zero.
Second, focusing exclusively on extinction scenarios blinds us to other serious AI risks: authoritarian power grabs, democratic disruption through misinformation, mass surveillance, economic displacement, new forms of inequity. These deserve attention too.
People inspired by @EffectvAltruism have made real progress on AI safety while also mapping varied futures. I'm personally encouraged by recent work from @willmacaskill and others at @forethought_org on putting society on a path toward flourishing with transformative AI.
This includes questions about AI consciousness and welfare: how should we treat AI systems that might themselves suffer or flourish? These questions sound abstract but may become practical as AI systems become more sophisticated.
The policy debate surrounding AI development has become a false choice: stop all AI development (to prevent doom) vs. speed it up (for rewards).
I grew up participating in debate, so I know the importance of confidence. But there are two types: epistemic (based on evidence) and social (based on delivery). Despite being expressed in self-assured language, the evidence for imminent existential risk is far from airtight.
My take is this: We should take AI risk seriously, with all its uncertainties, and work hard to bend development toward better outcomes.
On the object level, I think Zach is massively underrating AI takeover risk, and I think that his reference to the benefits of AI misses the point.
On the meta level, I think Zach's opinions are relevant (and IMO concerning) for people who are relying on Zach to ensure that Anthropic makes good choices about AI risks. I don't think the perspective articulated in these tweets is consistent with him doing a good job there (though maybe this was just poor phrasing on his part, and his opinions are more reasonable than this).
Just to help people understand the context: The book really doesn't say that doom is inevitable. It goes out of its way like 4 times to say the opposite. I really don't have a good explanation of Zach's comment that doesn't involve him not having read the book, and nevertheless making a tweet thread about it with a confidently wrong take. IMO the above really reads to me as if he workshopped some random LinkedIn-ish platitudes about the book to seem like a moderate and be popular on social media, without having engaged with the substance at all.
The book certainly claims that doom is not inevitable, but it does claim that doom is ~inevitable if anyone builds ASI using anything remotely like the current methods.
I understand Zach (and other "moderates") as saying no: even conditioned on basically YOLO-ing the current paradigm to superintelligence, it's really uncertain (and less likely than not) that the resulting ASI would kill everyone.
I disagree with this position, but if I held it, I would be saying somewhat similar things to Zach (even having read the book).
Though I agree that engaging on the object level (beyond "predictions are hard") would be good.
My guess is that they're doing the motte-and-bailey of "make it seem to people who haven't read the book that it says that ASI extinction is inevitable, that the book is just spreading doom and gloom", from which, if challenged, they could retreat to "no, I meant doom isn't inevitable even if we do build ASI using the current methods".
Like, if someone means the latter (and has also read the book and knows that it goes to great lengths to clarify that we can avoid extinction), would they really phrase it as "doom is inevitable", as opposed to e.g. "safe ASI is impossible"?
Or maybe they haven't put that much thought into it and are just sloppy with language.
Eliezer did write Death with Dignity, which seems to assert that doom is inevitable, so the book not making that case is a meaningful step.
I disagree with this position, but if I held it, I would be saying somewhat similar things to Zach (even having read the book).
I wouldn't. I roughly agree with Zach's background position (i.e. I'm quite uncertain about the likelihood of extinction conditional on YOLO-ing the current paradigm*) but I still think his conclusions are wild. Quoting Zach:
First, it leaves room for AI's transformative benefits. Tech has doubled life expectancy, slashed extreme poverty, and eliminated diseases over the past two centuries. AI could accelerate these trends dramatically.
The tradeoff isn't between solving scarcity at a high risk of extinction vs. never getting either of those things. It's between solving scarcity now at a high risk of extinction, vs. solving scarcity later at a much lower risk.
Second, focusing exclusively on extinction scenarios blinds us to other serious AI risks: authoritarian power grabs, democratic disruption through misinformation, mass surveillance, economic displacement, new forms of inequity. These deserve attention too.
Slowing down / pausing AI development gives us more time to work on all of those problems. Racing to build ASI means not only are we risking extinction from misalignment, but we're also facing a high risk of outcomes such as, for example, ASI being developed so quickly that governments don't have time to get a handle on what's happening and we end up with Sam Altman as permanent world dictator. (I don't think that particular outcome is that likely, it's just an example.)
*although I think my conditional P(doom) is considerably higher than his
Slowing down / pausing AI development gives us more time to work on all of those problems. Racing to build ASI means not only are we risking extinction from misalignment, but we're also facing a high risk of outcomes such as, for example, ASI being developed so quickly that governments don't have time to get a handle on what's happening and we end up with Sam Altman as permanent world dictator.
This depends on what mechanism is used to pause. MIRI is proposing, among other things, draconian control over the worldwide compute supply. Whoever has such control has a huge amount of power to leverage over a transformative technology, which at least plausibly (and, to my mind, very likely) increases the risk of getting a permanent world dictator, although the dictator in that scenario is perhaps more likely to be a head of state than the head of an AI lab.
Unfortunately, this means that there is no low-risk path into the future, so I don't think the tradeoff is as straightforward as you describe:
The tradeoff isn't between solving scarcity at a high risk of extinction vs. never getting either of those things. It's between solving scarcity now at a high risk of extinction, vs. solving scarcity later at a much lower risk.
My preferred mechanism, and I think MIRI's, would be an international treaty in which every country implements AI restrictions within its own borders. That means a head of state can't build dangerous AI without risking war. It's analogous to nuclear non-proliferation treaties.
I don't think I would call it low risk, but my guess is it's less risky than the default path of "let anyone build ASI with no regulations".
My preferred mechanism, and I think MIRI's, would be an international treaty in which every country implements AI restrictions within its own borders. That means a head of state can't build dangerous AI without risking war. It's analogous to nuclear non-proliferation treaties.
The control required within each country to enforce such a ban breaks the analogy to nuclear non-proliferation.
Uranium is an input to a general purpose technology (electricity), but it is not a general purpose technology itself, so it is possible to control its enrichment without imposing authoritarian controls on every person and industry in their use of electricity. By contrast, AI chips are themselves a general purpose technology, and exerting the proposed degree of control would entail draconian limits on every person and industry in society.
The relevant way in which it's analogous is that a head of state can't build [dangerous AI / nuclear weapons] without risking war (or sanctions, etc.).
The relevant way in which it's analogous is that a head of state can't build [dangerous AI / nuclear weapons] without risking war (or sanctions, etc.).
Fair enough, but China and the US are not going to risk war over that unless they believe doom is anywhere close to as certain as Eliezer believes it to be. And they are not going to believe that, in part because that level of certainty is not justified by any argument anyone including Eliezer has provided. And even if I am wrong on the inside view/object level to say that, there is enough disagreement about that claim among AI existential risk researchers that the outside view of a national government is unlikely to fully adopt Eliezer's outlier viewpoint as its own.
But in return, we now have the tools of authoritarian control implemented within each participating country. And this is even if they don't use their control over the computing supply to build powerful AI solely for themselves. Just the regime required to enforce such control would entail draconian invasions into the lives of every person and industry.
I highly doubt you would say something as false as "doom being inevitable" without qualifiers!
Like, sure, maybe this is just really terrible miscommunication, but that itself also seems kind of crazy. Like, the above thread mentions no conditional. It does not say that "doom is inevitable if we build ASI", or anything like that. It just claims that Nate + Eliezer say that "doom is inevitable", no qualifiers.
I do think there's some amount of "these guys are weirdo extremists" signaling implicit in stating that they think doom is inevitable, but I don't think it stems from not reading the book / not understanding the conditional (the conditional is in the title!).
Yeah, it goes out of its way to say the opposite, but if you know Nate and Eliezer, the book gives the impression that their p(doom)s are still extremely high. Responding to the authors' beliefs even when those aren't exactly the same as the text is sometimes correct, although not really in this case.
He also titled his review "An Effective Altruism Take on IABIED" on LinkedIn. Given that Zach is the CEO of the Centre for Effective Altruism, some readers might reasonably interpret this as Zach speaking for the EA community. Retitling the post to "Book Review: IABIED" or something else seems better.
The general pattern from Anthropic leadership is eliding entirely the possibility of Not Building The Thing Right Now. From that baseline, I commend Zach for at least admitting that's a possibility. Outright, it's disappointing that he can't see the path of Don't Build It Right Now, And Then Build It Later, Correctly, or can't acknowledge its existence. He also doesn't really net benefits and costs. He just does the "Wow! There sure are two sides. We should do good stuff" shtick. Which is better than much of Dario's rhetoric! He's cherrypicked a low p(doom) estimate, but I appreciate his acknowledgement that "Most of us wouldn't be willing to risk a 3% chance (or even a 0.3% chance!) of the people we love dying." Correct! I am not willing to! "But accepting uncertainty matters for navigating this complex challenge thoughtfully." Yes. I have accepted my uncertainty about my loved ones' survival, and I have been thoughtful, and the conclusion I have come to is that I'm not willing to take that risk.
Tbc this is still a positive update for me on Anthropic's leadership. To a catastrophically low level. Which is still higher than all other lab leaders.
But it reminds me of this world-class tweet, from @humanharlan, whom you should all follow. He's like if roon weren't misaligned:
"At one extreme: ASI, if not delayed, will very likely cause our extinction. Let's try to delay it.
On the other: No chance it will do that. Don't try to delay it.
Nuanced, moderate take: ASI, if not delayed, is moderately likely to cause our extinction. Don't try to delay it."