Here’s my shot at a simple argument for pausing AI.
We might soon hit a point of no return, and the world is not at all ready.
A central point of no return is kicking off a recursive automated AI R&D feedback loop (i.e., an intelligence explosion), where AI systems get smarter and more capable and humans are totally unable to keep up. I can imagine humans nominally still being in the loop but not actually understanding things, or being totally reliant on AIs explaining dumbed-down versions of the new AI techniques being discovered.
There are other, less discrete points of no return, such as states becoming economically or militarily reliant on AI systems. This could happen due to competitive dynamics with other states, or just because the AIs are so damn useful that it would be too inconvenient to remove them from all the societal systems they are now part of. See “The date of AI Takeover is not the day the AI takes over” for related discussion.
If we hit a point of no return and develop advanced AI (including superintelligent AI), this will come with a whole range of problems that the world is not ready for. I think any of these would be reasonable grounds for pausing until we can deal with them.[1]
Misalignment: We haven’t solved alignment, and it seems like by default we won’t. The majority of techniques for making AIs safer today will not scale to superintelligence. I think this makes Loss of Control a likely outcome (as in humans lose control over the entire future and almost all value is lost).
War and geopolitical destabilization: Advanced AI, or the technologies it enables, could be politically destabilizing, for example by removing states’ second-strike nuclear capabilities. States may go to war or launch preemptive strikes to avoid this.
Catastrophic misuse: Malicious actors or rogue states may gain access to AI (e.g., by stealing model weights, training the AI themselves, or using an open-weights model) and use it to cause catastrophic harm. Current AIs are not yet at this level, but future AIs will likely be.
Authoritarianism and bad lock-in: AI could lead to an unprecedented concentration of power; it might enable coups to be performed with relatively little support from human actors, and then entrench this concentrated power.
Gradual disempowerment: AIs could be more productive than humans, and competitive economic pressures could mean that humans slowly lose power over time, to the point where we no longer have any effective control. This could happen even without any power-seeking AI performing a power grab.
The world is not on track to solve these problems. On the current trajectory of AI development, we will likely run head-first into these problems wildly unprepared.
[1] Somewhat adapted from our research agenda.