I appreciate the articulation and assessment of various strategies. My comment will focus on a specific angle that I notice both in the report and in the broader ecosystem:
I think there has been a conflation of “catastrophic risks” and “extinction/existential risks” recently, especially among groups that are trying to influence policy. This is somewhat understandable: the difference between “catastrophic” and “existential” is not that big of a deal in most people’s minds. But in some contexts, this conflation misses the fact that “existential [and thus by definition irreversible]” is actually a very different level of risk compared to “catastrophic [but something we would be able to recover from].”
This conflation seems to be (implicitly) present in the report summary, most notably the chart. It seems to me like the main frame is something like “if you want to avoid an unacceptable chance of catastrophic risk, all of these other options are bad.”
But not all of these catastrophic risks are the same. I think this is actually quite an important consideration, and I think even (some) policymakers will see it as essential as AGI becomes more salient.
Specifically, “war” and “misuse” seem very different from “extinction” or “total and irreversible civilizational collapse.”
“War” is broad enough to encompass many outcomes, ranging from “conflict with <1M deaths” to “nuclear conflict from which civilization recovers” all the way to “nuclear conflict from which civilization does not recover.” Note also that many natsec leaders already think the chance of a war between the US and China is at a level that would probably meet an intuitive bar for “unacceptable.” (I don’t have actual statistics on this, but my guess is that >10% chance of war in the next decade is not an uncommon view. One plausible pathway that is often discussed is China invading Taiwan and the US being committed to its defense.)
“Misuse” can refer to many different kinds of events (including $1B in damages from a cyberattack, 10M deaths, 1B deaths, or complete human extinction). These are, of course, very different in terms of their overall impact, even though all of them are intuitively/emotionally stored as “very bad things that we would ideally avoid.”
It seems plausible to me that we will be in situations in which policymakers have to make tricky trade-offs between these different sources of risk, and my hope is that the community of people concerned about AI can distinguish between the different “levels” or “magnitudes” of different types of risks.
(My impression is that MIRI agrees with this, so this is more a comment on how the summary was presented & more a general note of caution to the ecosystem as a whole. I also suspect that the distinction between “catastrophic” and “existential/civilization-ending” will become increasingly more important as the AI conversation becomes more interlinked with the national security apparatus.)
Caveat: I have not read the full report and this comment is mostly inspired by the summary, the chart, and a general sense that many organizations other than MIRI are also engaging in this kind of conflation.
I agree that the report conflates these two scales of risk. Fortunately, one nice thing about that table (Table 1 in the paper) is that readers can choose which of these risks they want to prioritize. I think more longtermist-oriented folks should probably rank Loss of Control as the worst, followed perhaps by Bad Lock-in, then Misuse and War. But obviously there’s a lot of variance within each of these.
I agree that there *might* be some cases where policymakers will have difficult trade-offs to make between these risks. I’m not sure how likely I think this is, but I agree it’s a good reason to keep this nuance insofar as we can. I guess it seems to me like we’re not anywhere near the relevant decision-makers actually making these trade-offs, nor near them having values that particularly up-weight the long-term future.
I therefore feel okay about lumping these together in a lot of my communication these days. But perhaps this is the wrong call, idk.