While I concur that power concentration is a highly probable outcome, I believe complete disempowerment warrants deeper consideration, even under the assumptions you’ve laid out. Here are some thoughts on your specific points:
On Baseline Alignment: You suggest a baseline alignment where AIs are unlikely to engage in egregious lying or tampering (though you also flag 20% for scheming and 10% for unintentional egregious behavior even with prevention efforts, which already amounts to roughly 30% risk; see the short calculation after the two points below). My concern is twofold:
Sufficiency of “Baseline”: Even if AIs are “baseline aligned” to their creators, this doesn’t automatically mean they are aligned with broader human flourishing or capable of compelling humans to coordinate against systemic risks. For an AI to effectively say, “You are messing up, please coordinate with other nations/groups, stop what you are doing” requires not just truthfulness but also immense persuasive power and, crucially, human receptiveness. Even if pausing AI were the correct thing to do, Claude is not going to suggest this to Anthropic folks for obvious reasons. As we’ve seen even with entirely human systems (the Trump administration and tariffs), possessing information or even offering correct advice doesn’t guarantee it will be heeded or lead to effective collective action.
Erosion of Baseline: The pressures described in the paper could incentivise the development or deployment of AIs where even “baseline” alignment features are traded off for performance or competitive advantage. The “AI police” you mention might struggle to keep pace, or be defunded or sidelined if it impedes perceived progress or economic gains. “Innovation first!”, “Drill, baby, drill”, “Plug, baby, plug”, as they say.
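To spell out the roughly 30% figure above, here is a back-of-the-envelope sketch, assuming (my assumption, not stated in your comment) that the two failure modes, 20% scheming and 10% unintentional egregious behavior, are approximately independent:

$$P(\text{egregious misalignment}) \approx 1 - (1 - 0.20)(1 - 0.10) = 1 - 0.72 = 0.28$$

If the two are instead treated as mutually exclusive, the simple sum $0.20 + 0.10 = 0.30$ gives an upper bound; either way the combined risk lands around 30%.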
On “No strong AI rights before full alignment”: You argue that productive AIs won’t get human-like rights, especially strong property rights, before being robustly aligned, and that human ownership will persist.
Indirect Agency: Formal “rights” might not be necessary for disempowerment. An AI, or a network of AIs, could exert considerable influence through human proxies or by managing assets nominally owned by humans who are effectively out of the loop or who benefit from this arrangement. An AI could operate through a human willing to provide access to a bank account and legal personhood, thereby bypassing the need for its own “rights.”
On “No hot global war”:
You express hope that we won’t enter a situation where a humanity-destroying conflict seems plausible.
Baseline Risk: While we all share this hope, current geopolitical forecasting (e.g., from various expert groups or prediction markets) often places the probability of major power conflict within the next few decades at non-trivial levels. For a war causing more than 1M deaths, some estimates hover around 25%. (Though your definition of “hot global war” is probably more demanding.)
AI as an Accelerant: The dynamics described in the paper – nations racing for AI dominance, AI-driven economic shifts creating instability, AI influencing statecraft – could increase the likelihood of such a conflict.
Responding to your thoughts on why the feedback loops might be less likely if your three properties hold:
“Owners of capital will remain humans and will remain aware...able to change the user of that AI labor if they desire so.”
Awareness doesn’t guarantee the will or ability to act against strong incentives. AGI development labs are pushing forward despite being aware of the risks, often citing competitive pressures (“If we don’t, someone else will”). This “incentive trap” is precisely what could prevent even well-meaning owners of capital from halting a slide into disempowerment. They might say, “Stopping is impossible, it’s the incentives, you know,” even if their pDoom is 25% like Dario, or they might not give enough compute to their superalignment team.
“Politicians...will remain aware...able to change what the system is if it has obviously bad consequences.”
The climate change analogy is pertinent here. We have extensive scientific consensus, an “oracle IPCC report”, detailing dire consequences, yet coordinated global action remains insufficient to meet the scale of the challenge. Political systems can be slow, captured by short-term interests, or unable to enact unpopular measures even when long-term risks are “obviously bad.” The paper argues AI could further entrench these issues by providing powerful tools for influencing public opinion or creating economic dependencies that make change harder.
“Human consumers of culture will remain able to choose what culture they consume.”
You rightly worry about “brain-hacking.” The challenge is that “obviously bad” might be a lagging indicator. If AI-generated content subtly shapes preferences and worldviews over time, the ability to recognise and resist this manipulation could diminish before the situation becomes critical. I think that people are going to LOVE AI, and might accept the trade of going faster and being happy but disempowered, as some junior developers are already beginning to do with Cursor.
As a meta point, the low quantity and quality of discourse on this matter, combined with people continuing to say “LET’S GO, WE ARE CREATING POWERFUL AIS, and don’t worry, we plan to align them, even if we don’t really know which type of alignment we really need, or whether it is even doable in time” while we have not rigorously assessed all those risks, is really not a good sign.
At the end of the day, my probability for something in the ballpark of gradual disempowerment / extreme power concentration and loss of democracy is 40%-ish, much higher than scheming (20%) leading to direct takeover (let’s say 10% after mitigations like control).
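For concreteness, one way these numbers can fit together (the 50% conditional figure below is my illustrative assumption, not something stated above) is:

$$P(\text{direct takeover}) \approx P(\text{scheming}) \times P(\text{takeover} \mid \text{scheming, post-mitigation}) \approx 0.20 \times 0.50 = 0.10$$

which makes the 40%-ish gradual-disempowerment estimate roughly four times the post-mitigation takeover estimate.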
I think I have answered some of your objections in a reply to another comment.
I think we would not resolve our disagreement easily in a comment thread: I feel like I am missing pieces of the worldview, such that when I try to back-predict our current world with it I get wrong predictions (e.g. why do we have so much prosperity now if coordination is so hard), and I also disagree on some observations about the current world (e.g. my current understanding of the IPCC report is that it is much less doomy about our current trajectory than you seem to suggest). I’d be happy to chat in person at some point to sort this out!
On a lighter note, I feel like many people here are much more sympathetic to “power concentration bad” when thinking about the gradual decline of democracy than when facing concerns about China winning the AI race. I think this is mostly vibes; I don’t think many people are actually making the mistake of choosing their terminal values based on whether they result in the conclusion “we should stop” vs “we should go faster” (and there are some differences between the two scenarios), but I really wanted to make this meme:
I’m confused or don’t find this funny or don’t think it goes through at all or would like you to explain the joke.
The counterfactual to China winning is the US winning, which still concentrates power; the more accurate version of the second panel would be “power concentration bad unless the US does it.”
And for anyone who thinks power concentration is bad unless the US does it, they have to engage with the ‘AGI-enabled decline of democracy’ arguments in order to begin to advocate for their position seriously.
I don’t think there are just ‘some differences’ here; I think this is a complete disanalogy.
One reason to think that the US winning concentrates power less is that the US is a democracy with a strong tradition of maintaining individual rights and a reasonably strong history (over the last 80 years) of pursuing a world order where it benefits from lots of countries being pretty stable and not e.g. invading each other.
Yes, this is the argument I was anticipating with:
for anyone who thinks power concentration is bad unless the US does it, they have to engage with the ‘AGI-enabled decline of democracy’ arguments in order to begin to advocate for their position seriously
I don’t think you just get it for free; I think you need to explain why you expect this to hold as shit gets crazy. It’s prima facie reasonable to think that the US is likely to act more responsibly than China here; I don’t think it’s prima facie reasonable to think that any actor is likely to act especially responsibly (e.g. in a way that sidesteps the concerns of the character from panel 1).
“Power concentration bad” means power concentration bad, even if by a marginally more benevolent actor.
(likely we disagree on the delta between future US and future China here, but I’d like to avoid arguing that point; I just mean to point out that the absolute reasonableness matters, and that flatly synonymizing the US with freedom, casting it as an actor incapable of the bad kind of concentration of power*, and China with authoritarianism, casting it as an actor obligated to commit the bad kind of concentration of power, is a mistake)
“The US loses” and “democracy loses” are just not the same situation; the US is an example of a democracy; it’s not The Spirit Of Democracy Itself.
*without addressing the gradual disempowerment arguments ahead of time
You don’t get it for free, but I think it’s reasonable to assume that P(concentrated power | US wins) is smaller than P(concentrated power | China wins), given that the latter is close to 1 (except if you are very doomy about power concentration, which I am not), right? Not claiming the US is more reasonable, just that a western democracy winning makes power concentration less likely than a one-party state winning. It’s also possible I am overestimating P(concentrated power | China wins); I am not an expert in Chinese politics.
I disagree on your assessment of both nations, and am pretty doomy about concentration of power.
I think how a nation with a decisive strategic advantage treats the rest of the world has more to do with its decisive strategic advantage and its needs, and less to do with its flag, or even history.
Anyway, my main point was structural: if panel 2 depends on holding very particular views regarding panel 1, the parallelism is lost and it hurts the joke.
Thanks for the response!