To sway public opinion about AI safety, let us consider the case of nuclear warfare—a domain where long-term safety became a serious institutional concern. Nuclear technology wasn’t always surrounded by protocols, safeguards, and watchdogs. In the early days, it was a raw demonstration of power: the bombs dropped on Hiroshima and Nagasaki were enough to show the sheer magnitude of destruction possible. That spectacle shocked the global conscience. It didn’t take long before nation after nation realized that this wasn’t just a powerful new toy, but an existential threat. As more countries acquired nuclear capabilities, the world recognized the urgent need for checks, treaties, and oversight. What began as an arms race slowly transformed into a field of serious, respected research and diplomacy—nuclear safety became a field in its own right.
The point is: public concern only follows recognition of risk. AI safety, like nuclear safety, will only be taken seriously when people see it as more than sci-fi paranoia. For that shift to happen, we need respected institutions to champion the threat. Right now, it’s mostly academics raising the alarm. But the public—especially the media and politicians—won’t engage until the danger is demonstrated or convincingly explained. Unfortunately for AI safety, by the time there is evidence of misalignment causing significant trouble, it will probably be too late. Adding fuel to this fire is the fact that politicians aren’t going to campaign on AI safety if the corporations in your country don’t want them to & your rivals are already neck and neck in AI development.
In my opinion, we need the AI variant of Hiroshima. But I’m not too keen on this idea, for it is a rather dreadful thought.
Edit: I should clarify what I mean by “the AI variant of Hiroshima.” I don’t think a large-scale inhuman military operation is necessary (as I already said, I don’t want AI warfare). What I mean instead is something that causes significant damage & makes newspaper headlines worldwide. Examples: strong evidence that AI swayed the presidential election one way; a gigantic economic crash caused by a rogue AI (not the AI bubble bursting); millions of jobs lost in a short timeframe because of one revolutionary model, which then snaps because of misalignment; etc. These are still dreadful scenarios, but at least no human lives are lost & they get the point across that AI safety is an existential issue.
I think different kinds of risks have different “distributions” of how much damage they do. For example, the majority of car crashes cause no injuries (only damage to the cars), a smaller number cause injuries, some cause fatalities, and the worst ones cause multiple fatalities.
For other risks like structural failures (of buildings, dams, etc.), the distribution has a longer tail: in the worst case, very many people can die. But the distribution still tapers off towards greater numbers of fatalities, and people sort of have a good idea of how bad it can get before the worst version happens.
For risks like war, the distribution has an even longer tail, and people are often caught by surprise by how bad things can get.
But for AI risk, the distribution of damage caused is very weird. You have one distribution for AI causing harm due to its lack of common sense, where it might harm a few people, or possibly cause one death. Yet you have another distribution for AI taking over the world, with a high probability of killing everyone, a high probability of failing (and doing zero damage), and only a tiny bit of probability in between.
It’s very very hard to learn from experience in this case. Even the biggest wars tend to surprise everyone (despite having a relatively more predictable distribution).
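To make that contrast concrete, here is a rough toy sketch in Python (the numbers and distribution choices are entirely made up for illustration, not estimates of anything): a heavy-tailed “war-like” damage distribution still has a continuum of outcomes to learn from, while a near-bimodal “takeover-like” one has almost nothing in between.

```python
# Toy illustration with made-up numbers: contrast a heavy-tailed damage
# distribution (war-like risks) with a near-bimodal one (takeover-like risks).
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# War-like risk: heavy-tailed (Pareto). The worst cases dwarf the typical case,
# but outcomes still span a continuum, so smaller events carry some information.
war_damage = rng.pareto(a=1.5, size=n)

# Takeover-like risk: nearly all probability mass sits at "no damage" (the
# attempt fails) or "everything is lost", with only a sliver in between.
outcome = rng.choice(["fail", "partial", "total"], size=n, p=[0.69, 0.01, 0.30])
takeover_damage = np.select(
    [outcome == "fail", outcome == "partial", outcome == "total"],
    [0.0, rng.uniform(0.1, 0.9, size=n), 1.0],
)

print("war-like damage:  median =", np.median(war_damage),
      " 99.9th percentile =", np.quantile(war_damage, 0.999))
print("takeover-like:    fraction of intermediate outcomes =",
      np.mean((takeover_damage > 0) & (takeover_damage < 1)))
```

With these toy numbers, almost all of the “takeover” mass sits at 0 or 1, which is exactly why smaller incidents teach us so little about the worst case.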
That’s a cool way to frame damage risks, but I think your distribution for AI damage is for ASI, not AGI. I think it’s very reasonable that an AGI-based system may cause the type of damage that I am talking about.
Even if you believe that as soon as we achieve AGI, we’ll accelerate to ASI because AGI is by definition self-improving, it still takes time to train a model, and research is slow. I hope that the window between AGI & ASI is large enough for such a “Hiroshima event” to occur, so humanity wakes up to the risks of misaligned AI systems.
PS: Sorry for the late response, I was offline for a couple of days
No need to say sorry for that! On a forum, there is no expectation of a reply. If every reply obligated the recipient to make another reply, comment chains would drag on forever.
You can freely wait a year before replying.
I’m worried that once a “Hiroshima event” occurs, humanity won’t have another chance. If the damage is caused by an AGI/ASI taking over places, then the more power it obtains, the easier it becomes for it to obtain even more, so it won’t stop at any scale.
If the damage is caused by bad actors using an AGI to invent a very deadly technology, there is a decent chance humanity can survive, but it’s very uncertain. A technology can never be uninvented, and more and more people will know about it.
Or more! (I was delighted to receive this reply.)