This is a scenario I have been thinking about for perhaps three years. However, you made an implicit assumption I wish were explicit: there is no warning shot.
I believe that with such a slow takeoff there is a very high probability of an AI alignment failure causing significant loss of life already at the TAI stage, and that would significantly change the dynamics.
There are two sections that I think make this explicit:
1. No failure mode is sufficient to justify bigger actions.
2. Some scheming is totally normal.
My main point is that even things that would seem like warning shots today, e.g. severe loss of life, will look small in comparison to the benefits at the time, thus not providing any reason to pause.
I don’t think the second point is relevant here at all, while the first one is worded so that it might imply something on the scale of “an AI assistant convinces a mentally unstable person to kill their partner and themselves”, which is not something the public would perceive as a warning shot, IMHO (have you heard there have been at least two alleged suicides driven by AI chatbots? The public doesn’t seem to care: https://www.vice.com/en/article/man-dies-by-suicide-after-talking-with-ai-chatbot-widow-says/ https://www.nytimes.com/2024/10/23/technology/characterai-lawsuit-teen-suicide.html).
I believe that dozens of people killed by misaligned AI in a single incident will be enough smoke in the room (https://www.lesswrong.com/posts/5okDRahtDewnWfFmz/seeing-the-smoke) for the metaphorical fire alarm to go off. What to do after that is a complicated political question: for example, French voters have always believed that nuclear accidents look small in comparison to the benefits of nuclear energy, while Italian and German voters hold the opposite opinion. The sociological data available, AFAIK, generally indicates that people in many societies already harbor fears of a possible AI takeover, and the public is quite unlikely to freak out less than it did after Chernobyl, but that’s hard to predict.