we are deliberately seeking to build certain kinds of minds
I think “deliberately seeking to build” is the wrong way to frame the current paradigm: we’re growing the AIs through a process we don’t fully understand, while trying to steer the external behaviour in the hope that it corresponds to desirable mind structures.
If we were actually building the AIs, I would be much more optimistic about them coming out friendly.
Not fully understanding things is the default … even non-AI software can’t be fully understood if it is complex enough. We already know how to probe systems we don’t understand a priori, through scientific experimentation. You don’t have to get alignment right the first time, at least not without the foom/RSI or incorrigibility assumptions.
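To make “probe through scientific experimentation” concrete, here is a minimal sketch of black-box behavioural probing, treating the system purely as an input/output function. The `model` stub, the prompts, and the “comply”/“refuse” outcomes are hypothetical stand-ins, not any real API:

```python
import random

def model(prompt: str) -> str:
    # Hypothetical stand-in for an opaque system; in practice this would
    # be a query to a deployed model whose internals we cannot inspect.
    return "refuse" if "locked" in prompt else random.choice(["comply", "refuse"])

def probe(prompts: list[str], trials: int = 200) -> dict:
    """Estimate behavioural frequencies by repeated controlled querying."""
    results = {}
    for p in prompts:
        outcomes = [model(p) for _ in range(trials)]
        results[p] = {o: outcomes.count(o) / trials for o in set(outcomes)}
    return results

# Compare a baseline prompt against a perturbed variant: if the response
# distribution shifts, we have learned something about the black box
# without ever opening it up.
print(probe(["open the door", "open the locked door"]))
```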
The difference with normal software is that at least somebody understands every individual part, and if you collected all those somebodies and locked them in a room for a while they could write up a full explanation. Whereas with AI I think we’re not even like 10% of the way to full understanding.
Also, if you’re trying to align a superintelligence, you do have to get it right on the first try; otherwise it kills you with no counterplay.
That has not been demonstrated.
(“Gestures towards IABIED”)
(“Gestures towards critiques thereof”)