It would be bad to intentionally not have good defenses. The signal has to be real to be meaningful. Any indication that somebody could have tried to defend against this, but chose not to, undermines the warning value.
That’s a good point.
I’m not sure it’s totally true, though; the public doesn’t seem that rational.
I don’t know who would be responsible for such defenses and deliberately not do it. I’m unfortunately not in charge of humanity’s strategy on AI.
If we do a bad job on those defenses simply because we tend to do a bad job on things like that, that would be good evidence that we would do a similarly bad job on alignment and on defense against AGI or ASI.
But yes, I can see how that might go wrong if it looked like someone was sandbagging, and we might get better results if we just mounted even a decent defense.