Simon Lermen comments on Anthropic’s focus on hyperstition

Simon Lermen 11 May 2026 21:47 UTC
The first sentence you point out isn’t written well and says something somewhat different from the rest of the text. Thanks for pointing this out.
Later in the post I write:
To be clear, I think it is actually possible that some current misaligned behavior in AIs is caused by roleplaying from its pre-training distribution.
My point is this: Anthropic seems to consider hyperstition really important for alignment, including alignment of future superhuman AI. I think the hyperstition argument is harmful to spread in the discourse, and it doesn’t appear relevant to aligning actually dangerous, superhuman AI. It can, however, totally explain some of the weird misbehavior we currently see from AI.