nim comments on Take Precautionary Measures Against Superhuman AI Persuasion

nim 16 Jul 2025 20:49 UTC
2 points

the more time you spend arguing with someone, the more likely it becomes that you can influence their actions.

This works because “someones” retain memory across interactions. ChatGPT defaults to behaving like a “someone” in this regard. Claudes, and to my knowledge Geminis as well, go into each chat without knowledge from prior conversations with the user. I am not convinced that an entity with symptoms of anterograde amnesia can pose this type of threat.

What’s the threat model by which an AI with an amnesiac implementation is expected to influence an individual across separate interaction contexts?