I expect that AIs will be obedient when they initially become capable enough to convince governments that further AI development would be harmful (if it would in fact be harmful).
Seems like “the AIs are good enough at persuasion to persuade governments, and someone is deploying them for that” is right around the point when you need very high confidence that they’re obedient (and don’t have some kind of agenda). If they can persuade governments, they can also persuade you of things.
I also think it gets to a point where I’d sure feel way more comfortable if we had more satisfying answers to “where exactly are we supposed to draw the line between ‘informing’ and ‘manipulating’?” (I’m not 100% sure what you’re imagining here tho)
I’m assuming that the AI can accomplish its goal by honestly informing governments. Possibly that would include some sort of demonstration of the AI’s power that would provide compelling evidence that the AI would be dangerous if it weren’t obedient.
I’m not encouraging you to be comfortable. I’m encouraging you to mix a bit more hope in with your concerns.