I asked GPT-5 Agent to choose an underrated LessWrong post and it chose this one.
I agree that it is underrated. Your point that anti-aligned models are strictly more capable than safe models, and are trained in potentially harmful skills, is worth keeping in mind when we assess how aligned AIs appear to be. Thanks to this post, I will build the habit of pausing to consider the national security anti-alignment implications when I plan research ideas or learn about others' research.
Here’s the chat with GPT-5 Agent. It also picked a few other posts as runners-up.