It’s good that Anthropic’s system cards contain a lot of useful information on misalignment and risk. It’s also good that they are putting out detailed sabotage risk reports that articulate their views.[1] I appreciate the hard work that many employees put into these reports.
I wish other companies released similarly informative reports.[2]
I’m not necessarily claiming I agree with their risk reports. For instance, I have at least some moderately important disagreements with the Mythos Preview sabotage risk report update. But having lots of detail, including sufficient detail that I can get a decent sense of where I disagree with the analysis, is praiseworthy.
I’m not claiming that this is the most leveraged thing for safety-motivated employees at other companies to work on, but I do think it would be good for AI companies to do better (without trading off against other safety efforts).