habryka comments on Anthropic is (probably) not meeting its RSP security commitments

habryka 19 Nov 2025 23:38 UTC
24 points
2
Jason Clinton very briefly responded, saying that the May 14th update, which excluded sophisticated insiders from the threat model, addresses the concerns in this post. He also made some short off-the-record comments.
Based on this and Holden’s comment, my best interpretation of Anthropic’s position on this issue is that they currently consider employees at compute providers to be “insiders” and executives to be “sophisticated insiders”, with the latter thereby excluded from Anthropic’s security commitments. They likely also think that compute providers do not have the capacity to execute attacks like this without a very high chance of getting caught, and so this is not a threat model they are concerned about.
As I argue in the post, defining basically all employees at compute providers to be “insiders” feels like an extreme stretch of the word, and I think it has all kinds of other tricky implications for the RSP, but it’s not wholly inconsistent!
To bring Anthropic back into compliance with what I think is a common-sense reading, I would suggest (in descending order of goodness):
1. updating the RSP with a carve-out for major compute providers,
2. using a different word from “insider” for the class that Anthropic means to exclude in its May update,
3. or at the very least providing a clear definition of “insider” within the RSP (but largely I would really advocate against using the word “insider” at all here, since I don’t think people expect that to include compute provider employees).
Separately, I seem to disagree with Anthropic about whether this is something that executives at Google/Amazon/Microsoft could be motivated to do, and something that they would succeed at if they tried, but given the broad definition of “insider”, that disagreement doesn’t appear to be load-bearing for the thesis in the OP. I have written a bit more about it anyway in this comment thread.