I’m really not following your argument here. Of course, in many instances compute providers don’t offer zero-trust relationships with those running on their systems. That’s just not news; there’s a reason we have an entire universe of compensating technical and non-technical controls to mitigate risk in such circumstances.
You have done zero analysis to identify any reason to believe that those compensating controls are insufficient. You could incredibly easily get me to flip sides in this discussion if you offered any of that, but simply pointing out that someone isn’t running zero trust isn’t sufficient. As a hypothetical, if Anthropic is expending meaningful effort to be highly confident that Amazon’s own security processes are protecting against insiders, that would be a substantial risk reduction (as long as Anthropic can have high confidence that those processes continue to be executed).
Separately, though it probably cuts against my argument above [1], I would politely disagree with the perhaps-unintended implication in your comment above that “implement zero trust” is a sufficient defense against compute providers like Amazon, MSFT, etc. After all, Anthropic’s proper threat modeling of them should include things like, “Amazon, Microsoft, etc. employ former nation-state hackers who considered attacking zero-trust networks to be part of the cost of doing business.”
[1] Scout mindset, etc.
Anthropic has released their own whitepaper where they call out what kinds of changes would be required. Can you please engage with my arguments?
I have now also heard from one or two Anthropic employees about this. The specific people weren’t super up to date on what Anthropic is doing here and didn’t want to say anything committal, but nobody I talked to reacted as though they thought it likely that Anthropic is robust to high-level insider threats at compute providers.
Like, if you want, you can take a bet with me here: I am happy to offer you 2:1 odds on the opinion of some independent expert we both trust on whether Anthropic is likely robust to that kind of insider threat. I can also start going into all the specific technical reasons, but that would require restating half the state of the art of computer security, which would be a lot of work. I really think that not being robust here is a relatively straightforward inference that most people in the field will agree with (unless Anthropic has changed operating procedures substantially from what is available to other customers in its deals with cloud providers, which I currently think is unlikely, but not impossible).
By the “whitepaper,” are you referring to the RSP v2.2 that Zach linked to, or something else? If so, I don’t understand how a generic standard can “call out what kinds of changes would be required” for their current environment if they’re also claiming they meet their current standard.
Also, just to cut a little more to brass tacks here, can you describe the specific threat model that you think they are insufficiently responding to? By that, I don’t mean just the threat actor (insiders within their compute provider) and their objective of getting the weights, but rather the specific class or classes of attacks that you expect to occur, and why you believe that existing technical security plus compensating controls are insufficient given Anthropic’s existing standards.
For example, AIUI the weights aren’t just sitting naively decrypted at inference; they’re running inside a fairly locked-down trusted execution environment, with keys provided only as needed (probably with an ephemeral keying structure?) from an HSM, and those trusted execution environments are operating inside the physical security perimeter of a data center that is already designed to mitigate insider risk. Which parts of this are you worried are attackable? To what degree are organizational boundaries between Anthropic and its compute providers salient to increasing this risk? Why should we expect that the compute providers don’t already have sufficient compensating controls here, given that, e.g., these same providers also offer classified compute to the US government secured at the Top Secret/SCI level and presumably therefore have best-in-class anti-insider-threat capabilities?
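To make concrete the kind of setup I have in mind, here is a toy sketch of that attestation-gated key-release flow. To be clear, everything in it (KeyBroker, TRUSTED_MEASUREMENTS, the HMAC-based “attestation”, etc.) is a hypothetical stand-in for the general pattern, not Anthropic’s or any cloud provider’s actual machinery:

```python
"""Toy sketch of the attestation-gated key-release pattern sketched above: a TEE
asks an HSM-backed key broker for a short-lived weight-decryption key, and the
broker only releases one to an enclave whose measurement it recognizes. All
names and details are illustrative guesses, not anyone's real implementation."""
import hashlib
import hmac
import os
import time
from dataclasses import dataclass

# Hypothetical allow-list of enclave builds the broker will release keys to.
TRUSTED_MEASUREMENTS = {"sha256:aabbcc"}
# Stands in for the TEE vendor's attestation signing key.
ATTESTATION_SIGNING_KEY = os.urandom(32)


@dataclass
class AttestationQuote:
    measurement: str   # hash of the code/config running in the enclave
    nonce: bytes       # freshness challenge issued by the key broker
    signature: bytes   # produced by the TEE hardware (simulated with HMAC here)


def sign_quote(measurement: str, nonce: bytes) -> AttestationQuote:
    """Simulates the hardware signing an attestation quote for an enclave."""
    sig = hmac.new(ATTESTATION_SIGNING_KEY, measurement.encode() + nonce,
                   hashlib.sha256).digest()
    return AttestationQuote(measurement, nonce, sig)


class KeyBroker:
    """Stands in for an HSM-backed service; the master key never leaves it."""

    def __init__(self) -> None:
        self._master_key = os.urandom(32)
        self._outstanding_nonces: set[bytes] = set()

    def issue_nonce(self) -> bytes:
        nonce = os.urandom(16)
        self._outstanding_nonces.add(nonce)
        return nonce

    def release_ephemeral_key(self, quote: AttestationQuote) -> bytes:
        if quote.nonce not in self._outstanding_nonces:
            raise PermissionError("stale or unknown nonce")
        expected = hmac.new(ATTESTATION_SIGNING_KEY,
                            quote.measurement.encode() + quote.nonce,
                            hashlib.sha256).digest()
        if not hmac.compare_digest(expected, quote.signature):
            raise PermissionError("attestation signature invalid")
        if quote.measurement not in TRUSTED_MEASUREMENTS:
            raise PermissionError("unrecognized enclave measurement")
        self._outstanding_nonces.discard(quote.nonce)  # single-use nonce
        # Derive a short-lived session key; a real system would use a proper KDF
        # and wrap the key to the enclave's public key before returning it.
        return hmac.new(self._master_key, str(time.time_ns()).encode(),
                        hashlib.sha256).digest()


# An allow-listed enclave build gets a key; a tampered build is refused.
broker = KeyBroker()
good = sign_quote("sha256:aabbcc", broker.issue_nonce())
ephemeral_key = broker.release_ephemeral_key(good)
tampered = sign_quote("sha256:eviltwin", broker.issue_nonce())
try:
    broker.release_ephemeral_key(tampered)
except PermissionError as err:
    print("refused:", err)
```

The design intent of that pattern is that the long-lived key material stays inside the HSM and plaintext keys only ever appear inside attested enclave memory.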
I’m extremely willing to buy into a claim that they’re not doing enough, but I would actually need to have an argument here that’s more specific.
No, I am referring to this, which Zach linked in his shortform: https://assets.anthropic.com/m/c52125297b85a42/original/Confidential_Inference_Paper.pdf
I don’t have a complaint that Anthropic is taking insufficient defenses here. I have a complaint that Anthropic said they would do something in their RSP that they are not doing.
The threat is really not very complicated; it’s basically:
1. A high-level Amazon executive decides they would like to have a copy of Claude’s weights.
2. They physically go to an inference server and manipulate the hardware to dump the weights unencoded to storage (or read the weights out from a memory bus directly, or use one of the many attacks available to you if you have unlimited hardware access).
Nobody in computer security I have ever talked to thinks that data in memory in a normal cloud data center would be secure against physical access by high-level executives who run the data center. One could build such a data center, but the normal ones are not that!
The top-secret cloud you linked to might be one of those, but my understanding is that Claude’s weights are deployed to normal Amazon data centers, not only to the top-secret government ones.