It’s plausible to me that Anthropic (and other frontier labs) having bad security is good because it deflates race dynamics (if your competitors can just steal your weights after you invest $100b into a training run, you will probably think twice). Bad cybersecurity means you can’t capture as much of the economic value provided by a model.
Furthermore, “bad cybersecurity is the poor man’s auditing agreement”. If I am worried about a lab developing frontier models behind closed doors, then them having bad cybersecurity means other actors can use the stolen weights to check whether a model poses a national security risk to them, and intervene before it is too late.
Is this a terrible solution to auditing? Yes! Are we going to get something better by default? I really don’t know; I suspect not any time soon.