I believe deeply in the existential importance of using AI to defend the United States and other democracies, and to defeat our autocratic adversaries.
Anthropic has therefore worked proactively to deploy our models to the Department of War.
In my view this seems like a terrible idea: we don’t understand nearly enough about what causes misalignment in practice to be confident that explicitly training an AI to aid in killing humans won’t somehow increase the probability that it ends up interested in killing humans more generally.[1] I suppose training AI in both killing and mass surveillance seems even worse still, but I expect the badness of the killing likely swamps whatever expected value comes from refraining from the surveillance here.
I personally don’t put much stock in the notion of AI personas, but insofar as you do, I think you should probably be especially worried about this.