Come work on dangerous capability mitigations at Anthropic

Hi everyone,

TL;DR: the Safeguards team at Anthropic is hiring ML experts to focus on dangerous capability mitigations. This is a very high-impact ML Engineer / Research Scientist opportunity related to serious harm from advanced AI capabilities. We're hiring for multiple roles at varying levels of seniority, including both IC and management positions.

The Safeguards team focuses on present-day and near-future harms. Since we activated our ASL 3 safety standard for Claude 4 Opus, in accordance with our Responsible Scaling Policy, we need to do a great job of protecting the world in case Claude can provide uplift for dangerous capabilities. Safeguards builds constitutional classifiers and probes that run over every Opus conversational turn to look for and suppress dangerous content and defend against jailbreaks.

This is a hard modeling task! The boundary between beneficial and risky content is thin and complicated. In some domains, like cybersecurity, it's extremely hard to distinguish legitimate security work from harmful offense. Biology has educational use cases as well as nefarious bioweapon applications, as do chemistry and nuclear content. We need to combine understanding of the prompt, the response, account signals, and anything else that we can think of to stop disastrous outcomes while preserving all the immense value that Claude can bring.
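To give a flavor of what "combining signals" means, here is a deliberately toy sketch in Python. Everything in it is hypothetical for illustration: the signal names, weights, and threshold are invented, and Anthropic's actual classifiers are far more sophisticated than a weighted sum. The point it illustrates is simply that no single signal decides alone.

```python
from dataclasses import dataclass


@dataclass
class TurnSignals:
    """Hypothetical per-turn risk scores, each in [0, 1]."""
    prompt_risk: float    # classifier score on the user's prompt
    response_risk: float  # classifier score on the model's draft response
    account_risk: float   # prior based on account-level signals


def should_block(signals: TurnSignals, threshold: float = 0.7) -> bool:
    """Blend per-turn and account-level scores into one block/allow decision.

    Illustrative weights only: the response itself weighs most, the prompt
    next, and account history acts as a weaker prior.
    """
    combined = (0.5 * signals.response_risk
                + 0.3 * signals.prompt_risk
                + 0.2 * signals.account_risk)
    return combined >= threshold


# A benign security question from a low-risk account passes...
print(should_block(TurnSignals(prompt_risk=0.4, response_risk=0.2, account_risk=0.1)))  # False
# ...while a high-risk response on a flagged account is suppressed.
print(should_block(TurnSignals(prompt_risk=0.8, response_risk=0.9, account_risk=0.7)))  # True
```

The real difficulty, of course, is not the blending but producing scores that correctly separate legitimate security or biology work from genuinely dangerous content in the first place.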

As AI becomes more powerful, the benefits grow, but so do the risks. We're entering a new, riskier regime where bad actors could cause very significant harm, due to more advanced capabilities and the rise of more agentic systems. This is why we triggered ASL 3 protections, and also why we need awesome ML folks to help solve these safety problems.

We're hiring for multiple roles at varying levels of seniority. If you have a strong ML background, either applied or research, and an interest in working on difficult and impactful safety problems, we would love to have you! No specific safety experience is needed. We're hiring in San Francisco and New York, though for exceptional candidates we would consider London or remote positions as well.

Apply here!

We also have a number of other roles in Safeguards (scroll down to the Safeguards section) and elsewhere, including nontechnical roles in Safety.

Come work with us!