I 100% endorse working on alignment and agree that it’s super important.
We do think that misuse mitigations at Anthropic can help improve things generally through race-to-the-top dynamics, and I can attest that while at GDM I was meaningfully influenced by things that Anthropic did.
Will Anthropic continue to publicly share the research behind its safeguards, and say what kinds of safeguards it's using? If so, I think there are clear ways to spread this work to other labs.
We certainly plan to!