I 100% endorse working on alignment and agree that it’s super important.
We do think that misuse mitigations at Anthropic can help improve things generally through race-to-the-top dynamics, and I can attest that while at GDM I was meaningfully influenced by things that Anthropic did.
Will Anthropic continue to publicly share the research behind its safeguards, and say what kinds of safeguards it's using? If so, I think there are clear ways to spread this work to other labs.
We certainly plan to!