Zac Hatfield-Dodds comments on Unless its governance changes, Anthropic is untrustworthy

Zac Hatfield-Dodds 29 Nov 2025 9:07 UTC
1 point
1
I am so very tired of these threads, but I’ll chime in at least for this comment. Here’s last time, for reference.
- I continue to think that working at Anthropic—even in non-safety roles, I’m currently on the Alignment team but have worked on others too—is a great way to contribute to AI safety. Most people I talk to agree that they think the situation would be worse if Anthropic had not been founded or didn’t exist, including MIRI employees.
- I’m not interested in litigating an is-ought gap about whether “we” (human civilization?) “should” be facing such high risks from AI; obviously we’re not in such an ideal world, and so discussions from that implicit starting point are imo useless.
- I have a lot of non-public information too, which points in very different directions to the citations here. Several are from people who I know to have lied about Anthropic in the past; and many more are adversarially construed. For some I agree on an underlying fact and strongly disagree with the framing and implication.
- I continue to have written redlines which would cause me to quit in protest.
- Mikhail Samin 29 Nov 2025 9:24 UTC
  39 points
  8
  Parent
  I’m not interested in litigating an is-ought gap about whether “we” (human civilization?) “should” be facing such high risks from AI; obviously we’re not in such an ideal world, and so discussions from that implicit starting point are imo useless
  
  My post is about Anthropic being untrustworthy. If that was not the case, if Anthropic clearly and publicly was making the case for doing their work with full understanding of the consequences, if the leadership did not communicate contradictory positions to different people and was instead being honest and high-integrity, I could imagine a case being made for working at Anthropic on capabilities, to have a company that stays at the frontier and is able to get and publish evidence, and use its resources to slow down everyone on the planet.
  But we, instead, live in a world where multiple people showed me misleading personal messages from Jack Clark.
  One should be careful not to galaxy-brain themselves into thinking that it’s fine for people to be low-integrity.
  I don’t think the assumptions that you think I’m making are feeding into most of the post.
  Several are from people who I know to have lied about Anthropic in the past
  
  If you think I got any of the facts wrong, please do correct me on them. (You can reach out in private, and share information with me in private, and I will not share it further without permission.)
  I continue to have written redlines which would cause me to quit in protest.
  I appreciate you having done this.
- boazbarak 29 Nov 2025 14:53 UTC
  10 points
  −1
  Parent
  Despite working at a competitor, I am happy Anthropic exists. I worry about centralization of control- I want OpenAI to be successful but I don’t want it to be a monopoly. Competition can create incentives to cut corners, but it also enables a variety of ideas and approaches as well as collaboration when it comes to safety. (And there have been some such cross industry collaborations.) In particular I appreciate some of the great research on AI safety that has come from Anthropic.
  No company is perfect, we all make mistakes, which we should be criticized for. But I think many of the critiques (of both OpenAI and Anthropic) are based on unrealistic and naive world views.