I think that Anthropic is doing some neat alignment and control work, but it is also the company most effectively incentivizing people who care about existential risk to sell out: to endorse propaganda, silence themselves, and get on board with the financial incentives of massive monetization and capabilities progress. In this way I see it as doing more damage than OpenAI (though OpenAI held this mantle pre-Anthropic, when the Amodei siblings were there, with Christiano as a researcher and Karnofsky on the board).
I don’t really know the relative numbers; in my mind the uncertainty spans orders of magnitude. The numbers are all negative.