How I think about METR’s theory of change:
General principles:
- avoid the world being taken by surprise by an AI catastrophe; improve the knowledge, understanding, and science of assessing risk from AI systems
- it is good for an independent, trustworthy, truth-seeking, minimally conflicted expert org to exist; it can advise the world and act as a counterbalance to AI companies, and a nonprofit has slightly different affordances than a government here
Strategy:
- try to continually answer the question of “how dangerous are current and near-future AI systems?”, and do research to be able to keep answering that question as well as possible
- be boring, neutral, and straightforward; aim to explain, not persuade
Some specific impact stories:
- at some point in the future, political willingness to act may be much higher; help channel that into a more informed and helpful response
- provide independent technical review and red-teaming of alignment and other mitigations, finding issues companies have missed
- increase the likelihood that misalignment incidents or other ‘warning shots’ are shared, publicized, and analyzed well
I think that broad theory of change has been pretty constant throughout METR’s existence, but my memory is not great, so I wouldn’t be that surprised if I had framed it quite differently in the past, e.g. emphasized conditional commitments more heavily.