I’d guess it’s somewhat unlikely that large AI companies or governments would want to continue releasing models with open weights once they are this capable, though it’s possible they could underestimate those capabilities and release anyway.
I think that’s a real concern, though. The central route by which open-sourcing at the current capability level leads to extinction is a powerful AI model successfully sandbagging during internal evals (which seems fairly easy for a genuinely dangerous model to do, given the current state of evals), getting open-sourced, and things then going the “rogue replication” route.
I very tentatively agree with that.