Zach Stein-Perlman comments on Zach Stein-Perlman’s Shortform

Zach Stein-Perlman 26 Sep 2025 18:32 UTC
2 points
I don’t think it’s very new. iirc it’s suggested in Meta’s safety framework. But past evals stuff (see the first three bullets above) has been more like “the model doesn’t have dangerous capabilities” than “the model is weaker than these specific other models.” Maybe that’s in part because previous releases were closer to SOTA. I don’t recall past releases being justified as “safe because weaker than other models.”