I left a comment on that post explaining why, for now, I think the thresholds are set high for good reason, and why I think that when the evals don't support company claims that the models can't do bioweapons/CBRN tasks, this is mostly a failure of the evals. That said, I'm also confused about how Anthropic managed to rule out uplift risks for Claude Sonnet 4 but not for Claude Opus 4:
https://www.lesswrong.com/posts/AK6AihHGjirdoiJg6/?commentId=mAcm2tdfRLRcHhnJ7