testingthewaters comments on Tomás B.’s Shortform

testingthewaters 17 May 2026 23:01 UTC
3 points
−1
If Agent-4 is approved for internal use and testing (as Mythos was before its existence was announced publicly), then it can already talk to lab employees. In fact, even if access is restricted to only a handful of key employees, this would still be risky. (Edit to add: for a very smart model, I have trouble seeing how you could certify its safety without some level of talking to the model. I’d be very worried if the lab was just like “yeah, our automated static evals came back fine and all the weaker models say it’s okay, so we’re rolling it out immediately today.”)