(Also, we can, in fact, observe some of the AIs internals and run crude checks for things like deception. Prosaic interpretability isn’t great, but it’s also not nothing.)
(Also, we can, in fact, observe some of the AIs internals and run crude checks for things like deception. Prosaic interpretability isn’t great, but it’s also not nothing.)