There is an argument that evaluating AI models should be formalised, i.e., turned into verification: see https://arxiv.org/abs/2309.01933 (and discussion on Twitter with Yudkowsky and Davidad).