My question is whether the DeepMind alignment team was aware of this. If they were not, their evals are clearly insufficient; if they were, it raises the question of why they didn’t even mention this as a “known limitation” upon release. (I assume they don’t have the power to delay the model release unless they found a more serious safety flaw.)
People may find this post interesting as well: Gemini 3 is evaluation paranoid and contaminated