I agree. If Google wanted to join the commitments but not necessarily publish eval results by the time of external deployment, it should have clarified “we’ll publish within 2 months after external deployment” or “we’ll do evals on our most powerful model at least every 4 months rather than doing one round of evals per model” or something.