David Lindner comments on Testing Gemini models for scheming tendencies

David Lindner 1 Jun 2026 13:45 UTC
LW: 4 AF: 3
We did not try this, but if we did, I expect we’d similarly see higher numbers. However, I think we should interpret such numbers as a (potentially loose) upper bound on eval awareness. Telling a model that one of two trajectories is an evaluation makes it much easier for the model to recognize the evaluation than encountering a single trajectory on its own.
I do think this methodology is a useful test of realism, and we’ll probably do more of this in the future as we push realism further.