I’ve heard that GPT-4 base sometimes intuits that it’s non-human, but I haven’t replicated this myself or seen the actual logs. Still, if true, it seems like decent evidence against both of these.
I suspect current models often “cheat” by simply checking whether there’s anything in context at all. Most evals run on fresh instances, while typical use accumulates casual or irrelevant stuff in context pretty quickly.
I heard that too. Though frequency matters!