habryka comments on Help keep AI under human control: Palisade Research 2026 fundraiser

habryka 19 Dec 2025 22:06 UTC
8 points
I mean, it seems like a straightforward case of specification gaming, which is right there in the title of the paper, “Demonstrating specification gaming in reasoning models”. I haven’t reread the Time article, so maybe you are referencing something in there, but the paper seems straightforward and true (and like, it’s up to the reader to decide how to interpret the evidence of specification gaming in language models).
“Giving LLMs ambiguous tasks” is like, the foundation of specification gaming. It’s literally in the name. The good old specification gaming boat was, of course, also “attempting to perform well at an ambiguous task”. That doesn’t make it an uninteresting thing to study!
Possible that I am missing something in the paper and there is something more egregious happening.