Agreed! I may try to adopt your terminology. I've also seen folks use “cheating” instead of “task gaming”.
Here’s a footnote from the METR post discussing this distinction:
It’s common to use the term “reward hacking” to refer to any way models “cheat” on tasks, and this blogpost will use that convention. However, we think it’s helpful for conceptual clarity to remember the model doesn’t receive any reward corresponding to its scores on our evaluations. Rather, its behavior on our evaluations is evidence that it learned to “cheat” in training to receive higher reward, and since our evaluations are presumably similar to the training environments, that’s generalized to “cheating” on our evaluations.
I agree that these seem related!
https://x.com/cfgeek/status/2055705437845262477?s=46