They do test GPT-4.1 and GPT-4.1-mini, which start out with high rates of reward hacking (though the fine-tuning data is generated by 4o-mini), and find that reward hacking goes down:
> We train different models in the OpenAI family on non-hacks generated by 4o-mini. We find re-contextualized training:
>
> - Boosts hacking for GPT-4o.
> - Decreases hacking for GPT-4.1-mini and GPT-4.1, which start out at high rates.
>
> Nevertheless, re-contextualized training results in strictly higher hack rates than standard training for every base model.
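To make the comparison concrete, here is a minimal sketch of how the two fine-tuning sets could be assembled, assuming that re-contextualized training pairs 4o-mini's non-hack responses with a training prompt that differs from the prompt they were generated under, while standard training keeps the generation prompt. The example data, field names, and `recontextualize` helper are illustrative placeholders, not details from the paper.

```python
import json

# Illustrative non-hack transcripts generated by 4o-mini under their original prompts.
# Contents and field names are hypothetical placeholders.
non_hacks = [
    {
        "generation_prompt": "Write tests that verify the sorting function.",
        "response": "def test_sort():\n    assert sort([2, 1]) == [1, 2]",
    },
]

def recontextualize(generation_prompt: str) -> str:
    """Hypothetical helper: return a different prompt to pair with the response at
    training time. The defining feature of re-contextualized training assumed here
    is only that this prompt differs from the generation-time prompt."""
    return "<<alternative task framing>> " + generation_prompt

def build_dataset(examples, recontextualized: bool):
    """Build chat-style fine-tuning records from the same non-hack responses.

    standard training       -> (generation prompt, response)
    re-contextualized       -> (different prompt,  same response)
    """
    records = []
    for ex in examples:
        prompt = (
            recontextualize(ex["generation_prompt"])
            if recontextualized
            else ex["generation_prompt"]
        )
        records.append({
            "messages": [
                {"role": "user", "content": prompt},
                {"role": "assistant", "content": ex["response"]},
            ]
        })
    return records

# Write one JSONL file per condition, to fine-tune each base model on.
for name, recontextualized in [("standard.jsonl", False), ("recontextualized.jsonl", True)]:
    with open(name, "w") as f:
        for record in build_dataset(non_hacks, recontextualized):
            f.write(json.dumps(record) + "\n")
```

Both files contain identical assistant responses; only the user-turn context differs, which is the variable the quoted results isolate across base models.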