Jozdien comments on Steering RL Training: Benchmarking Interventions Against Reward Hacking