Shashwat Saxena comments on Steering RL Training: Benchmarking Interventions Against Reward Hacking