ariaw comments on Steering RL Training: Benchmarking Interventions Against Reward Hacking