Caleb Biddulph comments on 2025-Era “Reward Hacking” Does Not Show that Reward Is the Optimization Target