Charlie Steiner comments on The Perils of Optimizing Learned Reward Functions