Charlie Steiner comments on “Behaviorist” RL reward functions lead to scheming