Steven Byrnes comments on “Behaviorist” RL reward functions lead to scheming