John_Maxwell comments on Reward function learning: the learning process