jessicata comments on Reward/value learning for reinforcement learning