Rohin Shah comments on [AN #100]: What might go wrong if you learn a reward function while acting