Rohin Shah comments on Reward functions and updating assumptions can hide a multitude of sins