bayesian comments on Reinforcement Learner Wireheading