michaelcohen comments on Delegative Reinforcement Learning with a Merely Sane Advisor