mishka comments on How familiar is the Lesswrong community as a whole with the concept of Reward-modelling?