Max H comments on Reward is the optimization target (of capabilities researchers)