simon comments on "Reward is the optimization target (of capabilities researchers)"