james387 comments on Evaluating Superhuman Models with Consistency Checks