Typo: The description of Table 2 states that “In total, 148 of our 169 tasks have human baselines, but we rely on researcher estimates for 21 tasks in HCAST.” The sum is incorrect; per the table, the correct figure is 149 out of 170 tasks.