Boaz Barak comments on Cole Wyeth’s Shortform

Boaz Barak 6 Dec 2025 22:22 UTC
22 points
0
I wouldn’t say that people in labs don’t care about benchmarks, but I think the perception of how much we care about them is exaggerated. Frontier labs are now multi-billion-dollar businesses with hundreds of millions of users. A normal user trying to decide whether to use a model from provider A or B doesn’t know or care about benchmark results.

We do have reasons to care about long-horizon tasks in general, and tasks related to AI R&D in particular (as we have been open about), but the METR benchmark has nothing to do with it.