Roman Malov comments on Steve Byrnes’s Shortform

Roman Malov 10 May 2025 15:54 UTC
1 point
we’ve jumped to near human-baseline and slowed to a crawl at this level
A possible reason for that is the fallibility of our benchmarks. For complex tasks, it might be hard for humans to see farther than their own noses, so benchmarks built and graded by humans may fail to register progress beyond the human baseline.