Rachel Shu comments on METR: Measuring AI Ability to Complete Long Tasks

Rachel Shu 20 Mar 2025 13:00 UTC
Possibly, but you also have to consider that you can spin up arbitrarily many instances of the LLM. In that case you might expect the trend to go even faster, since you’re now scaling on two axes, and parallel compute scales exceptionally well on work that can be split up.

Parallel years don’t trade off exactly with years in series. All three of these are 160 person-years, but “20 people given 8 years” might do much more than 160 people given one year, or 1 person given 160 years, depending on the task.
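
To make the "depending on the task" part concrete, here is a rough sketch using Amdahl's law as a stand-in model. That choice is my assumption and not something from the METR post, and the `serial_fraction` values are made up for illustration. It only captures the cost of work that can't be parallelized, so it always favors the single long-running worker; it says nothing about the reasons one person given 160 years might do worse (lifespan, stale knowledge, and so on).

```python
def amdahl_speedup(workers: int, serial_fraction: float) -> float:
    """Amdahl's law: speedup from `workers` when `serial_fraction` of the
    task cannot be parallelized."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / workers)


def effective_serial_years(workers: int, years: float, serial_fraction: float) -> float:
    """Years of single-worker progress achieved by `workers` over `years`."""
    return years * amdahl_speedup(workers, serial_fraction)


# Three ways to spend the same 160 person-years.
allocations = [(1, 160), (20, 8), (160, 1)]

# Hypothetical serial fractions: perfectly parallel, mostly parallel, half serial.
for serial_fraction in (0.0, 0.05, 0.5):
    print(f"serial fraction = {serial_fraction}")
    for workers, years in allocations:
        progress = effective_serial_years(workers, years, serial_fraction)
        print(f"  {workers:>3} workers x {years:>3} years -> {progress:6.1f} effective years")
```

Under this toy model the three allocations only tie when the task is perfectly parallel. As soon as some of the work is inherently serial, 20 workers for 8 years gets well ahead of 160 workers for one year.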