Indeed, the METR trendline alone doesn’t strongly distinguish between these hypotheses (unless it started to go superexponential). But other sources of evidence might help distinguish them, e.g. we can plot performance as a function of tokens and see whether the slope is changing over the years. I’m excited for more evidence along these lines to be collected.
Toby Ord’s recent work digs into this. In OpenAI’s case, I think it suggests most capability improvements are coming from continuing the curve for longer, but there is also some effect from a steeper curve.
https://x.com/tobyordoxford/status/1999870642032967987?s=20
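
To make the "longer curve vs steeper curve" distinction concrete, here is a rough sketch of the kind of analysis I have in mind. It is not Toby Ord's method, and all the numbers are made up for illustration. It assumes performance is roughly linear in log10(tokens), fits one line per model year, and splits the gain in peak performance into a part from running the old curve out to the new token budget and a part from the curve itself changing (slope and intercept together). The names `fit_log_linear` and `decompose_gain` are hypothetical.

```python
import numpy as np

def fit_log_linear(tokens, scores):
    """Least-squares fit of score = slope * log10(tokens) + intercept."""
    slope, intercept = np.polyfit(np.log10(tokens), scores, deg=1)
    return slope, intercept

def predict(fit, tokens):
    """Evaluate a fitted (slope, intercept) line at a given token budget."""
    slope, intercept = fit
    return slope * np.log10(tokens) + intercept

def decompose_gain(old_fit, new_fit, old_max_tokens, new_max_tokens):
    """Split the gain in peak score into two parts.

    longer: the old curve extrapolated from the old budget to the new one.
    curve_change: new curve minus extrapolated old curve at the new budget
                  (this lumps together slope and intercept changes).
    """
    old_peak = predict(old_fit, old_max_tokens)
    old_extrapolated = predict(old_fit, new_max_tokens)
    new_peak = predict(new_fit, new_max_tokens)
    return old_extrapolated - old_peak, new_peak - old_extrapolated

# Hypothetical data, invented purely for illustration (not real benchmark results).
curves = {
    2024: {"tokens": np.array([1e3, 1e4, 1e5]),
           "scores": np.array([20.0, 30.0, 40.0])},
    2025: {"tokens": np.array([1e3, 1e4, 1e5, 1e6]),
           "scores": np.array([22.0, 34.0, 46.0, 58.0])},
}

fits = {year: fit_log_linear(c["tokens"], c["scores"]) for year, c in curves.items()}

longer, curve_change = decompose_gain(
    fits[2024], fits[2025],
    old_max_tokens=curves[2024]["tokens"].max(),
    new_max_tokens=curves[2025]["tokens"].max(),
)

for year, (slope, intercept) in fits.items():
    print(f"{year}: slope per 10x tokens = {slope:.2f}, intercept = {intercept:.2f}")
print(f"gain from longer curve: {longer:.2f}, gain from curve change: {curve_change:.2f}")
```

With real data the log-linear assumption would need checking, and extrapolating the old curve beyond the token budgets it was actually measured at is itself a strong assumption.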