Coauthor of the paper here. It’s helpful to hear how people have interpreted our work, though I want to try to clear up a few things quickly.
Firstly, we don’t claim that data must make up the remaining gap, and we certainly are not claiming it explains capabilities improvements more broadly. We try to be pretty precise about the definition of algorithmic progress under the CEG framework, which is computed specifically from loss rather than downstream performance metrics (I agree, the relationship between the two is neither straightforward nor well understood). We say explicitly that the CEG framework cannot capture many important innovations for performance and efficiency (instruction fine-tuning, constitutional AI, parallelism, etc.). We also show highly counterintuitive, undesirable properties of the CEG framework, which make interpretation quite difficult.
Speaking for myself and not my coauthors here: it’s flattering that you think our results are too good, though I’m not sure that was the intent of your comment :) I think the results would be interesting even if we over- or undershot existing estimates by 10x. We find two important results that hold regardless of how well our estimates align with the literature: scale seems to be necessary for much of the perceived algorithmic gains, and scale-dependence makes the framework for measuring these gains behave strangely. Those two facts are worth considering in their own right, and they hold even for modest differences in scaling exponents across architectures. It is reassuring that we recover so much of the other estimates, but strictly speaking, we could have found way more or way less, and our results would still raise important points.
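To make the scale-dependence point concrete, here’s a minimal toy sketch (my own illustration, not code or fitted numbers from the paper): suppose two architectures’ losses follow compute power laws L(C) = a·C^(-b) with the same intercept but slightly different exponents, and take one common way to operationalize a compute-equivalent gain, namely the factor by which the old architecture’s compute must be scaled up to match the new one’s loss at a given budget.

```python
# Hypothetical power-law loss curves for two architectures:
#   L_old(C) = a_old * C**(-b_old)
#   L_new(C) = a_new * C**(-b_new)
# The constants below are made up for illustration, not fitted
# values from the paper.
a_old, b_old = 10.0, 0.050
a_new, b_new = 10.0, 0.055  # modestly steeper scaling exponent

def ceg(compute):
    """Compute-equivalent gain at a given compute budget: the factor
    by which the old architecture's compute must be scaled up to
    match the new architecture's loss at `compute`."""
    target_loss = a_new * compute ** -b_new
    # Invert L_old(c_old) = target_loss for c_old.
    c_old = (a_old / target_loss) ** (1.0 / b_old)
    return c_old / compute

for c in (1e18, 1e21, 1e24):
    print(f"C = {c:.0e} FLOP -> CEG = {ceg(c):.0f}x")
```

Even with only a ~10% difference in exponents, the measured gain roughly quadruples (from about 63x to about 251x) across six orders of magnitude of compute, so the same pair of architectures yields very different “algorithmic progress” numbers depending on the scale at which you evaluate them.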
Lastly, I think our results don’t point to “better data has made all the difference.” They really point to “a lot of new, specific things lead to a lot of new, specific differences in capabilities, and it’s hard to count those up using the same units.” I think that’s a really important (and difficult) direction for further research, which may shed light on how important data has been!