Hmm interesting.
I see the case for focusing only on compute (since it's relatively objective), but it still seems important to try to factor in some amount of algorithmic progress in pretraining (which means the cost of achieving GPT-6-level performance will keep dropping over time).
The points on Epoch are getting outside my expertise – I see my role as synthesising what experts are saying. It’s good to know these critiques exist, and it would be cool to see them written up and discussed.