Relevant metrics of performance are roughly linear in log-compute when compute is utilized effectively in the current paradigm for training frontier models.
From my perspective, it looks like performance has been steadily advancing as you scale up compute and other resources.
(This isn’t to say that pretraining hasn’t had lower returns recently, but you made a stronger claim.)
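To make "roughly linear in log-compute" concrete, here is a toy sketch (my own illustration with made-up constants, not anyone's actual scaling fit): if a benchmark score behaves like score ≈ a + b·log10(compute), then every 10x increase in compute buys the same additive gain.

```python
import math

# Toy illustration (not a fitted scaling law): score = a + b * log10(compute).
# The constants a and b are made up for the example.
a, b = 20.0, 8.0

def score(compute_flops: float) -> float:
    """Hypothetical benchmark score as a linear function of log-compute."""
    return a + b * math.log10(compute_flops)

# Each 10x increase in compute buys the same additive gain (here, 8 points).
for flops in (1e23, 1e24, 1e25, 1e26):
    print(f"{flops:.0e} FLOP -> score {score(flops):.1f}")
```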
I think one of the (many) reasons people have historically tended to miscommunicate/talk past each other so much about AI timelines is that the perceived suddenness of growth rates depends heavily on your choice of time span. (As Eliezer puts it, “Any process is continuous if you zoom in close enough.”)
It sounds to me like you guys (Thane and Ryan) agree about the growth rate of the training process, but are assessing its perceived suddenness/continuousness relative to different time spans?
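To illustrate the time-span point with a toy example (my own, with a made-up growth rate): the same smooth exponential trend looks gradual when sampled month to month and abrupt when sampled every few years.

```python
# Toy exponential trend: capability doubles every 6 months (made-up growth rate).
def capability(months: float) -> float:
    return 2 ** (months / 6)

# Zoomed in (month over month), each step is a modest ~12% gain...
print("monthly:", [round(capability(m), 2) for m in range(0, 7)])

# ...but zoomed out (sampled every 3 years), each step is a 64x jump.
print("every 3 years:", [round(capability(m)) for m in range(0, 109, 36)])
```

Same process, same growth rate; whether it reads as "continuous progress" or a "sudden jump" is mostly a function of how far apart your observation points are.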