Wow, crazy timing for the GPT-5 announcement! I’ll come back to that, but first the dates that you helpfully collected:
It’s not clear to me that this timeline points in the direction you are arguing. Exponentially increasing time between “step” improvements in models would mean that progress rapidly slows to the scale of decades. In practice this would probably look like a new paradigm with more low-hanging fruit overtaking or extending transformers.
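For concreteness, here's the back-of-envelope version. A minimal sketch, assuming the dates in question are the standard OpenAI announcement dates (my assumption) and that the gap keeps doubling, which is exactly the trend being extrapolated:

```python
from datetime import date

# Assumed announcement dates, GPT-1 paper through GPT-4 release.
releases = {
    "GPT-1": date(2018, 6, 11),
    "GPT-2": date(2019, 2, 14),
    "GPT-3": date(2020, 5, 28),
    "GPT-4": date(2023, 3, 14),
}

names = list(releases)
for prev, nxt in zip(names, names[1:]):
    months = (releases[nxt] - releases[prev]).days / 30.44
    print(f"{prev} -> {nxt}: ~{months:.0f} months")
# ~8, ~15, ~34 months: the gap roughly doubles at each step.

# Naive extrapolation of that doubling (an assumption, not a forecast):
gap_months = 34
for n in range(5, 8):
    gap_months *= 2
    print(f"GPT-{n - 1} -> GPT-{n}: ~{gap_months / 12:.0f} years")
# ~6, ~11, ~23 years: "decades" territory within two or three more steps.
```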
I think your point is valid in the sense that things were already slowing down by GPT-3 → GPT-4, which makes my original statement at least potentially misleading. However, research and compute investment have also been ramping up drastically—I don’t know by exactly how much, but I would guess nearly an order of magnitude? So the wait times here may not really be comparable.
Anyway, this whole speculative discussion will soon (?) be washed out when we actually see GPT-5. The announcement is perhaps a weak update against my position, but the real thing to watch is whether it is a qualitative improvement on the scale of previous GPT-N → GPT-(N+1) jumps. If it is, then you are right that progress has not slowed down much. My standard is whether it starts doing anything important.
You’re right that there’s nuance here. The scaling laws involved mean exponential investment → linear improvement in capability, so yeah, it naturally slows down unless you go crazy on investment… and we are, in fact, going crazy on investment. GPT-3 is pre-ChatGPT, pre-current paradigm, and GPT-4 is nearly so. So ultimately I’m not sure it makes that much sense to compare the GPT-1 → GPT-4 timelines to now. I just wanted to note that we’re not off-trend there.
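To spell out the "exponential investment → linear improvement" step: the compute scaling laws say loss falls as a power law in training compute (the exponent below is roughly the Kaplan et al. 2020 value, used here purely for illustration):

$$ L(C) \approx \left(\frac{C_0}{C}\right)^{\alpha} \quad\Longrightarrow\quad -\log L \approx \alpha\,(\log C - \log C_0), $$

so "capability", read as log-loss, improves only linearly in $\log C$, and each fixed increment costs a constant multiplicative factor of compute. With $\alpha \approx 0.05$, halving the loss takes about $2^{1/0.05} = 2^{20} \approx 10^6\times$ the compute, which is why flat investment means stalling and staying on-trend requires the current investment explosion.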