soon when we were racing through GPT-2, GPT-3, to GPT-4. We just aren’t in that situation anymore
I don’t think this is right.
GPT-1: 11 June 2018
GPT-2: 14 February 2019 (248 days later)
GPT-3: 28 May 2020 (469 days later)
GPT-4: 14 March 2023 (1,020 days later)
Basically, the wait until the next model roughly doubled every time. By that pattern, GPT-5 ought to come around September 20, 2028, but Altman said today it’ll be out within months. (And frankly, I think o1 qualifies as a sufficiently-improved successor model, and that was released on December 5, 2024, or really September 12, 2024, if you count o1-preview; either way, a shorter wait than the GPT-3 to GPT-4 gap.)
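As a quick sanity check of that doubling arithmetic, a minimal sketch (dates from the list above; the exact extrapolated day depends on the multiplier you assume):

```python
from datetime import date, timedelta

# Release dates from the list above.
releases = {
    "GPT-1": date(2018, 6, 11),
    "GPT-2": date(2019, 2, 14),
    "GPT-3": date(2020, 5, 28),
    "GPT-4": date(2023, 3, 14),
}

# Gaps between consecutive releases.
names = list(releases)
for prev, curr in zip(names, names[1:]):
    print(f"{prev} -> {curr}: {(releases[curr] - releases[prev]).days} days")
# GPT-1 -> GPT-2: 248 days
# GPT-2 -> GPT-3: 469 days   (~1.9x the previous gap)
# GPT-3 -> GPT-4: 1020 days  (~2.2x the previous gap)

# Naive extrapolation: assume the next gap is exactly double the last one.
next_gap = 2 * (releases["GPT-4"] - releases["GPT-3"]).days
print(releases["GPT-4"] + timedelta(days=next_gap))
# 2028-10-13 with an exact 2x multiplier; a fitted multiplier just under 2
# lands in September 2028, consistent with the date quoted above.
```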
GPT-5 ought to come around September 20, 2028, but Altman said today it’ll be out within months
I don’t think what he said meant what you think it meant. Exact words:
In both ChatGPT and our API, we will release GPT-5 as a system that integrates a lot of our technology, including o3
The “GPT-5” he’s talking about is not the next generation of GPT-4, not an even bigger pretrained LLM. It is some wrapper over GPT-4.5/Orion, their reasoning models, and their agent models. My interpretation is that “GPT-5” the product and GPT-5 the hypothetical 100x-bigger GPT model are two completely different things.
I assume they’re doing these naming shenanigans specifically to confuse people and create the illusion of continued rapid progress. This “GPT-5” is probably going to look pretty impressive, especially to people on the free tier, who’ve only been familiar with e.g. GPT-4o so far.
Anyway, I think the actual reason the proper GPT-5 – that is, an LLM 100+ times as big as GPT-4 – isn’t out yet is that the datacenters powerful enough to train it are only just now coming online. It’ll probably be produced in 2026.
in gpt5, are gpt* and o* still separate models under the hood and you are making a model router? or are they going to be unified in some more substantive way?
It’s unclear exactly what the product GPT-5 will be, but according to OpenAI’s Chief Product Officer today, it’s not merely a router between GPT-4.5 and o3.
Fair enough, I suppose calling it an outright wrapper was an oversimplification. It still basically sounds like just the sum of the current offerings.
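To make the router-vs-unified distinction concrete, here is a toy sketch of what the “router” reading would mean; every name in it is a hypothetical placeholder, not OpenAI’s actual API or architecture:

```python
# Toy illustration of the "model router" interpretation: a single
# "GPT-5" entry point that merely dispatches each request to one of
# several separately-trained underlying models. All names are made up.

def looks_hard(prompt: str) -> bool:
    """Stand-in for a learned routing policy."""
    return any(w in prompt.lower() for w in ("prove", "derive", "debug"))

def call_model(name: str, prompt: str) -> str:
    """Stub: in reality this would hit a specific model backend."""
    return f"[{name} answers {prompt!r}]"

def gpt5(prompt: str) -> str:
    if looks_hard(prompt):
        return call_model("reasoning-model", prompt)  # an o3-like model
    return call_model("base-model", prompt)           # a GPT-4.5-like model

print(gpt5("prove that 2 + 2 = 4"))
```

A “unified” GPT-5, by contrast, would be a single model trained end-to-end, with no seam like the if-statement above.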
Wow, crazy timing for the GPT-5 announcement! I’ll come back to that, but first, the dates that you helpfully collected.
It’s not clear to me that this timeline points in the direction you are arguing. Exponentially increasing time between “step” improvements in models would mean that progress rapidly slows to the scale of decades. In practice this would probably look like a new paradigm, one with more low-hanging fruit, overtaking or extending transformers.
I think your point is valid in the sense that things were already slowing down by GPT-3 → GPT-4, which makes my original statement at least potentially misleading. However, research and compute investment have also been ramping up drastically; I don’t know exactly by how much, but I would guess by nearly an order of magnitude. So the wait times here may not really be comparable.
Anyway, this whole speculative discussion will soon (?) be washed out when we actually see GPT-5. The announcement is perhaps a weak update against my position, but the real thing to watch is whether GPT-5 is a qualitative improvement on the scale of previous GPT-N → GPT-(N+1) jumps. If it is, then you are right that progress has not slowed down much. My standard is whether it starts doing anything important.
You’re right that there’s nuance here. The scaling laws involved mean exponential investment → linear improvement in capability, so yeah, it naturally slows down unless you go crazy on investment… and we are, in fact, going crazy on investment. GPT-3 is pre-ChatGPT, pre-current-paradigm, and GPT-4 is nearly so. So ultimately I’m not sure it makes that much sense to compare the GPT-1-to-GPT-4 timelines to now; I just wanted to note that we’re not off-trend there.
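For concreteness, here is the arithmetic behind “exponential investment → linear improvement”, as a sketch using the standard empirical power-law shape; the exponent is the ballpark figure from Kaplan et al.’s scaling-laws paper, not a claim about OpenAI’s internal numbers:

```latex
L(C) \approx a\,C^{-\alpha}
\;\;\Rightarrow\;\;
\log L(C) = \log a - \alpha \log C,
\qquad
\text{a fixed log-loss gain } \delta
\;\Rightarrow\;
C' = C \cdot e^{\delta/\alpha}.
% With \alpha \approx 0.05 (Kaplan et al.'s ballpark for LLM compute
% scaling), each fixed step \delta of log-loss improvement multiplies
% the required compute by a constant factor: linear gains in capability
% demand exponentially growing spend.
```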