ESRogs comments on GPT-4

ESRogs 15 Mar 2023 22:06 UTC
3 points
0
Still, this advance seems like a less revolutionary leap over GPT-3 than GPT-3 was over GPT-2, if Bing’s early performance is a decent indicator.
Seems like this is what we should expect, given that GPT-3 was 100x as big as GPT-2, whereas GPT-4 is probably more like ~10x as big as GPT-3. No?
EDIT: just found this from Anthropic:
"We know that the capability jump from GPT-2 to GPT-3 resulted mostly from about a 250x increase in compute. We would guess that another 50x increase separates the original GPT-3 model and state-of-the-art models in 2023."
- Lost Futures 16 Mar 2023 0:43 UTC
  3 points
  0
  Probably? Though it’s hard to say, since so little information about the model architecture was made public. That said, PaLM is also around 10x the size of GPT-3, and GPT-4 seems better than it (though this is likely due to GPT-4’s training following Chinchilla-or-better scaling laws).
  - ESRogs 16 Mar 2023 1:17 UTC
    2 points
    0
    See my edit to my comment above. Sounds like GPT-3 actually used about 250x the compute of GPT-2. And Claude / GPT-4 use about 50x more compute than that? (Though it’s unclear to me how much insight the Anthropic folks had into GPT-4’s training before the announcement. So it’s possible the 50x number is accurate for Claude and not for GPT-4.)
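
For anyone who wants the two jumps side by side, here is a rough sketch that puts the multipliers quoted in this thread on a log scale. The 250x and 50x figures are taken straight from the Anthropic quote above (the 50x is their guess, and may not apply to GPT-4). The variable and function names are made up for illustration, and treating "size of the leap" as log10 of the compute multiplier is an assumption, not something anyone in the thread claimed.

```python
import math

# Compute multipliers quoted in the thread (from the Anthropic statement).
# The 50x figure is Anthropic's guess for "state-of-the-art models in 2023";
# whether it applies to GPT-4 specifically is unknown.
GPT2_TO_GPT3 = 250
GPT3_TO_2023_SOTA = 50


def orders_of_magnitude(multiplier: float) -> float:
    """Express a compute multiplier as orders of magnitude (log base 10)."""
    return math.log10(multiplier)


first_jump = orders_of_magnitude(GPT2_TO_GPT3)
second_jump = orders_of_magnitude(GPT3_TO_2023_SOTA)

# Multipliers compound, so the cumulative increase is their product.
cumulative = GPT2_TO_GPT3 * GPT3_TO_2023_SOTA

print(f"GPT-2 -> GPT-3:     {first_jump:.2f} orders of magnitude")
print(f"GPT-3 -> 2023 SOTA: {second_jump:.2f} orders of magnitude")
print(f"Second jump relative to the first (log scale): {second_jump / first_jump:.0%}")
print(f"Cumulative multiplier over GPT-2: {cumulative:,}x")
```

On these numbers the second jump is smaller than the first in log terms, which fits the "less revolutionary leap" intuition, but only under the assumption that the 50x guess is in the right ballpark for GPT-4.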