Probably? Though it's hard to say, since so little information about the model architecture was given to the public. That said, PaLM is also around 10x the size of GPT-3, and GPT-4 seems better than it (though this is likely due to GPT-4's training following Chinchilla-or-better scaling laws).
See my edit to my comment above. Sounds like GPT-3 was actually 250x more compute than GPT-2, and Claude / GPT-4 are about 50x more compute than that? (Though it's unclear to me how much insight the Anthropic folks had into GPT-4's training before the announcement, so it's possible the 50x number is accurate for Claude but not for GPT-4.)
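For what it's worth, chaining those multipliers gives a rough sense of the cumulative scale. A quick back-of-the-envelope sketch, treating both the 250x and 50x figures as speculative:

```python
# Back-of-the-envelope cumulative training-compute ratio,
# assuming the (speculative) multipliers discussed above.
gpt3_over_gpt2 = 250  # GPT-3 vs. GPT-2, per the edited comment above
gpt4_over_gpt3 = 50   # Claude / GPT-4 vs. GPT-3, rough guess

gpt4_over_gpt2 = gpt3_over_gpt2 * gpt4_over_gpt3
print(f"Implied GPT-4 vs. GPT-2 compute: ~{gpt4_over_gpt2:,}x")  # ~12,500x
```

So if both numbers held, GPT-4 would sit roughly four orders of magnitude above GPT-2 in training compute.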