Steven Byrnes comments on The nature of LLM algorithmic progress (v2)

Steven Byrnes 8 Feb 2026 22:06 UTC
1 point
0
Thanks!! Quick question while I think over the rest:
What data are you plotting? Where exactly did you get it (i.e., what references)?
And why is the 2021 one better than the 2023 ones? Normally we would expect the other way around, right? Does DeepMind have so much secret sauce that it’s worth more than 2 years of public knowledge? Or are the other two groups making rookie mistakes? Or am I misunderstanding the plot?
- Tao Lin 9 Feb 2026 1:04 UTC
  5 points
  0
  Gopher
  Cerebras-GPT
  Pythia
  Why is Gopher better than Pythia or Cerebras? Mostly no comment, but I think Pythia and Cerebras weren’t making any super simple, obvious mistake; they were behind 2021-era DeepMind.