GRI comments on Ryan Kidd’s Shortform

GRI 17 Apr 2025 22:12 UTC
3 points
0

“Lots of very small experiments playing around with various parameters” … “then a slow scale up to bigger and bigger models”

This timestamp from the Dwarkesh interview with Jeff Dean & Noam Shazeer seems to confirm this.

“I’d also guess that the bottleneck isn’t so much on the number of people playing around with the parameters, but much more on good heuristics regarding which parameters to play around with.”

That would also mostly answer this question: “If parallelized experimentation drives so much algorithmic progress, why doesn’t gdm just hire hundreds of researchers, each with small compute budgets, to run these experiments?”

It would also imply that it would be a big deal if they had an AI with good heuristics for this kind of thing.
- Garrett Baker 18 Apr 2025 7:58 UTC
  4 points
  0
  Don’t double update! I got that information from that same interview!