“I’d also guess that the bottleneck isn’t so much on the number of people playing around with the parameters, but much more on good heuristics regarding which parameters to play around with.”
That would mostly answer this question as well:
“If parallelized experimentation drives so much algorithmic progress, why doesn’t gdm just hire hundreds of researchers, each with small compute budgets, to run these experiments?”
It would also imply that it would be a big deal if they had an AI with good heuristics for this kind of thing.
This Dwarkesh timestamp with Jeff Dean & Noam Shazeer seems to confirm this.
Don’t double update! I got that information from that same interview!