You’re likely correct, but I’m not sure that’s relevant. For one, Chinchilla wasn’t announced until 2022, nearly two years after the release of GPT-3. So the slowdown is still apparent even if we assume OpenAI was nearly done training an undertrained GPT-4 (which I have seen no evidence of).
Moreover, the focus on efficiency is itself evidence of an approaching wall. To take an example from the 20th century, machines became much more energy efficient after the 70s, which is also when energy stopped getting cheaper. Why didn’t OpenAI pivot their attention to fine-tuning and efficiency after the release of GPT-2? Because GPT-2 was cheap to train and relied on a tiny fraction of all available data, so neither fine-tuning nor efficiency mattered much at the time. Efficiency is typically a reaction to scarcity.