Back in 2020, a group at OpenAI ran a conceptually simple test to quantify how much AI progress was attributable to algorithmic improvements. They took ImageNet models that were state-of-the-art at various points between 2012 and 2020 and measured how much compute was needed to train each one up to the level of AlexNet (the state of the art in 2012). The main finding: over ~7 years, the compute required fell by ~44x. In other words, algorithmic progress yielded a compute-equivalent doubling time of ~16 months (though the error bars are large in both directions).
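That doubling time is just back-of-envelope arithmetic on the two headline numbers, assuming the efficiency gains were spread evenly over the window:

$$\text{doubling time} \approx \frac{7 \times 12\ \text{months}}{\log_2 44} \approx \frac{84}{5.46} \approx 15.4\ \text{months},$$

which is consistent with the ~16 months they report.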
Personally, I would be more interested in the reverse of this test: take all the prior state-of-the-art models and ask how long you would need to train them to match the benchmark performance of the current state of the art.

Would that even work at all? Is there some (non-astronomically large) amount of training that makes AlexNet as capable as current state-of-the-art image recognition models?

The experiment they ran is a little like asking, "At what age can an IQ-150 person do what an adult IQ-70 person can do?" The more interesting question is, "How long does it take to make up for being IQ 70 instead of IQ 150?"