[Preprint] The Computational Limits of Deep Learning

“The Computational Limits of Deep Learning” by Neil C. Thompson, Kristjan Greenewald, Keeheon Lee, and Gabriel F. Manso

Links:

NB: As best I can tell, this is a preprint that has not been peer-reviewed or accepted for publication, so more than usual you’ll have to make your own judgements about the quality of the results.

Abstract:

Deep learning’s recent history has been one of achievement: from triumphing over humans in the game of Go to world-leading performance in image recognition, voice recognition, translation, and other tasks. But this progress has come with a voracious appetite for computing power. This article reports on the computational demands of Deep Learning applications in five prominent application areas and shows that progress in all five is strongly reliant on increases in computing power. Extrapolating forward this reliance reveals that progress along current lines is rapidly becoming economically, technically, and environmentally unsustainable. Thus, continued progress in these applications will require dramatically more computationally-efficient methods, which will either have to come from changes to deep learning or from moving to other machine learning methods.

A few additional details: they survey ML papers to see how much compute was required to achieve the reported results, then extrapolate the trend lines to suggest we’re nearing the limits of what is economically feasible under the current regime. They believe this implies we’ll have to get more efficient if we want to see continued progress, for example through more specialized and efficient hardware or through better algorithms. My takeaway is that they believe most of the low-hanging fruit in ML has already been picked, and that additional gains in capabilities will not come as easily as past gains.
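To make the extrapolation idea concrete, here is a minimal sketch of what fitting and extending such a trend line could look like. It assumes a simple power-law relation between compute and error rate, which is my illustrative choice and not necessarily the regression the authors use. The data points are made-up placeholders, not numbers from the paper, and the function names `fit_power_law` and `compute_needed` are hypothetical.

```python
import math

def fit_power_law(compute, error):
    """Ordinary least-squares fit of log10(error) = intercept + slope * log10(compute)."""
    xs = [math.log10(c) for c in compute]
    ys = [math.log10(e) for e in error]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def compute_needed(target_error, intercept, slope):
    """Invert the fitted line: compute at which the trend reaches target_error."""
    return 10 ** ((math.log10(target_error) - intercept) / slope)

# Placeholder data (NOT from the paper): error falls as compute ** -0.25.
compute = [1e0, 1e4, 1e8]        # arbitrary compute units
error = [100.0, 10.0, 1.0]       # arbitrary error-rate units

intercept, slope = fit_power_law(compute, error)

# Under the fitted trend, how much more compute does halving the error cost?
multiplier_to_halve_error = 0.5 ** (1 / slope)

print(f"fitted slope: {slope:.3f}")
print(f"compute multiplier to halve error: {multiplier_to_halve_error:.1f}")
print(f"compute needed for error 0.1: {compute_needed(0.1, intercept, slope):.3g}")
```

The point of the sketch is the shape of the argument rather than the numbers: when the fitted slope is shallow, each further halving of the error multiplies the required compute by a large constant factor, so the cost of continued progress grows very quickly.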

The straightforward implication for safety is that, if this is true, we are further from x-risk territory than it might appear if you only look at the “numerator” of the trend lines (what we can do) without considering the “denominator” (how much it costs). Not that we are necessarily dramatically far from x-risk territory with ML, mind you, only that it’s not obviously very near term, since the economic realities of deploying this technology will soon naturally slow near-term progress unless there is significant effort or innovation.