Interesting point. But could I check why you and @Vladimir_M are confident Moore’s law continues at all?
I’d have guessed maintaining the rate of gains in hardware efficiency will require exponentially increasing chip R&D spending and researcher hours.
But if total spending on chips has plateaued, then Nvidia etc.’s R&D spending will have also plateaued, which I think would imply hardware efficiency gains drop to near zero.
It’s an anchor, something concrete to adjust predictions around; the discussion in this thread is about the implications of the anchor rather than its strength (so confidence in it isn’t really implied). Moore’s law in its transistor-count-per-die form has mostly stopped, but the historical trend seems to be surviving in its price-performance form (which should really be about compute per datacenter-level total cost of ownership). So maybe it keeps going as it did for decades; specific predictions for what would keep Moore’s law going at any given time were always hard, even as it did continue. Currently the driver might be advanced packaging (making the parts of a datacenter outside the chips cheaper per transistor).
If Moore’s law stops even in its price-performance form, then the AI scaling slowdown in 2030-2050 gets even stronger than what this post explores. Also, growth in compute spending probably doesn’t completely plateau (progress in adoption alone would feed growth for many years), and that to some extent compensates for compute not getting cheaper as fast as it used to (if that happens).
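A toy version of that compensation point, with purely illustrative numbers (the ~20%/yr spending growth is a placeholder, not a figure from the thread):

```python
# Total compute growth is (roughly) spending growth times price-performance growth.
# Illustrative numbers only: suppose FLOP/$ stalls entirely while spending on
# compute still grows ~20%/yr through continued adoption.

price_perf_growth = 1.0   # FLOP/$ trend if Moore's law stops even for price-performance
spending_growth = 1.2     # hypothetical ~20%/yr growth in compute spending

compute_growth = spending_growth * price_perf_growth
print(f"total compute growth: {compute_growth:.2f}x/yr")  # 1.20x/yr: slower, but not flat
```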
I agree with all those points. The main point I’m making is just that hardware efficiency seems like it should also be a function of total compute capex, which would mean it moves in the same way as algorithmic progress, and that would further exacerbate a slowdown. Do you basically agree with that?
We haven’t yet seen the efficiency fallout of the 2022-2030 rapid scaling, and semiconductor advancements take many years. So if some kind of experience curve effect wakes up as a central factor in semiconductor efficiency, then the 2030s might be fine on the Moore’s law front. But if it wasn’t legibly a major factor recently, it’s not obvious it must become all that important even with the unusual inputs from AI datacenter scaling.
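For reference, a minimal sketch of the kind of experience-curve (Wright’s law) effect being gestured at; the 20% learning rate and the 10x cumulative-production figure are made-up illustrations, not estimates from the thread:

```python
import math

def unit_cost(cum_production, c0=1.0, q0=1.0, learning_rate=0.2):
    """Cost per unit after cum_production units, assuming each doubling of
    cumulative output cuts unit cost by learning_rate (20% here, illustrative)."""
    b = -math.log2(1.0 - learning_rate)   # Wright's law exponent, ~0.32 for a 20% rate
    return c0 * (cum_production / q0) ** (-b)

# E.g. if the 2022-2030 buildout were to 10x cumulative chip production, a 20%
# learning rate would imply unit cost falling to ~0.48 of today's, a ~2.1x gain.
print(unit_cost(10.0))
```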
You should actually tag @Vladimir_Nesov instead of Vladimir M, as Vladimir Nesov was the original author.
Ah thanks.
My understanding is that the hardware performance trend in FLOP/area of 1.35x/yr (precision and sparsity held constant) has mostly been driven by transistor density at ~1.25x/yr (transistors/area) and only a little by Nvidia chip design at ~1.1x/yr (FLOP/transistor). I don’t think it’s true that transistor density (traditional ‘Moore’s law’) has stopped; it seems like it has only slowed down by at most around 2x.

My best guess, conditional on no AGI by 2045, is that we see a smooth continuation of gradually slower transistor density improvement in the traditional paradigms (i.e., ‘2nm’-branded and ‘1.4nm’-branded nodes that use ASML’s new High-NA, and planned ‘Hyper-NA’), which I’d guess will average around 1.1-1.2x/yr, and/or some relatively more spiky progress from paradigm shifts in computing and chip fabrication. So my overall, all-things-considered guess for FLOP/area is that it averages around 1.25x/yr through 2045 | no AGI.

Then to get to the price-performance trend (FLOP/$), you have to multiply by the area/$ trend. I think with no AGI by 2045, you probably get a 4x one-time gain from Nvidia’s margin gradually going down, and then a steady trend of general manufacturing and economic efficiency bringing down manufacturing costs, naively around 5%/yr. That nets out at around a combined 1.1x/yr trend in area/$, so combined with the 1.25x/yr trend in FLOP/area, that’s why I expect the hardware price-performance trend to average around 1.4x/yr through 2045, conditional on no AGI.
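To make the arithmetic explicit, here is a back-of-the-envelope check of those numbers; the only input not stated above is the assumption that the one-time 4x margin gain is spread evenly over roughly 20 years (through 2045).

```python
# Back-of-the-envelope check of the numbers above. All inputs are from the
# comment; the 20-year window for spreading the one-time margin gain is an
# added assumption (roughly 2025 through 2045).

years = 20

flop_per_area = 1.25                 # FLOP/area trend, x/yr (density + chip design)

margin_gain_total = 4.0              # one-time gain from Nvidia margin compressing
margin_gain_per_yr = margin_gain_total ** (1 / years)   # ~1.07x/yr if spread evenly

manufacturing_per_yr = 1.05          # ~5%/yr manufacturing/economic efficiency

area_per_dollar = margin_gain_per_yr * manufacturing_per_yr   # ~1.13x/yr, i.e. "around 1.1x/yr"
flop_per_dollar = flop_per_area * area_per_dollar             # ~1.41x/yr, i.e. "around 1.4x/yr"

print(f"area/$: {area_per_dollar:.2f}x/yr   FLOP/$: {flop_per_dollar:.2f}x/yr")
```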
This makes sense; my concern is just that the 1.25x/yr trend in transistor density is itself driven by exponentially increasing investment into semiconductor research. Going forward, most of that will be funded via spending on AI chips. So if AI chip spending stops growing, so does the R&D spending on increasing transistor density.
I could see an argument though that mobile phones etc. would be able to fund enough investment to keep transistor research growing.
4x one-time gain from Nvidia’s margin gradually going down

Nvidia’s margin is only about 25% of the datacenter’s cost, though. See also the numbers on Nvidia TCO in the recent SemiAnalysis post on TPUs ($22bn per GW is “Nvidia GPU content”, out of $34bn of IT content, or about $50bn together with the buildings and infrastructure). And it’s not getting all the way down to 0%, though TPUs are already credibly competing (for running very large models, in the 20-100T total-params range, Nvidia’s answer to Ironwood is Rubin Ultra NVL576, which will only arrive 2 years later).
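As a rough illustration of why that matters, a sketch using the per-GW figures cited above and taking the “margin is about 25% of the datacenter” claim at face value: even complete margin compression only buys roughly a 1.3x gain at the datacenter-TCO level, not 4x.

```python
# Rough illustration using the per-GW figures from the SemiAnalysis post cited
# above; reading "Nvidia margin ~ 25% of the datacenter" as 25% of total TCO is
# an interpretation, and the result is an upper bound (margin won't go to 0%).

gpu_content = 22e9     # "Nvidia GPU content" per GW
it_content = 34e9      # all IT content per GW (context; not used below)
total_tco = 50e9       # including buildings and infrastructure

print(f"GPU content share of TCO: {gpu_content / total_tco:.0%}")      # ~44%

nvidia_margin = 0.25 * total_tco                 # ~$12.5bn per GW
gain = total_tco / (total_tco - nvidia_margin)   # if that margin vanished entirely
print(f"datacenter-level gain from zero Nvidia margin: ~{gain:.2f}x")  # ~1.33x
```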
This goes both ways: price-performance might keep improving for the non-chip parts of datacenters even if it stalls for chips. I don’t think the trend can be reconstructed from concrete considerations this far in advance; only the abstract trend itself has the nature of an anchor.