RL can develop particular skills, and given that IMO has fallen this year, it’s unclear that further general capability improvement is essential at this point. If RL can help cobble together enough specialized skills to enable automated adaptation (where the AI itself becomes able to prepare datasets or RL environments etc. for specific jobs or sources of tasks), that might be enough. If RL enables longer contexts that can serve the role of continual learning, that also might be enough. Currently there is a lot of low-hanging fruit, and little things continue to stack.
So if pre-training is slowing, AI companies currently lack any method of effective compute scaling that relies only on training compute and one-off costs.
It’s compute that’s slowing, not specifically pre-training, because the financing/industry can’t scale much longer. The costs of training were increasing about 6x every 2 years, which together with roughly 2x per 2 years improvement in hardware price-performance resulted in a 12x increase in training compute every 2 years in 2022-2026. Possibly another 2x on top of that every 2 years from adoption of reduced floating point precision in training, going from BF16 to FP8 and soon possibly to NVFP4 (likely it won’t go any further). A 1 GW system of 2026 costs an AI company about $10bn a year. There are maybe 2-3 more years at this pace in principle, but more likely the slowdown will start gradually sooner than that, and then it’s Moore’s law (of price-performance) again, to the extent that it’s still real (which is somewhat unclear).
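The arithmetic above can be made explicit with a toy projection. All figures here are the rough estimates from the text (6x cost growth and ~2x price-performance gain per 2 years, ~$10bn/year for a 1 GW system in 2026), not measured data, and the extrapolation is only meant to show why the financing can’t sustain this pace for long:

```python
# Rough estimates from the text above; not measured data.
COST_GROWTH_PER_2YR = 6   # training cost: ~6x every 2 years
PRICE_PERF_PER_2YR = 2    # hardware price-performance: ~2x every 2 years
PRECISION_PER_2YR = 2     # possible extra factor from BF16 -> FP8 -> NVFP4

# Training compute grows as cost growth times price-performance gains.
compute_growth = COST_GROWTH_PER_2YR * PRICE_PERF_PER_2YR
print(f"compute per 2 years: {compute_growth}x")                        # 12x
print(f"with precision gains: {compute_growth * PRECISION_PER_2YR}x")   # 24x

# Annual cost of a frontier training system if the 6x/2yr pace continues,
# anchored at ~$10bn/year for a 1 GW system in 2026.
cost_bn = 10.0
for year in range(2026, 2031, 2):
    print(f"{year}: ~${cost_bn:.0f}bn/year")
    cost_bn *= COST_GROWTH_PER_2YR
```

By 2030 the same pace implies on the order of $360bn/year for a single company’s training system, which is where the "2-3 more years in principle" ceiling comes from.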