I think it’s overdetermined by Blackwell NVL72/NVL36 and long reasoning training that there will be no AI-specific “crash” until at least late 2026. Reasoning models want a lot of tokens, but their current use is constrained by cost and speed, and these issues will be going away to a significant extent. Already Google has Gemini 2.5 Pro (taking advantage of TPUs), and within a few months OpenAI and Anthropic will make reasoning variants of their largest models practical to use as well (those pretrained at the scale of 100K H100s / ~3e26 FLOPs, meaning GPT-4.5 for OpenAI).
The same practical limitations (as well as the novelty of the technique) mean that long reasoning models aren’t using as many reasoning tokens as they could in principle. Everyone is still at the stage of getting long reasoning traces to work at all, rather than scaling things like the context length models can effectively use (in products rather than only in internal research). It’s plausible that contexts with millions of reasoning tokens can be put to good use, where other training methods failed to make contexts at that scale work well.
So later in 2025 there’s better speed and cost, driving demand in terms of the number of prompts/requests, and in early to mid-2026 potentially longer reasoning traces, driving demand in terms of token count. After that, it depends on whether capabilities get much better than Gemini 2.5 Pro. Pretraining scale in deployed models will only advance 2x-5x by mid-2026 compared to now (using 100K-200K Blackwell-chip training systems built in 2025), which is not a large enough change to be very noticeable, so it’s not by itself sufficient to prevent a return of the vaguely pessimistic sentiment of late 2024, and other considerations might get more sway over funding outcomes. But even then, OpenAI might get to ~$25bn in annualized revenue that won’t be going away, and in 2027 or slightly earlier there will be models pretrained for ~4e27 FLOPs using the training systems built in 2025-2026 (400K-600K Blackwell chips, 0.8-1.4 GW, $22-35bn), which as a 10x-15x change (compared to the models deployed or soon to be deployed in 2025) is significant enough to get noticeably better across the board, even if nothing substantially game-changing gets unlocked. So the “crash” might be about revenue no longer growing 3x per year, and the next-generation training systems built in 2027-2028 therefore not reaching the $150bn scale they might otherwise have aspired to.
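The 10x-15x figure follows directly from the compute numbers given above; a quick sketch (all figures taken from the comment, none of my own):

```python
# Compute-scaling arithmetic from the comment.
current_flops = 3e26  # ~100K H100s pretraining scale (e.g. GPT-4.5)
next_flops = 4e27     # 2027-era models on 2025-2026 Blackwell systems

ratio = next_flops / current_flops
print(f"{ratio:.0f}x")  # ~13x, inside the 10x-15x range cited
```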
Thanks, I might be underestimating the impact of new Blackwell chips with improved computation.
I’m skeptical that offering “chain-of-thought” bots to more customers will make a significant difference. But I might be wrong, especially if new model architectures come out as well.
And if corporations throw enough cheap compute at it, along with widespread personal data collection, they can reach commercially very useful model functionality. My hope is that there will be a market crash before that can happen, so that we can enable other concerned communities to restrict the development and release of dangerously unscoped models.
But even then, OpenAI might get to ~$25bn annualized revenue that won’t be going away
the impact of new Blackwell chips with improved computation
It’s about world size (the number of chips joined in one NVLink scale-up domain, 72 for NVL72), not raw computation, and it has a startling effect that probably won’t recur with future chips, since Blackwell sufficiently catches up to models at the current scale.
But even then, OpenAI might get to ~$25bn annualized revenue that won’t be going away
What is this revenue estimate assuming?
The projection for 2025 is $12bn at 3x/year growth (about 1.1x per month, so roughly $1.7bn per month at the end of 2025 and $3bn per month in mid-2026). My pessimistic timeline above assumes this continues up to either the end of 2025 or mid-2026 and then stops growing after the hypothetical “crash”, which gives $20-36bn per year.
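The monthly figures can be checked with a short sketch, assuming constant exponential growth and that the $12bn is the total revenue summed over the twelve months of 2025 (both assumptions mine, matching the comment's figures):

```python
# Revenue projection: $12bn over 2025 at 3x/year growth, monthly rate
# m(t) = m0 * g**t, with the twelve monthly rates summing to 12 ($bn).
g = 3 ** (1 / 12)                       # ~1.096x per month, i.e. ~1.1x
m0 = 12 / sum(g**t for t in range(12))  # starting monthly rate, ~$0.58bn

end_2025 = m0 * g**12  # ~$1.7bn/month -> ~$20bn annualized
mid_2026 = m0 * g**18  # ~$3.0bn/month -> ~$36bn annualized
print(round(end_2025, 1), round(mid_2026, 1))
```

Annualizing the two endpoints (12 × $1.7bn and 12 × $3.0bn) gives the $20-36bn range.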
Thanks, I’ve got to say I’m a total amateur when it comes to GPU performance, so I’ll take the time to read your linked comment to understand it better.