Thanks for explaining. I now agree that the current cost of inference isn’t a very good anchor for future costs in slowdown timelines.
I’m uncertain, but I still think OpenAI is likely to go bankrupt in slowdown timelines. Here are some related thoughts:
OpenAI probably won’t pivot to the slowdown in time.
They’d have < 3 years to do so before running out of money.
Budgets are set in advance. So they’d have even less time.
All of the improvements you list cost time and money. So they’d need to continue spending on R&D before that R&D has actually reduced their cost of inference. In practice, they’d need to stop pushing the frontier even earlier, to have more time and money available.
There aren’t many generations of frontier models left before Altman would need to halt scaling R&D.
Altman is currently racing to AGI; and I don’t think it’s possible, in the slowdown hypothetical, for him to get enough evidence to convince him to stop in time.
Revenue (prices) may scale down alongside the cost of inference
Under perfect competition, widely-shared improvements in the production of a commodity will result in price decreases rather than profit increases.
There are various ways competition here is imperfect; but I think that imperfect competition benefits Google more than OpenAI. That’s really bad, since OpenAI’s finances are also much worse.
This still makes the cost of inference/cost of revenue estimates wrong; but OpenAI might not be able to make enough money to cover their debt and what you called “essential R&D”. Dunno.
Everybody but Anthropic is already locked into much of their inference (and R&D) spending via capex.
AI 2027 assumes that OpenAI uses different chips specialized in training vs. inference.
The cost of the datacenter and the GPUs it contains is fixed, and I believe it makes up most of the cost of inference today. OpenAI, via Project Stargate, is switching from renting GPUs to building its own datacenters. (This may or may not bring down the cost of inference on its own, depending on what kind of discount OpenAI got from Azure.)
So for inference improvements to be a pure win, OpenAI’s usage needs to grow to the same extent. But in this scenario capabilities have stopped improving, so it probably won’t. Usage might even shrink, if some of today’s adoption was made in anticipation of future capability improvements that now won’t arrive.
How to account for capex seems complicated:
We need to amortize the cost of the GPUs themselves. GPUs in heavy usage break down over time; OpenAI would need to make enough of a profit to repay the loans and to buy replacement GPUs. (If they have lower utilization, their existing stock of GPUs will last longer.) Also, if semiconductor progress continues, it will make their existing GPUs obsolete. I’m skeptical that semiconductor progress will continue at the same rate; but if it does, OpenAI will be locked into using obsolete and inefficient GPUs.
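As a toy illustration of the amortization point (every number here is hypothetical, not OpenAI’s actual figures):

```python
# Straight-line amortization of a GPU fleet, with illustrative numbers only.
gpu_capex = 30e9       # hypothetical up-front spend on GPUs
useful_life_years = 4  # heavy use wears GPUs out; obsolescence can cut this further

# Profit must at least cover this just to replace the fleet as it wears out.
annual_depreciation = gpu_capex / useful_life_years
print(f"~${annual_depreciation / 1e9:.1f}bn/year")  # -> ~$7.5bn/year
```

Lower utilization stretches `useful_life_years` and shrinks the annual figure, which is the parenthetical above.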
Project Stargate is planning on spending $100 billion at first, $50 billion of which would be debt. CoreWeave is paying 11-14% annual interest on its debt; if OpenAI pays 10%, it would be locked into paying ~$5 billion/year in interest. This, plus essential R&D, inference electricity, revenue share, and other costs, might or might not be doable; however, they’d almost certainly not have much money left over to buy new GPUs, save up for a balloon payment, or make a profit.
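As a sanity check on that interest figure (the rate and the debt share are this paragraph’s assumptions, not confirmed terms):

```python
# Back-of-envelope interest burden; inputs are assumptions from the text.
debt = 50e9          # assumed debt half of the initial $100bn
annual_rate = 0.10   # assumed, just under CoreWeave's reported 11-14%

annual_interest = debt * annual_rate
print(f"${annual_interest / 1e9:.0f}bn/year in interest")  # -> $5bn/year in interest
```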
Control over many datacenters is useful for coordinating a large training run, but otherwise it doesn’t mean you have to find a use for all of that compute all the time, since you could lease/sublease some for use by others (which at the level of datacenter buildings is probably not overly difficult technically, you don’t need to suddenly become a cloud provider yourself).
So the question is more about whether the global AI compute buildout finds enough demand to pay for itself, rather than what happens to the companies that build the datacenters or create the models, and whether those are the same companies. It’s not useful to let datacenters stay idle, even if that perfectly preserves the hardware’s lifespan (which seems to be several years), since progress in hardware means the time of current GPUs will be much less valuable in several years, plausibly 5x-10x less valuable. And total cost of ownership (TCO) over a datacenter’s lifetime is only 10-20% higher than the initial capex. So in a slowdown timeline, prices of GPU-time can drop all the way to maybe 20-30% of what they would need to be to pay for the initial capex before the datacenters start going idle. This proportionally reduces the cost of inference (and also of training).
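The price-floor reasoning can be sketched with the rough ratios above (the 20-30% figure in the text sits a bit above this bare-opex floor, allowing some margin):

```python
# Rough floor on GPU-time prices before idling beats selling.
# The full-cost price (recouping capex plus lifetime opex) is normalized
# to 1.0; the ratios are the rough figures from the text.
capex = 1.0
lifetime_opex = 0.15  # TCO is ~10-20% above the initial capex; take ~15%
full_cost_price = capex + lifetime_opex

# An owner who has written off capex still prefers selling GPU-time to
# idling as long as the price covers opex; that is the bare floor.
opex_floor = lifetime_opex / full_cost_price
print(f"bare opex floor: {opex_floor:.0%} of the full-cost price")  # ~13%
```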
Project Stargate is planning on spending $100 billion at first, $50 billion of which would be debt.
The Abilene site in 2026 only costs $22-35bn, and they’ve raised a similar amount for it recently, so the $100bn figure remains about as nebulous as the $500bn figure. For inference (where exclusive use of a giant training system in a single location is not necessary) they might keep using Azure, so there is probably no pressing need to build even more for now.
Though I think an AI slowdown is unlikely until at least late 2026, and they’ll need to plan to build more in 2027-2028, raising money for it in 2026; so it’s likely they’ll get to try to secure those $100bn even in the timeline where an AI slowdown comes soon after.
You seem to be assuming that there’s no significant overhead or delay from negotiating leases, entering bankruptcy, or dealing with specialized hardware, which is very plausibly false.
If nobody is buying new datacenter GPUs, that will cut GPU progress to ~zero or negative (because production is halted and implicit knowledge is lost). (It will also probably damage broader semiconductor progress.)
This proportionally reduces cost of inference (and also of training).
This reduces the cost to rent a GPU-hour, but it doesn’t reduce the cost to the owner. (OpenAI, and every frontier lab but Anthropic, will own much or all[1] of their own compute. So this doesn’t do much to help OpenAI in particular.)
I think you have a misconception about accounting. GPU depreciation appears on the income statement as part of operating expenses, subtracted from gross profit to get net profit. Depreciation due to obsolescence vs. breakdowns isn’t treated differently. If OpenAI drops its prices below the level needed to cover that depreciation, they won’t be running a (net) profit. And since they won’t be buying new GPUs, they will die in a few years, once their existing stock of GPUs breaks down or becomes obsolete. To phrase it another way: if you reduce GPU-time prices 3-5x, the global AI compute buildout has not in fact paid for itself.
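A toy income statement makes the accounting point concrete (all figures invented for illustration):

```python
# Where GPU depreciation lands on a (simplified) income statement.
# Every figure here is invented.
revenue = 20e9
inference_opex = 8e9    # electricity, hosting, staff, etc.
gpu_depreciation = 9e9  # breakdowns and obsolescence are treated the same

gross_profit = revenue - inference_opex
net_profit = gross_profit - gpu_depreciation
print(f"net profit: ${net_profit / 1e9:.0f}bn")  # -> net profit: $3bn
```

Cut prices until revenue falls below `inference_opex + gpu_depreciation` and net profit goes negative, even though the cash already spent on GPUs makes the business look cash-flow positive for a while.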
OpenAI has deals with CoreWeave and Azure; these may specify fixed prices; even if not, CoreWeave’s independence doesn’t matter here, as they also need to make enough money to buy new GPUs/repay debt. (Azure is less predictable.)
The point of the first two paragraphs was to establish relevance and an estimate for the lowest market price of compute in case of a significant AI slowdown: a level at which some datacenters will still prefer to sell GPU-time rather than stay idle (some owners of datacenters will manage to avoid bankruptcy and will be selling GPU-time even with no hope of recouping capex, as long as it remains an opex profit, assuming nobody will be willing to buy out their second-hand hardware either). So it’s not directly about OpenAI’s datacenter situation; rather, it’s a context in which OpenAI might find itself, namely with access to a lot of cheap compute from others.
I’m using “cost of inference” in a narrow sense of cost of running a model at a market price of the necessary compute, with no implications about costs of unfortunate steps taken in pursuit of securing inference capacity, such as buying too much hardware directly. In case of an AI slowdown, I’m assuming that inference compute will remain abundant, so securing the necessary capacity won’t be difficult.
I’m guessing one reason Stargate is an entity separate from OpenAI is to have the option to walk away from it if OpenAI’s future finances can’t sustain the hardware Stargate is building, in which case OpenAI might need or want to find compute elsewhere; hence the relevance of market prices of compute. Right now they are in for $18bn with Stargate specifically, out of the $30-40bn they’ve raised (depending on the success of converting into a for-profit).