How the AI Labs Make Profit (Maybe, Eventually)
I wrote this essay as a submission to Dwarkesh Patel’s blog prize, though I have been meaning to write this up for a while.
Usually, for a company to become profitable, it needs to increase revenue, decrease costs, or some mixture of the two. For AI companies in their current form, I think there is a third way to become profitable that looks like increasing revenue but is distinct from what they are currently doing: internal deployment, where they spin up companies inside the company.
First, the AI companies currently aren’t facing much pressure to become profitable. That’s partly why OpenAI and Anthropic are the first companies to reach ~$900B valuations while remaining cash flow negative. They’ve had the luxury of not being profitable and focusing on growth because the market has been willing to fund that growth. This lets ideologies persist within the companies that eventually might not fly, like “we are going post-economic, money won’t matter” or “we will build the machine god and ask it to make money”. But eventually, companies will be forced to become profitable. There is only about one more round of capital left in which the companies can remain unprofitable. Perhaps OpenAI/Anthropic could raise $250-500B at a $1.5-2.5T valuation, but it seems very unlikely that they could raise $1T+ at a $4T+ valuation.
It’s fairly hard to imagine the AI labs cutting costs enough to become profitable. They could prioritize developing and releasing smaller models, but it seems difficult to stay in the race without pushing the frontier. They could try to cut their research costs, but these are likely to increase as demand for larger and more intelligent models continues. Given company ambitions and investor desires, cost-cutting doesn’t seem like the method they will choose.
It is more plausible that the labs could increase their revenues by charging more. Many individual users are already paying $2,000/year/company, and some enterprises are likely paying $100M+/year. Some users would be willing to spend 10-100x that. But price discrimination across these users will be hard to implement, and switching costs are low. It’s conceivable that a company could get ahead of the others and charge a premium for its intelligence, even if only in certain domains, but while there are theoretical arguments for this, it hasn’t happened yet, especially not for any extended period.

The main obstacle to significantly increasing revenues is that open-source competitors can distill models and catch up to the frontier in 6-12 months. Competitors like Cursor also serve frontier models while collecting data on which patches users prefer, and can train their own models on that data, further disadvantaging the frontier companies. I’ve done some rough modelling, beyond the scope of this post, and I think it’s unlikely that companies will be able to monetize their models within that 6-12 month window well enough to make them profitable, especially as training costs keep increasing. It has also been suggested that companies might stop charging per token and instead charge for intelligence. But it’s hard to know what that intelligence is worth, and this is essentially just charging more for better models, which often won’t be worth it: firms will prefer to pay much less for slightly less intelligence.
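To make the modelling claim concrete, here is a minimal toy sketch of the kind of calculation I mean. Every number in it is an illustrative assumption, not an estimate: a model has to earn back its training cost out of gross profit during its frontier window, before distilled open models commoditize it.

```python
# Toy model: can a frontier model pay for itself before open-source
# distillation catches up? All numbers below are illustrative assumptions.

def frontier_payback(train_cost_bn, monthly_rev_bn, gross_margin, window_months):
    """Gross profit earned during the exclusivity window, minus training cost."""
    return monthly_rev_bn * gross_margin * window_months - train_cost_bn

# A hypothetical $3bn training run earning $0.5bn/month at a 60% gross
# margin, with a 9-month lead before open models catch up:
net_bn = frontier_payback(train_cost_bn=3.0, monthly_rev_bn=0.5,
                          gross_margin=0.6, window_months=9)
print(f"Net gross profit minus training cost: {net_bn:.1f}bn")  # -0.3: the window is too short
```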
This leads to the final possibility: that AI companies will begin to keep their models in-house and use them themselves to make a profit. This might take the form of partnerships with other firms, or the labs building companies within the company.
There are many industries with very large revenues that could benefit immensely from LLMs. I’ll briefly talk about quantitative trading, but the pharmaceutical industry and others can conceivably make great use of LLMs and other AI models.
Trading firms make a lot of money. Some firms make as much as $50B in net trading revenue per year, and the industry as a whole earns ~$200B. A lot of employees at AI firms come from trading firms, so there is a very natural fit. Certain trading strategies that already benefit a lot from LLMs, like sentiment analysis on presswires or analysis of earnings reports, could come to be dominated not by traditional trading firms but by the trading firms inside AI companies.
It’s worth considering just how much more valuable this could be to the companies than releasing their models to the public. In trading and in other domains, the total value of an alpha/edge is inversely proportional to the number of firms that have it. This is more radical than it first appears: it is not merely that when the number of entities with a certain edge increases from one to two, each gets some fraction of the original edge. Rather, the total amount the edge is worth goes down, and that smaller total is then split between the entities. For AI companies, this means not only that the intelligence might be worth more, in total, if kept internal, but also that they don’t need to share any of the value with the company that would otherwise be using it through the API.
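A minimal sketch of that arithmetic, under the stylized assumption (mine, for illustration) that the total value of an edge is exactly inversely proportional to the number of firms holding it:

```python
# Stylized model of edge dilution. Assumption (for illustration only):
# the total value of an edge is inversely proportional to the number of
# firms holding it, and that shrunken total is then split among them.

def total_edge_value(base_value_m, n_firms):
    return base_value_m / n_firms  # the whole pie shrinks with each entrant

def per_firm_value(base_value_m, n_firms):
    return total_edge_value(base_value_m, n_firms) / n_firms  # ...then it is split

for n in range(1, 5):
    print(f"{n} firm(s): total ${total_edge_value(100, n):.0f}M, "
          f"per firm ${per_firm_value(100, n):.2f}M")
```

So going from one firm to two doesn’t leave each with half the original value; it leaves each with a quarter.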
There is already some evidence that this is happening. AI companies have internal models that they use to develop the next generations, and they are keeping them internal for longer before release, beyond just safety testing. There are rumours that SSI is trading internally, labs are already working with trading firms, and Anthropic acquired Coefficient Bio, a company that could plausibly help them do AI-led drug discovery.
Altogether, I think it is most likely that companies begin to make revenue from internal deployment; a lot of incentives push them in this direction. This has significant implications, particularly for those concerned about potential risks from AI systems: namely, that a lot of the focus should be on internal deployment.
Credit: Ideas are my own, but two examples came from conversations with Ege Erdil.
If they go public, this level of funding can continue. There is a lot of demand for exposure to AI.
If Anthropic is making $44bn in annualized revenue (in some sense), that’s enough to pay for maybe 3-4 GW of compute (at $12-15bn per GW per year), which they don’t physically have. To be unprofitable, you need to be able to get enough compute to spend the money on, so currently it’s possible to fail in the pursuit of unprofitability. (OpenAI probably didn’t fail.)
Anthropic’s current first-party inference plus R&D compute might be about 1-1.5 GW, meaning they are only able to spend $12-25bn, annualized. They possibly have more capacity not counted in this estimate when serving via API from Vertex/Bedrock/Azure, leaving a greater part of the revenue with the clouds; then less than $44bn remains for their own first-party inference plus R&D compute. SemiAnalysis estimates a gross margin of “over 70%”, which probably translates to annualized costs of only ~$12bn on serving models (if all inference were first-party), implying a total of about 1 GW of inference compute (Anthropic’s own dedicated compute plus the compute from the clouds). If they are using 0.5 GW of their own compute at a 72% gross margin, and 0.5 GW of compute from the clouds at a 30% gross margin (the rest goes to the clouds and becomes a cost for Anthropic), that’s about $22bn of gross profit out of the $44bn of revenue. To break even, they’d need 1 GW of R&D compute at $15bn per GW per year (on top of the 0.5 GW of first-party inference compute), which is a stretch. Though they’ll probably endeavor to restore the state of unprofitability as soon as they can.
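The back-of-envelope above, reproduced as a quick script (every input is this comment’s assumption, not a reported figure):

```python
# Reproducing the back-of-envelope arithmetic above. Every input is an
# assumption from this comment, not a reported figure.

revenue_bn = 44.0                    # annualized run-rate revenue
own_share, cloud_share = 0.5, 0.5    # fraction of revenue served on own vs cloud compute
own_margin, cloud_margin = 0.72, 0.30

gross_profit_bn = revenue_bn * (own_share * own_margin + cloud_share * cloud_margin)
print(f"Gross profit: ${gross_profit_bn:.1f}bn")  # ~$22bn of the $44bn

rnd_gw, cost_per_gw_bn = 1.0, 15.0   # R&D compute and its annualized cost per GW
print(f"R&D compute cost: ${rnd_gw * cost_per_gw_bn:.0f}bn")  # $15bn of the ~$22bn
```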
I think it’s possible, but not likely, that we could have the most valuable companies in the world by market cap raise trillions of dollars without being profitable or having a plan to become profitable very soon. There is only so much capital, and particularly risk capital, and the dynamic changes a lot when there isn’t much room left for growth. In particular, there is about $100T of total revenue per year in the world right now, so even if you captured all of it, with 100% gross margins and a P/E ratio of 20, you are only looking at a market cap of a couple of quadrillion dollars, i.e. a few hundred times a $5T market cap. Larger amounts of capital are much more risk-averse than smaller amounts of capital.
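The ceiling arithmetic as a quick sanity check:

```python
# Upper bound on market cap if a single company captured all revenue on Earth.
world_revenue_tn = 100.0  # ~$100T of total revenue per year
gross_margin = 1.0        # absurdly generous 100% margins
pe_ratio = 20.0

ceiling_tn = world_revenue_tn * gross_margin * pe_ratio
print(f"Ceiling market cap: ${ceiling_tn:,.0f}T")              # $2,000T, ~$2 quadrillion
print(f"Multiple from a $5T market cap: {ceiling_tn / 5:.0f}x")  # 400x
```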
I think people are over-indexing here: every month they see insane growth and assume it will occur forever. I wish these numbers were audited. There is also more to spend money on than raw compute, though I agree compute is the bulk of it. They did just make a deal with SpaceX for a lot of compute, which they could feasibly spend a lot of money on. I don’t see how this negates any of my analysis above.
I suppose I would ask, do you think Anthropic is currently profitable (or was in April)?
The plan is that they become profitable as soon as they stop growing, provided they manage to grow to the correct size and no more. The only reason they are unprofitable is that they are growing: the R&D compute is trying to match next year’s inference compute, rather than this year’s. Much of the future compute buildout can in principle be canceled or delayed on relatively short notice, at a significantly reduced cost, while keeping the work already done at half-completed datacenter sites useful for when construction resumes later than planned. For this to be a real option, the contracts expressing the commitments need to be sufficiently flexible, though in some ways that only shifts the backlash from the unpredictable timing of the end of the LLM supercycle (assuming no AGI by 2028-2030, which is when rapid scaling of compute should run out of the immediately accessible TAM) from the AI companies down the supply chains.
That’s just 300 MW, which is maybe $4-5bn per year, not much of a dent in $44bn. Currently their problem is that they are not able to spend the money, because almost nobody has any extra compute (at a scale at all relevant to them) immediately ready to go. They can only spend more on future compute.
I don’t see the evidence that they think this will occur forever. They think this will occur at least through 2027-2028, perhaps slower than so far and even slower in 2028, but still with significant growth (or perhaps keeping to 3x compute per year, so that 1-2 GW at the end of 2025 becomes around 10 GW by the end of 2027 and more than that in 2028). They are ready to respond to signs that it’s slowing down, and maybe only need 2 years of notice to cancel excessive future buildouts cheaply, and 1 year of notice to delay future buildouts at a manageable cost (in a way that will make them useful when completed later).
I think it’s likely profitable (or was very recently), in the sense of run-rate revenue exceeding run-rate spending on all of the compute that’s currently online (all compute serving inference, plus all R&D compute, including training). This is not according to plan, and it will shortly once again not be so. But also, at any point where they are succeeding at being unprofitable, they can shift some R&D compute to inference and become profitable (making use of the 50-70% gross margin on serving tokens, which agrees with first-principles estimates) within weeks to months, as long as there is enough demand remaining to make use of the inference compute shifted from R&D. And they would still be left with a reasonable amount of R&D compute to train models for the next year, if it turns out that next year they don’t actually need much more compute than they had this year (maybe less than 2x).
This is more the case when most of the compute serving their models is their own compute, so that it only costs them as much as it costs to build (annualized), rather than also whatever portion of their gross margin the clouds are taking when serving their models via Vertex/Bedrock/Azure. Thus some of the speed of growth in the buildouts is probably about shifting the inference compute from the indirect serving via clouds to the more directly contracted dedicated compute that’s cheaper for them (and will remain so).
What’s the source of the SSI rumor?
https://x.com/OHatTartine/status/2003910041649532983
I think this is the first public report of it, but I’ve heard it many times.
I should note, I think the major labs would be nuts not to be spending time getting better at trading. There are a myriad of reasons to want to use LLMs in trading.
Why do companies that own shopping centres lease their units out to individual shops, instead of running shops themselves? Why do airports and railway stations lease out space to coffee shops, newsagents, etc. rather than operating coffee shops and newsagents themselves? Are these things different to selling AI access instead of doing whatever it is the companies buying AI access are doing?
(Perhaps they are! Perhaps a dedicated clothes shop or coffee shop has some advantage that a shopping centre or railway station can’t duplicate, but the companies that are renting AI access to run their businesses have no advantage that frontier AI labs couldn’t duplicate?)
If frontier models are only 12 months ahead of open models, how would the frontier labs get around the “if we can do it now, the whole world can do it in 12 months” problem? Could a frontier lab build up enough of a running start in 12 months that it could never be caught? They couldn’t do this with AI development itself, despite that being their speciality.
Finally: how much would the supply-demand equation for frontier AIs have to change for the labs to expect they could increase their value more by spending whatever resources they have on business ventures other than “developing frontier AI”? Would the equation change enough if AI development runs into diminishing returns or physical limits and stalls out, or if open AIs catch up enough to saturate the market?
I’m not claiming that everything is of this shape; there are gains to specialization. In the shopping centre case, your core competency is probably the construction of real estate, or your big moat is owning the valuable real estate, which lets you charge the very high rents where most of the profits come from.
You would, in fact, get a discount on using your own real estate for other purposes, but others have more and better purposes for it, and you don’t specialize in those businesses. Most things in shopping centres either have relatively low margins, all things considered, or fairly small total markets (<$50M/year or whatever). The ability to make coffee is fairly commoditized. You can either implicitly pay the rent and run the business yourself where you expect lots of revenue, or rent the space out to someone else to do it.
There is a widespread joke about companies saying they are going to “make a bunch of alpha and sell it to hedge funds”. This is incredibly hard to do and basically never makes sense. If you have alpha, you should just use it yourself because it is more valuable that way.
Suppose you found an arbitrage in bitcoin across two exchanges that can make you $1M/year, because there is $100M of volume and the arbitrage is 1%. If you tell me about this strategy, I will be willing to close the spread further and market-make such that the edge is maybe 0.5% now. As more people enter the space, there is still $100M of volume, but the arbitrage between the exchanges will shrink further and further towards $0.
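The same example as a toy simulation; the halving-per-entrant decay is an assumed functional form, chosen only to illustrate the direction:

```python
# Toy version of the bitcoin arbitrage example: $100M/year of volume and a
# 1% spread. Assumption for illustration: each new entrant halves the spread.

volume_m, spread = 100.0, 0.01

for n_firms in range(1, 5):
    total_m = volume_m * spread
    print(f"{n_firms} firm(s): spread {spread:.3%}, total ${total_m:.2f}M, "
          f"per firm ${total_m / n_firms:.3f}M")
    spread /= 2  # entrants compete the spread itself away, not just split it
```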
I’m not saying that all businesses are of this form; there are, of course, things you need other than AIs to run a lot of businesses. I’m saying there are some with such extreme profit-margin potential that the incentive is to keep the capability internal rather than sell it and widely distribute it. Perhaps the labs will still sell models for use by the public, but they’ll be second-best models, or models without all the intelligence needed for LLM-based trading on earnings reports or presswires.
I see what you are getting at. I think it is closer to 6 months. If you try to charge too much, you simply run into the problem that most use cases do not need that much intelligence, so you won’t be able to charge 10x as much for Opus 4.7 over 4.6 or whatever. You get to charge maybe 20% more.
I think Phil Trammell should get involved here. But “developing frontier AI” is not a business that makes money; selling the frontier AI through an API is the current business. I’m saying that there is a new line of business that makes a lot of sense: keeping your AI internal and using it to make money in a few specific industries, particularly where having the best model all to yourself really pays dividends.
Following a recent episode of the All-In podcast, I think of it like this. Until this year, the people actually making money from the AI economy were the ones who make the hardware. In the wake of the AI coding revolution (the enterprise demand for Claude Code and its rivals), the models are now making money as well. The next step would be for the tokens to make money: that is, for the businesses that are paying for AI to write code to actually profit from doing so.
I think, like a lot of what gets said on the All-In Podcast, this doesn’t make much sense.
The models aren’t yet making money. These companies are still cash-flow negative, and I am not expecting that to change for at least another few months.
As for the tokens themselves making money: I think the framing is pretty ridiculous, but even taking it at face value, the companies that are buying tokens are already making money from them. Claude Code usage thus far has been heavily subsidized by VC dollars, and companies wouldn’t keep paying for the tokens if they weren’t helping them make money. Companies aren’t paying for tokens just to get their employees used to coding with them; they are already making money from using AI (albeit a little).
Seems correct, and very important. However, I expect that the first businesses they eat up will be software businesses, as has largely already been occurring.
IDK about quantitative trading, but managing real-sector companies like pharmaceutical labs requires plenty of skills that CEOs and boards of AI labs (and of software companies in general) just don’t have.
However, in February Anthropic hinted they are interested in transpiling legacy COBOL code, causing IBM shares to plunge. There is surely quite a lot of specialized competence and experience needed to disrupt the software sector, but plenty of people with both will be happy to work for OpenAI or Anthropic, and they speak the same IT jargon that lab executives know well (as opposed to needing to explain the difference between Phase 1 and Phase 2 clinical trials, for example). Hence internal software companies seem more likely than anything outside IT.
Quant trading is the one I know best, because I’ve been a quant trader. There are real skills required beyond just having great trading strategies, but they can all be hired for. It’s also hard to overstate how many former traders work at these labs; I think OpenAI and Anthropic probably have more former quants than any single trading firm currently has. They can hire for the other necessary skills, and they already have SWEs. I think the labs could spin up internal trading firms using their frontier AI that are profitable within a year and making $1B or more, after all expenses (compute, salaries, etc.).