Disclaimer: I’m the founder of an AI startup in a recent YC batch (S24). I have no access to internal metrics about YC’s portfolio, but I keep in touch with other founders from my batch, and we trade insights and metrics.
Anecdotally, my sense is that YC has done a particularly poor job of evaluating AI startups. Mostly this comes from not taking AGI seriously, combined with YC’s general philosophy of picking impressive founders rather than evaluating ideas directly. YC has essentially bet as if AGI were just another platform like smartphones and as if model quality were not going to improve, which for at least the last two years has been very wrong.
There are basically two ways to add value as an “AI wrapper” company. You can either:
1. Build scaffolding around AI models so that they can accomplish some long-horizon task they can’t currently complete end-to-end.
2. Build an alternative interface that is more useful for a given task than the general chat interface.
Companies in the #1 bucket are the quintessential YC startup of the last few years: something like “an AI that can do your accounting.” The way most of them tackle the problem is by breaking it down into pieces and doing context engineering. Investors love these startups because the potential TAM is huge (“all of accounting!”), they sound like they could build deep technical moats, and the utility of the imagined solution is obvious.
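To make the #1 approach concrete, here’s a minimal sketch of what “break it into pieces and do context engineering” tends to look like in practice. Everything here (the `call_llm` stub, the subtask list, the keyword retriever) is a hypothetical illustration under my own assumptions, not any particular company’s pipeline.

```python
def call_llm(prompt: str) -> str:
    # Stand-in for a call to whichever hosted model API you use.
    return f"[model output for a {len(prompt)}-char prompt]"

def retrieve_context(query: str, documents: list[str], k: int = 3) -> list[str]:
    # Naive keyword-overlap retrieval as a stand-in for embedding search:
    # pick the k documents sharing the most words with the query.
    def overlap(doc: str) -> int:
        return len(set(query.lower().split()) & set(doc.lower().split()))
    return sorted(documents, key=overlap, reverse=True)[:k]

def run_accounting_task(ledger_docs: list[str]) -> dict[str, str]:
    # The long-horizon job is decomposed into short steps the model can
    # reliably do one at a time, each with hand-picked context.
    subtasks = [
        "Categorize each transaction in the ledger excerpts.",
        "Flag transactions that look like duplicates or errors.",
        "Draft a monthly summary of income and expenses.",
    ]
    results = {}
    for task in subtasks:
        context = "\n".join(retrieve_context(task, ledger_docs))
        prompt = f"Context:\n{context}\n\nTask: {task}"
        results[task] = call_llm(prompt)
    return results
```

The scaffolding is the decomposition and the retrieval; the bet is that the model can handle each small step even though it can’t yet handle the whole job unassisted.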
But if you look at the fastest-growing companies of the AI era—Cursor, Lovable, Bolt, Windsurf—the majority of their utility comes from providing an interface for something the LLMs can do on their own. In some cases (Cursor, for instance) these companies started by offering custom task-specific flows and later deprecated them in favor of simple AI agents.
This is because it’s hard to get performance much better than what you get natively from the API. In some cases the scaffolding matters, but more typically the models are either good enough to run the whole observe-orient-decide-act loop on their own, or not good enough to reach product-market fit even with scaffolding. And because the models are improving all the time, even if your wrapper is really effective now, the models often improve quickly enough that all of your hard work becomes moot before it pays off.
As we get closer to AGI, I expect these dynamics to start reversing the growth of even some companies that have done well so far. For example, YC has funded something like five PR-review companies, including Greptile. In 2023 it was easy to improve on base model performance as a PR-review company: ingest the diff, search the codebase for code related to what’s being changed, and pull those bits into context. But now it’s 2025, and because PR review is a short-horizon task, it’s been one of the first things to fall; you can get better performance by running Claude Code from a GitHub Action than by using any of these products. I don’t actually know what these companies are going to do going forward other than hope their customers don’t notice.
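For concreteness, here’s roughly what that 2023-era PR-review recipe looks like: parse the diff, find related code in the repo, and stuff it into the review prompt. The function names and the naive grep-style retrieval are my own illustrative assumptions, not how Greptile or anyone else actually implements it.

```python
import re
from pathlib import Path

def changed_symbols(diff_text: str) -> set[str]:
    # Collect identifiers that appear on added/removed lines of a unified diff.
    symbols = set()
    for line in diff_text.splitlines():
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---")):
            symbols.update(re.findall(r"[A-Za-z_][A-Za-z0-9_]{3,}", line))
    return symbols

def related_snippets(repo_root: str, symbols: set[str], limit: int = 5) -> list[str]:
    # Pull small excerpts from files elsewhere in the repo that mention the
    # same identifiers, so the reviewer model sees callers and definitions.
    snippets = []
    for path in Path(repo_root).rglob("*.py"):
        text = path.read_text(errors="ignore")
        if any(sym in text for sym in symbols):
            snippets.append(f"# {path}\n{text[:800]}")
        if len(snippets) >= limit:
            break
    return snippets

def build_review_prompt(diff_text: str, repo_root: str) -> str:
    context = "\n\n".join(related_snippets(repo_root, changed_symbols(diff_text)))
    return (
        "You are reviewing a pull request.\n\n"
        f"Related code from the repository:\n{context}\n\n"
        f"Diff under review:\n{diff_text}\n\n"
        "Point out bugs, risky changes, and style issues."
    )
```

The point is that this entire recipe is now roughly what an agentic model does on its own when you point it at the repo, which is why the wrapper’s edge has evaporated.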