Notably, it’s also the date at which my model diverges from this forecast’s. Surprisingly, that’s later than I’d expected.
Concretely,
OpenBrain doubles down on this strategy with Agent-2. It is qualitatively almost as good as the top human experts at research engineering (designing and implementing experiments), and as good as the 25th percentile OpenBrain scientist at “research taste” (deciding what to study next, what experiments to run, or having inklings of potential new paradigms).
I don’t know that the AGI labs in early 2027 won’t be on a trajectory to automate AI R&D. But I predict that a system trained the way Agent-2 is described to be trained here won’t be capable of the things listed.
I guess I’m also inclined to disagree with parts of the world-state predicted by early 2026, though it’s murkier on that front. Agent-1’s set of capabilities seems very plausible, but I’m skeptical of the economic and practical implications (AGI labs’ revenue tripling and 50% faster algorithmic progress). As in,
People naturally try to compare Agent-1 to humans, but it has a very different skill profile. It knows more facts than any human, knows practically every programming language, and can solve well-specified coding problems extremely quickly. On the other hand, Agent-1 is bad at even simple long-horizon tasks, like beating video games it hasn’t played before.
Does that not constitute just a marginal improvement on the current AI models? What’s the predicted phase shift that causes the massive economic implications and impact on research?
I assume it’s the jump from “unreliable agents” to “reliable agents” somewhere between 2025 and 2026. It seems kind of glossed over; I think that may be an earlier point at which I would disagree. Did I miss a more detailed discussion of it somewhere in the supplements?
I’m skeptical of the economic and practical implications (AGI labs’ revenue tripling and 50% faster algorithmic progress)
Notably, the trend in the last few years is that AI companies triple their revenue each year. So, the revenue tripling seems very plausible to me.
As for the 50% algorithmic progress speedup: this happens using Agent-1 (probably with somewhat better post-training than the original version) in around April 2026 (one year from now). I think the idea is that by this point you have maybe an 8-16 hour horizon length on relatively well-contained benchmark tasks, which allows a bunch of the coding work to be automated, including misc experiment running. (Presumably the horizon length is somewhat shorter on much messier tasks, but maybe by only like 2-4x or less.)
Note that this only speeds up overall AI progress by around 25% because AI R&D maybe only drives a bit more than half of progress (with the rest driven by scaling up compute).
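For concreteness, here’s the arithmetic behind the ~25% figure as I understand it (my framing and parameter names, not from the post): treat overall progress as a weighted mix of AI R&D and compute scaling, where only the R&D share gets multiplied.

```python
def overall_speedup(rd_speedup: float, rd_share: float) -> float:
    # The R&D fraction of progress gets multiplied by the AI-assisted
    # speedup; the compute-scaling fraction is unchanged (multiplier 1.0).
    return rd_share * rd_speedup + (1 - rd_share) * 1.0

# 50% faster R&D, with R&D driving ~half of overall progress:
print(overall_speedup(1.5, 0.5))  # 1.25, i.e. ~25% faster overall
```

If R&D drove a bit more than half of progress, as the comment suggests, the overall figure would land a bit above 25%.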
Personally, I think 50% seems somewhat high given the level of capability and the amount of integration time, but not totally crazy. (I think I’d guess more like 25%? I generally think the speedups they quote are somewhat too bullish.) I think I disagree more with the estimated current speedup of 13% (see April 2025); I’d guess more like 5% right now. If I bought that you get 13% now, I think that would update me most of the way to 50% on the later milestone.
Notably, the trend in the last few years is that AI companies triple their revenue each year
Hm, I admittedly only skimmed the Compute Forecast article, but I don’t think there’s much evidence for a trend like this? The “triples every year” statement seems to be extrapolated from two data points about OpenAI specifically (“We use OpenAI’s 2023 revenue of $1B and 2024 revenue around $4B to piece together a short term trend that we expect to slow down gradually”, plus maybe this). I guess you can draw a straight line through two points, and the idea of this trend following a straight line isn’t necessarily implausible a priori… But is there more data?
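To make the extrapolation concrete, here’s the arithmetic on just the two quoted data points (illustrative only; everything past 2024 is pure extrapolation, not data, and note the fitted year-over-year factor from those two points is actually 4x, with 3x being the assumed slowed-down trend):

```python
rev = {2023: 1.0, 2024: 4.0}  # OpenAI revenue from the quote, in $B
fitted_growth = rev[2024] / rev[2023]  # 4.0: the factor the two points imply

# Extrapolating forward at the assumed "triples every year" rate:
for year in range(2025, 2028):
    rev[year] = rev[year - 1] * 3

print(fitted_growth, rev)  # 2025: $12B, 2026: $36B, 2027: $108B
```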
50% algorithmic progress
Yeah, I concur with all of that: some doubts about 50% in April 2026, some doubts about 13% today, but seems overall not implausible.
I think the best source for revenue growth is this post from Epoch. I think we only have the last 2 years really (so “last few years” is maybe overstating it), but we do have revenue projections, and we have more than one data point per year.
Also see FutureSearch’s report on a plausible breakdown for how OpenBrain hits $100B ARR by mid-2027.
I think if you condition on the capability progression in the scenario and look at existing subscription services generating in the $100B range, it feels very plausible intuitively, independently from the ‘tripling’ extrapolation.
Excellent work!