Jack Clark’s post on RSI essentially considers two claims: (1) routine AI R&D gets automated, and (2) AIs become capable of coming up with substantial new ideas for AI R&D. He cites evidence about the former, which seems plausibly within reach in a few years, perhaps as soon as 2027-2028. Seeing no concrete signs that would give the timing for the latter, he still settles on a guess of 60% by the end of 2028 for when both of these things arrive. I think even both of these things together plausibly don’t trigger proper RSI, just as human AI researchers haven’t triggered it yet, and so the timing for automation of AI R&D in both of these senses doesn’t much help with figuring out the timing for proper RSI.
Automated routine AI R&D (whose arrival is more predictable given the capabilities of current systems and mundane extrapolation of the trends) is not proper RSI on its own; it’s more like a build system that automatically produces an AI ready for deployment using the current standard practices of the AI project (including finding, creating, and preparing the training data). It makes the process faster and more convenient, but it doesn’t keep substantially improving the quality of the result as you run it again and again, and it doesn’t obviate the need for all the training time and compute that go into producing a frontier model, or the R&D time and compute for experiments to resolve any issues.
Like any build system, it keeps irrecoverably breaking whenever you target a sufficiently unusual configuration, so human engineers need to fix things to let the automation proceed further, until the build succeeds (and passes the tests, which in this analogy include routine capability and alignment evals). The automated build process still does almost all of the work, which is why all serious software projects use one. The process of developing AIs is behind standard software practice in this respect, and automation of routine AI R&D merely fixes this regression. (Newly built AIs also understand the new things that happened or were invented since the previous build, making the lineage of self-building AIs more cognitively self-sufficient in its ability to adapt to a changing world, an ability that is otherwise barely present in modern LLMs with frozen weights.)
The ability of AIs to come up with new ideas for improving AIs mostly matters once it meaningfully outstrips human ability (using the same R&D compute), and the timing for when that happens remains uncertain. The ability to come up with important new ideas at all, or to a similar extent as an AI company’s researchers, doesn’t automatically reduce the time to superintelligence.
The anchors for time to runaway RSI and superintelligence that I find informative are (a) the total compute available to the largest AI company (which depends on mundane AI economics and on how quickly the datacenter supply chains can scale), and (b) the extent of the practical constraints on model sizes that keep frontier models from reaching the quality otherwise enabled by the available FLOPs (which depends on the AI accelerator systems, that is, on what kind of compute is being built). Automatically coming up with AI R&D ideas doesn’t directly affect these things (though it suggests unlocking more TAM and funding larger compute buildouts), so its impact on timing lies in how much better than human researchers it works, not merely in being possible or comparable.
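As a rough illustration of anchor (b), here’s a minimal back-of-envelope sketch, assuming the Chinchilla-style heuristic that training compute is C ≈ 6·N·D with compute-optimal data D ≈ 20·N; the training budget figures are illustrative assumptions of mine, not numbers from the post.

```python
# Back-of-envelope: compute-optimal model size for a given training budget,
# under the Chinchilla-style heuristic C ~ 6*N*D with D ~ 20*N, so that
# C ~ 120*N^2 and N_opt ~ sqrt(C / 120). Budget figures are illustrative.

import math

def optimal_params(training_flops: float) -> float:
    """Compute-optimal parameter count under C = 6*N*D, D = 20*N."""
    return math.sqrt(training_flops / 120)

for label, flops in [
    ("illustrative 2024-class run", 2e25),
    ("illustrative 2026-class run", 5e26),
    ("illustrative 2028-class run", 1e28),
]:
    n = optimal_params(flops)
    print(f"{label}: {flops:.0e} FLOPs -> "
          f"~{n / 1e12:.1f}T params, ~{20 * n / 1e12:.0f}T tokens")
```

The point of the sketch is that the available FLOPs would support far larger compute-optimal models than are currently practical to serve or to feed with data, which is exactly the gap anchor (b) tracks.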
The transition to being much better than human researchers is plausibly just one or two ideas away from where we are right now, but there is no straightforward thing to say about the timing for when these ideas get invented, or for when the transition happens on its own as a result of the remaining rapid compute scaling (with no need for new ideas). More compute for AI R&D makes it easier to find ideas that actually work, and if AIs can start contributing, then scaling those AIs will result in better AI contributions. But the raw compute scaling speed won’t remain at 2022-2026 levels for much longer, and the remaining constraints on model sizes in relation to available compute will probably be mostly lifted by 2028-2030. AIs that can barely come up with new research ideas at that time won’t get much better at it afterwards on a predictable schedule.
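To see why the 2022-2026 raw scaling speed can’t persist for long, here’s a hedged compounding sketch; the ~1 GW starting point and ~4x/year compute growth are illustrative assumptions in the vicinity of commonly cited estimates, with power taken as growing roughly in step with compute at a fixed hardware generation.

```python
# Illustrative compounding: frontier training system power if compute keeps
# growing ~4x/year and power tracks compute (ignoring per-FLOP efficiency
# gains, which slow but don't stop this growth). All inputs are assumptions.

start_year, start_power_gw, growth_per_year = 2026, 1.0, 4.0

for year in range(start_year, start_year + 5):
    power_gw = start_power_gw * growth_per_year ** (year - start_year)
    print(f"{year}: ~{power_gw:,.0f} GW frontier training system")
```

At this pace a single training system wants hundreds of GW by 2030, which is why the raw scaling speed has to slow well before then.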
This would be great news, but unfortunately there’s still Musk’s tweet from 8 Apr 2026 saying they’re training a 6T param model and a 10T param model (which are models in the Opus to Mythos weight class, unless they have too few active params). This is the kind of thing that isn’t true of the performative efforts at other companies with in-principle sufficient compute, which experiment with some kind of AI training but don’t bother training big models.
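Since the weight-class caveat turns on active rather than total parameters, here’s a hedged sketch of the arithmetic, using the standard C ≈ 6·N_active·D approximation for transformer training compute; the token count and active fractions are illustrative assumptions of mine, not anything from the tweet.

```python
# Training compute scales with *active* params per token, C ~ 6*N_active*D,
# so a large total param count alone doesn't place a model in a weight
# class. Token count and active fractions below are illustrative.

total_params = 6e12   # the tweeted 6T-param model
tokens = 20e12        # assumed 20T training tokens

for active_fraction in (0.5, 0.1, 0.02):
    active = total_params * active_fraction
    flops = 6 * active * tokens
    print(f"active {active_fraction:>4.0%} (~{active / 1e12:.2f}T params): "
          f"~{flops:.1e} training FLOPs")
```

At a few percent active params, a nominally 6T-param model trains on over an order of magnitude less compute than the same model run dense, which is what would keep it out of the Opus to Mythos weight class.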
The 300 MW of H100/H200 compute at Colossus 1 is mostly useful for smaller models (Sonnet class and below), and Colossus 2 is sufficient for SpaceX given the low demand for the Grok models. The cost to serve even the smaller models is lower with GB200/GB300 NVL72 systems, so the gross margin improves if you can find enough of such compute. Thus Anthropic is happy to take Colossus 1, since they can’t find enough compute, while SpaceX prefers Colossus 2 as the more profitable option. It might even mean that the plans for more NVL72 compute at Colossus 2 or elsewhere are going well, so they can afford to plan on not needing Colossus 1 to serve the smaller models if the new bigger models win more demand.
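For scale, here’s a hedged conversion from the quoted 300 MW to GPU count and aggregate compute, assuming roughly 1.4 kW all-in per H100 (GPU plus host, networking, and cooling overhead) and about 1e15 dense BF16 FLOP/s per GPU; both per-GPU figures are illustrative assumptions.

```python
# Rough conversion from facility power to GPU count and aggregate compute.
# Per-GPU all-in power and FLOP/s are illustrative assumptions.

site_power_w = 300e6    # 300 MW at Colossus 1 (from the post)
watts_per_gpu = 1400    # assumed H100 all-in: GPU + host, network, cooling
flops_per_gpu = 1e15    # assumed ~1e15 dense BF16 FLOP/s per H100

gpus = site_power_w / watts_per_gpu
print(f"~{gpus / 1e3:.0f}k GPUs, ~{gpus * flops_per_gpu:.1e} dense BF16 FLOP/s")
# -> roughly 214k GPUs and ~2.1e+20 FLOP/s under these assumptions
```

A site that large built from older accelerators is a lot of aggregate compute but relatively expensive per token served, consistent with it being most useful for the smaller models.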