Forecast: Recursively Self-improving AI for 2033

Context:

  • One way to measure how capable LLMs are, which is gaining traction and validity, is the following:

    • Let T(t) be the time it takes a human to complete the task t.

    • Empirically, for a given LLM there is a threshold K such that it can fully accomplish (without extra human correction) almost all tasks t with T(t) < K, and cannot accomplish, without human help, tasks f with T(f) > K.

    • Thus we can measure how good an LLM is by looking at how long it would take a human to accomplish the hardest tasks the LLM can do.

    • Let’s call that skill level K (sketched more formally below).
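
To make this a bit more precise, here is one possible formalization (the notation is mine, not from the cited papers): K(M) is the longest human-time horizon h at which a model M still succeeds on essentially all tasks,

$$K(M) \;=\; \sup\,\{\, h \;:\; M \text{ fully accomplishes, without human help, almost all tasks } t \text{ with } T(t) \le h \,\}.$$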

Data: [1]

  • Moore’s Law for LLMs: every 6 months, LLMs double their K value.

  • K(Claude Opus 4.5) ~ 1-2 hours (i.e. Claude can currently do a one-human-hour task in one shot, without any human correction).

[Figure from Measuring AI Ability to Complete Long Tasks]

Reasoning:

  1. A minimal non-trivial unit of improvement to an AI system corresponds to roughly one NeurIPS (or other such conference) paper.

  2. Writing such a paper typically requires on the order of 1 year of human work.

  3. Using the 6-month doubling time of the LLM Moore’s Law, this means that in about 7 years an LLM will be able to independently write a NeurIPS paper (see the back-of-the-envelope check after this list).

  4. Hence in 7 years, i.e. by 2033, it will be possible to create a recursively self-improving AI.
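
As a back-of-the-envelope check of steps 2-3, here is a minimal sketch. The assumptions are mine, for illustration: K is 1-2 hours at the start of 2026, it doubles every 6 months, and a NeurIPS paper takes between one work-year (~2,000 hours) and one calendar year (~8,760 hours) of human time.

```python
import math

# Back-of-the-envelope check of the "~7 years" claim.
# Assumptions (for illustration): K is 1-2 hours at the start of 2026,
# doubles every 6 months, and a NeurIPS paper is ~1 human-year of work.
DOUBLING_TIME_YEARS = 0.5

for k0_hours in (1, 2):                      # current K, in hours
    for paper_hours in (2000, 8760):         # work-year vs calendar year
        doublings = math.log2(paper_hours / k0_hours)
        years = doublings * DOUBLING_TIME_YEARS
        print(f"K0 = {k0_hours}h, paper = {paper_hours}h -> "
              f"{doublings:.1f} doublings, ~{years:.1f} years, "
              f"i.e. ~{2026 + years:.0f}")
```

Depending on the assumptions this lands between roughly 2031 and 2033, consistent with the ~7-year estimate above.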

Initial Improvement Rate:

The relevant questions to ask to estimate the initial takeoff rate are the following:

What is the size of the improvement? In the reasoning above, we already set this to the equivalent of one NeurIPS paper.

How much time will it take the AI to produce this improvement? Assuming that the bottleneck is running the experiments, and given that the experiments behind one NeurIPS paper currently take about one week[2], we can expect the initial self-improvement rate to be roughly on the order of one NeurIPS paper per week. This is not the hardest of takeoffs, but not the softest either, as we will see in the next section. It also gives us a clue as to the key lever for containing ASI: limiting access to compute.

Hard or Soft Takeoff?

Epistemic status: while I feel very confident of the above, what follows seems much less certain to me.

~~Assume that, after reaching self-improvement, compute is not a limit, either because the AI has access to enough compute or because it finds ways to use the compute it has more efficiently, and assume continued exponential increase: the first week it self-improves to the tune of one NeurIPS paper, the second week two NeurIPS papers, the third week four NeurIPS papers, and so on. Then on the 12th or 13th week it will improve to the tune of a whole NeurIPS conference (i.e. roughly 5,000 papers).~~

~~This reasoning thus predicts the takeoff speed to be on the order of months rather than decades or minutes.~~

Erratum (Feb. 1st): I made a mistake in the struck-through reasoning above. Assuming exponential growth in capacity due to self-improvement, we have that the capacity is $C(t) = A \cdot 2^{(t - t_0)/T}$, where $A = C(t_0)$ and $T$ is the capacity doubling time. From the above, we suppose $t_0$ is the year 2033. To determine the takeoff rate, we are interested in finding $T$. We already argued for the initial improvement rate to be 1 NeurIPS paper/week, i.e. $C'(t_0) = \frac{\ln 2}{T} A \approx 1$ paper/week.

To isolate $T$ from $A$, we can look at the current rate of improvement of AI due to human research, $C'(t_{\mathrm{now}})$, where $t_{\mathrm{now}}$ is now, i.e. the start of 2026. One of our datapoints is that the current doubling time is ~6 months. Furthermore, counting conferences like ICLR, ICML, AAAI, and NeurIPS, we can estimate ~20k papers/year, or 10k papers per 6 months. Since each 6-month batch of papers currently doubles capacity, this gives us $C(t_{\mathrm{now}})$ ~ 10k papers. Now we can suppose that, before recursive self-improvement, capacity keeps doubling every 6 months, i.e. 14 more times over the 7 years to 2033. Which means $A = C(t_0) \approx 10^4 \cdot 2^{14} \approx 1.6 \times 10^8$ papers. Putting this all together we get:

$$\frac{\ln 2}{T} \cdot 10^4 \cdot 2^{14} \ \text{papers} \approx 1 \ \text{paper/week},$$

giving

$$T \approx \ln 2 \cdot 10^4 \cdot 2^{14} \ \text{weeks} \approx 1.1 \times 10^8 \ \text{weeks} \approx 2 \ \text{million years}.$$

Which would be a super slow takeoff, with a doubling time of millions of years.
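
As a numeric sanity check of this arithmetic (same assumptions as above: $C(t_{\mathrm{now}}) \approx 10^4$ papers, 14 further doublings until 2033, and an initial rate of 1 paper/week):

```python
import math

# Check of the erratum's doubling-time estimate.
# Assumptions from the text: C(now) ~ 1e4 papers at the start of 2026,
# 14 more human-driven doublings until t0 = 2033,
# and an initial self-improvement rate of 1 paper/week at t0.
C_now = 1e4                       # papers
A = C_now * 2**14                 # capacity at takeoff, C(t0)
rate = 1.0                        # papers per week at t0

# C(t) = A * 2^((t - t0)/T)  =>  C'(t0) = (ln 2 / T) * A = rate
T_weeks = math.log(2) * A / rate
print(f"A ~ {A:.2e} papers")                          # ~1.64e+08
print(f"T ~ {T_weeks:.2e} weeks ~ "
      f"{T_weeks / 52 / 1e6:.1f} million years")      # ~2.2 million years
```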

This of course supposes that almost all advancement comes from increasing knowledge, and not from increasing compute power, which is debatable.

One could also argue with keeping the factor of $2^{14}$ in the expression for $T$. Presumably in 2033 there will not be $2^{14} \cdot 20\text{k}$ papers published per year; rather, each paper’s impact will be multiplicative rather than additive. In which case we get a capacity doubling time of the self-improving AI of

$$T \approx \ln 2 \cdot 10^4 \ \text{weeks} \approx 7{,}000 \ \text{weeks} \approx 130 \ \text{years},$$

which sounds more realistic than a million years, but is still a slow takeoff.
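
And the same check for the multiplicative-impact variant, dropping the $2^{14}$ factor:

```python
import math

# Variant: papers compound multiplicatively, so drop the 2^14 factor.
T_weeks = math.log(2) * 1e4       # ln(2) * C(now), in weeks
print(f"T ~ {T_weeks:.0f} weeks ~ {T_weeks / 52:.0f} years")  # ~6931 weeks, ~133 years
```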

  1. ^

    Source: a paper entitled something along the lines of “BRIDGE: Bridging Reasoning In Difficulty Gap between Entities”, which will soon be published at ICLR 2026 but does not yet seem to be publicly available. It broadly agrees with the earlier paper Measuring AI Ability to Complete Long Tasks, which calculates a 7-month doubling time instead of 6.

  2. ^