I’m not sure your model makes sense (e.g., is Phase 3 really a distinct phase?). If it does, I’d guess that humans also have a Phase 3, and/or that current AI agents might already not have one. It’s just that for humans the transition from Phase 2 to 3 happens farther out than it does for current AIs. In the future there will be AIs whose transition is farther out still, possibly infinitely far.
> I think your prediction of superexponential growth isn’t really about “superexponential” growth, but instead about there being an outright discontinuity where the time horizons go from a finite value to infinity.
I think you are talking about a different notion of time horizon than I am. You are talking about the transition between Phase 2 and Phase 3 for a given human or AI, whereas I’m talking about the crossover point at which humans start to be better than AIs, which currently exists but won’t always exist.
EDIT: I do think you make good points though; thanks for the comment.
Also, fwiw, it’s not true that my model assumes that any given AI’s ability to solve tasks continues to grow at the same rate when you give it more inference-time compute; the widget Claude built has settings for whether the underlying relationship is linear, asymptotic, or power law, and qualitatively the results are similar in all three cases. (Basically, if we assume that AIs get diminishing returns to more time budget, that’s fine, as long as we also assume humans get diminishing returns to more time budget, which seems reasonable. In that case, AIs’ returns will diminish faster at first, but as they improve, the slope will increase until it matches and then surpasses the human slope.)
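To make that concrete, here is a minimal sketch of the qualitative claim. This is not the actual widget; the saturating (asymptotic) functional form, the time constants, and the “ceiling” parameter are all assumptions chosen purely for illustration. The point is just that even when both humans and AIs get diminishing returns to time budget, the crossover point (the budget at which the human overtakes the AI) can move farther out as the AI improves and eventually disappear.

```python
import numpy as np

# Illustrative sketch only -- NOT the real widget. All parameters are made up.
# Both curves have diminishing returns; the AI's returns diminish faster (small tau)
# but its ceiling rises across "generations", pushing the crossover out to infinity.

def ability(t, ceiling, tau):
    """Task-solving ability with diminishing (asymptotic) returns to time budget t."""
    return ceiling * (1.0 - np.exp(-t / tau))

budgets = np.linspace(0.01, 500.0, 50_000)              # time budgets (arbitrary units)
human = ability(budgets, ceiling=1.0, tau=20.0)          # slower early returns, higher ceiling

for ai_ceiling in [0.5, 0.9, 1.2]:                       # stand-in for successive AI generations
    ai = ability(budgets, ceiling=ai_ceiling, tau=5.0)   # faster early returns
    human_ahead = budgets[human > ai]                    # budgets at which the human wins
    horizon = human_ahead[0] if human_ahead.size else np.inf
    msg = "never (no crossover)" if np.isinf(horizon) else f"at a budget of ~{horizon:.0f}"
    print(f"AI ceiling {ai_ceiling:.1f}: human overtakes {msg}")
```

Under these toy parameters, the crossover moves farther out with each AI “generation” and vanishes entirely once the AI’s ceiling exceeds the human’s, even though the AI’s returns to time budget diminish faster throughout.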
Imagine we’re trying to launch a rocket to put a satellite into a stable orbit around Earth. At launch, the fuel explodes. However, due to the rocket’s geometry and good fortune, it happens to act as a shaped charge, and it launches the payload, mostly unharmed, off the planet and onto exactly the orbit we wanted, Project Orion-style. Technically, our “rocket” tool has succeeded at its task.[1] But it did so through a mechanism that’s completely unrelated to its intended tool-like functionality. It did not succeed due to its “rocket” capabilities; that abstraction broke down. The fuel-storage tank did not need to be shaped as a rocket at all.
By analogy: I think there’s a wide spread of relatively-high-probability trajectories on which a given reasoning model should be modeled as “legitimately reasoning” using the capabilities present in it. However, there’s a much bigger set of trajectories on which that abstraction breaks down, the obvious example being a CoT consisting of a sequence of random symbols. And it is possible that there’s a CoT like this such that, after e.g. o1-preview emits it and the “[/thinking]” token, it would next start to output a formally correct proof of the Riemann Hypothesis. However, that success won’t be caused by the “reasoning” mechanism, much like an exploding rocket successfully putting the satellite into orbit won’t be caused by the rocket’s rocket-like functionality. The DL model did not need to be shaped as a reasoning LLM at all; it could’ve been a random initialization.
If we’re using LLMs to do formal math, we could set up an “outcome pump” that harnesses this random-generation mechanism to produce valid proofs, by resampling trajectories. But the tool’s efficiency would be abysmal on all problems the LLM can’t solve using its “legitimate” capabilities.
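A minimal sketch of that “outcome pump” loop, assuming hypothetical `sample_trajectory` and `proof_checker` callables (nothing here is a real API): it is just rejection sampling against a verifier, so its expected cost scales as roughly 1/p, where p is the per-sample probability of emitting a valid proof.

```python
import random

# Hedged sketch of the "outcome pump": resample trajectories and keep only those
# whose final output passes a formal proof check. The callables are hypothetical
# stand-ins; only the structure of the loop matters here.

def outcome_pump(sample_trajectory, proof_checker, statement, max_attempts=100_000):
    """Resample model trajectories until one verifies as a proof of `statement`."""
    for attempt in range(1, max_attempts + 1):
        candidate = sample_trajectory(statement)    # one sampled CoT + final answer
        if proof_checker(statement, candidate):     # e.g. a formal kernel check
            return candidate, attempt
    return None, max_attempts                       # gave up: success probability too small

if __name__ == "__main__":
    # Toy stand-in "model" that succeeds with probability p per sample. When the LLM
    # can solve the problem via its legitimate capabilities, p is large and few samples
    # suffice; when it can only hit the answer by luck, p collapses and the pump is useless.
    for p in [0.3, 1e-3, 1e-7]:
        sampler = lambda stmt, p=p: "valid proof" if random.random() < p else "garbage"
        checker = lambda stmt, cand: cand == "valid proof"
        proof, attempts = outcome_pump(sampler, checker, "some theorem", max_attempts=50_000)
        outcome = "found a proof" if proof else "gave up"
        print(f"p={p:g}: {outcome} after {attempts:,} attempts")
```

Since the expected number of attempts is about 1/p, the pump is cheap exactly on the problems the LLM could have solved “legitimately” anyway, and astronomically expensive everywhere else.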
You may argue the transition between Phase 2 and Phase 3 is continuous/gradual, with the model (or a human) trying ever-more-wild guesses until it’s just trying things completely randomly. That’s plausible. I would argue the phases are still distinct, however. (As in, the “transitional grey area” where we can’t firmly say whether this is reasoning or guessing is much smaller than the areas where the system is clearly reasoning or clearly being random.)
> the widget Claude built has settings for whether the underlying relationship is linear, asymptotic, or power law and qualitatively the results are similar in all three cases
Noted!
But I still think this doesn’t quite get at what I think you’re pointing at when you talk about superexponential time-horizon growth/infinite time horizons. Probably not worth getting into at this time, though.
[1] Disclaimer: I am not sure this is actually physically possible without further abstraction-breaking in the form of thermodynamic miracles.