For what it's worth, my view is that we're very likely to be wrong about the specific details in both of the endings; they are obviously super conjunctive. I don't think there's any way around this, because we can be confident AGI is going to cause some ex-ante surprising things to happen.
Also, this scenario is around my 20th-percentile timeline; my median is early 2030s (though other authors disagree with me). I also feel much more confident about the pre-2027 part of the scenario than about the post-2027 part.
Is your disagreement that you think AGI will happen later, or that you think the effects of AGI on the world will look very different, or both? If it's just the timelines, we might have fairly similar views.
My main disagreement is the speed, but not because I expect everything to happen more slowly by some constant factor. Instead I think there’s a missing mood here regarding the obstacles to building AGI, and the time to overcome those obstacles is not clear (which is why my timeline uncertainty is still ~in the exponent).
In particular, I think the first serious departure from my model of LLMs (linked above) is the neuralese section. It seems to me that for this to really work (in a way comparable to how human brains have recurrence), another breakthrough would be required, at least on the level of transformers if not harder. So if the paper from Hao et al. is actually followed up on by future research that successfully scales, that would be a crux for me. Your explanation that the frontier labs haven't adopted this for GPU-utilization reasons seems highly implausible to me. These are creative people who want to reach AGI, and it seems obvious that the kinds of tasks that aren't conquered yet look a lot like the ones that need recurrence. Do you really think none of them have invested significantly in this (starting years ago, when it became obvious this was a bottleneck)? The fact that we still need CoT at all tells me neuralese is not happening because we don't know how to do it. Please refer to my post for more details on this intuition and its implications. In particular, I am not convinced this is the final bottleneck.
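To make concrete what is being debated here, below is a toy sketch (my own illustration, not Hao et al.'s architecture, with a GRU cell standing in for a transformer step) of the difference between chain-of-thought, where every reasoning step is squeezed through a discrete token, and a neuralese-style loop, where the continuous hidden state is carried forward directly:

```python
# Toy contrast between CoT and "neuralese" recurrence. Illustrative only:
# the GRU cell is a stand-in for a real transformer forward pass.
import torch
import torch.nn as nn

vocab, d = 100, 32
embed = nn.Embedding(vocab, d)
core = nn.GRUCell(d, d)        # stand-in for one model step
readout = nn.Linear(d, vocab)

start = torch.zeros(1, dtype=torch.long)   # a single start token, batch of 1
h0 = torch.zeros(1, d)

# (a) Chain-of-thought: each step is squeezed through a discrete token,
# so the only state carried between steps is text.
h, tok = h0.clone(), start
for _ in range(5):
    h = core(embed(tok), h)
    tok = readout(h).argmax(dim=-1)        # information bottleneck: one token

# (b) "Neuralese": the continuous hidden state is fed straight back in,
# with no discretization step in between.
h, inp = h0.clone(), embed(start)
for _ in range(5):
    h = core(inp, h)
    inp = h                                # carry the raw vector forward
```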
I also depart from certain other details later; for instance, I think we'll have better theory by the time we need to align human-level AI, and "muddling through" by blind experimentation probably won't work or be the actual path taken by surviving worlds.
My other points of disagreement seem less cruxy and are mostly downstream.
Re the recurrence/memory aspect, you might like this new paper, which actually figured out how to use recurrent architectures to make a one-minute Tom and Jerry cartoon video that was reasonably consistent; in the tweet below, the authors argue that they managed to fix the training problems that come with training vanilla RNNs (a rough sketch of the mechanism follows the links):
https://test-time-training.github.io/video-dit/assets/ttt_cvpr_2025.pdf
https://arxiv.org/abs/2407.04620
https://x.com/karansdalal/status/1810377853105828092 (the tweet I pointed to for the claim that they solved the issue of training vanilla RNNs)
https://x.com/karansdalal/status/1909312851795411093 (Previous work that is relevant)
https://x.com/karansdalal/status/1909312851795411093 (Tweet of the current paper)
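For readers who haven't clicked through: the core idea in this test-time-training line of work, as I understand it, is that the recurrent hidden state is itself the weights of a small model, and the state update is a gradient step on a self-supervised loss. Here is a minimal numpy sketch of that idea (illustrative only, with made-up loss and sizes, not the authors' actual architecture):

```python
# Sketch: "hidden state = weights of a tiny model, recurrence = a training step".
import numpy as np

rng = np.random.default_rng(0)
d = 16
W = np.zeros((d, d))                      # hidden state: weights of a linear model
lr = 0.1

def step(W, x):
    # self-supervised target: reconstruct the input from a corrupted view of it
    x_corrupt = x + 0.1 * rng.normal(size=d)
    pred = x_corrupt @ W
    grad = np.outer(x_corrupt, pred - x)  # d/dW of 0.5 * ||x_corrupt @ W - x||^2
    W_new = W - lr * grad                 # the state update IS a gradient step
    return W_new, x @ W_new               # emit output using the updated state

tokens = rng.normal(size=(8, d))          # a toy "sequence" of embeddings
for x in tokens:
    W, y = step(W, x)
```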
One note: I actually expect AI progress to slow down for at least a year, and potentially up to 4-5 years, due to the tariffs inducing a recession, but this doesn't matter for the debate over whether LLMs can get to AGI.
I agree with the view that recurrence/hidden states would be a game-changer if they worked, because they would let the LLM have a memory, and memoryless humans are way, way less employable than people with memory, because it's much easier to meta-learn strategies when you have one.
That said, I'm uncertain whether recurrence is actually necessary for LLMs to learn better/have a memory or state that lasts beyond the context window, and I also think that meta-learning over long periods/having a memory is probably the only hard bottleneck left that might not get solved (though it likely will be, if these new papers are anything to go by).
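As an illustration of why I'm unsure recurrence is strictly necessary, here is a deliberately dumb sketch of memory that outlives the context window with no architectural change at all: the agent writes notes to an external store and retrieves them by similarity in later sessions. The hashing "embedding" and the NoteMemory class below are toy stand-ins I made up, not any lab's actual setup:

```python
import numpy as np

def embed_fn(text):
    # toy stand-in for a real embedding model: hash words into a fixed vector
    v = np.zeros(64)
    for w in text.lower().split():
        v[hash(w) % 64] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v

class NoteMemory:
    """Stores (text, vector) notes across sessions; retrieves by similarity."""
    def __init__(self):
        self.notes = []

    def write(self, text):
        self.notes.append((text, embed_fn(text)))

    def read(self, query, k=3):
        q = embed_fn(query)
        ranked = sorted(self.notes, key=lambda note: -float(note[1] @ q))
        return [text for text, _ in ranked[:k]]

memory = NoteMemory()
memory.write("Strategy that worked last time: decompose the task before estimating it.")
relevant = memory.read("how should I plan this new task?")
# `relevant` would simply be prepended to the next context window.
```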
I basically agree with @gwern's explanation of what LLMs are missing that makes them not AGIs (at least not without a further couple of OOMs of compute, and in the worst case they need exponential compute to get linear gains):
https://www.lesswrong.com/posts/deesrjitvXM4xYGZd/?commentId=hSkQG2N8rkKXosLEF
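To spell out that worst case under a simple log-scaling assumption (my framing, not gwern's): if capability grows logarithmically with compute, each fixed increment of capability costs a constant multiplicative factor of compute,

$$\text{capability} \;=\; k\,\log_{10}\!\left(\frac{C}{C_0}\right) \quad\Longrightarrow\quad \text{each } +1 \text{ of capability requires } C \mapsto 10^{1/k}\, C,$$

so linear gains require exponential compute, and "a further couple of OOMs" only buys about $2k$ extra units of capability.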
I think at most one new intervention is necessary, and one could argue that zero new insights are needed.
The other part is that I basically disagree with this assumption, and more generally I have a strong prior that a lot of problems get solved by muddling through/using semi-dumb strategies that work way better than they have any right to:

I also depart from certain other details later; for instance, I think we'll have better theory by the time we need to align human-level AI, and "muddling through" by blind experimentation probably won't work or be the actual path taken by surviving worlds.
I think most worlds that survive the transition from AGI to ASI (one lasting at least 2 years, if not longer) will almost certainly include a lot of dropped balls and fairly blind experimentation (helped along by the AI control agenda), as well as the world's offense-defense balance shifting to a more defensive equilibrium.
I do think most of my probability mass for AI that can automate all AI research is in the 2030s, but this is broadly due to the tariffs and to scaling up new innovations taking some time, rather than to the difficulty of AGI being high.
Edit: @Vladimir_Nesov has convinced me that the tariffs delay things only slightly, though my remaining concern is the tariffs causing an economic recession, which would make AI investment fall quite a bit for a while.
probability mass for AI that can automate all AI research is in the 2030s … broadly due to the tariffs and …

Without AGI, scaling of hardware runs into the financial wall of ~$200bn for an individual training system in 2027-2029. Any tribulations on the way (or, conversely, efforts to pool heterogeneous and geographically distributed compute) only delay that point slightly (compared to the current pace of increase in funding), and you end up in approximately the same place, slowing down to the speed of advancement in FLOP/s per watt (or per dollar). Without transformative AI, anything close to the current pace is unlikely to last into the 2030s.
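A back-of-envelope version of this argument, with every number below being my own illustrative assumption rather than Vladimir_Nesov's (a ~$5bn frontier training system in 2025, funding growing ~3x/year until it hits the ~$200bn wall, and ~1.35x/year improvement in FLOP per dollar after that):

```python
# Back-of-envelope for the "$200bn training-system wall". Assumed inputs only.
cost, growth, wall = 5e9, 3.0, 200e9   # 2025 system cost, funding growth/yr, wall
hw_gain = 1.35                         # assumed FLOP-per-dollar improvement/yr
compute = 1.0                          # relative training compute, 2025 = 1

for year in range(2025, 2033):
    print(f"{year}: ~{compute:6.0f}x 2025 training compute, ~${cost / 1e9:.0f}bn system")
    next_cost = min(cost * growth, wall)          # spending capped at the wall
    compute *= (next_cost / cost) * hw_gain       # more money AND better hardware
    cost = next_cost

# Under these assumptions spending hits the ~$200bn wall around 2028-2029,
# after which compute growth drops from ~4x/year to ~1.35x/year.
```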