I’ll just pose the mandatory comment about long-horizon reasoning capacity potentially being a problem for something like Agent-2. To some degree, a delay in that part of the model produces pretty large differences in the distribution of timelines here.
“Just RL and Bitter Lesson it on top of the LLM infrastructure” is honestly a pretty good take on average, but it feels like there are a bunch of unknown unknowns there in terms of ML? There’s a view that there are 2 or 3 major scientific research problems to get through at that point, which might slow down development enough that we hit a plateau before we get to the later parts of this model.
The reason I’m being persistent with this view is that the mainstream ML communities in areas such as Geometric Deep Learning, MARL, and RL & Reasoning are generally a bit skeptical of some of the underlying claims about what LLMs + RL can do (at least when I’ve talked to them at conferences; the vibe is something like 60-70% of people, though do beware their incentives as well), and they point towards reasoning challenges like specific variations of blocks world, or underlying architectural constraints within the model architectures. (For blocks world, the basic reasoning tasks are starting to be solved according to benchmarks, but the more steps are involved, the worse it gets.)
I think the rest of the geopolitical modelling is rock solid and that you generally make really great assumptions. I would also want to see more engagement with these sorts of skeptics.
People like Subbarao Kambhampati, Michael Bronstein, Petar Velickovic, Bruno Gavranovic, or someone like Lancelot Da Costa (among others!) are all really great researchers from different fields who I believe will each tell you different things that are a bit off with the story you’re proposing. These are not obvious objections either: I can’t tell you a good story about how inductive biases in data types implicitly frame RL problems in ways that make certain types of problems hard to solve, and I can’t really evaluate to what extent their models versus your models are true.
So, if you want my vote for this story (which you obviously do, it is quite important after all (sarcasm)), then maybe going to the next ICML and engaging in debates with these people might be interesting?
I also apologize in advance if you’ve already taken this into account; it does kind of feel like these are different worlds, and the fact that the views clash might be an important detail.
I worked on geometric/equivariant deep learning a few years ago (with some success, leading to two ICLR papers and a patent, see my google scholar: https://scholar.google.com/citations?user=E3ae_sMAAAAJ&hl=en).
The type of research I did was very reasoning-heavy. It’s architecture research in which you think hard about how to mathematically guarantee that your network obeys some symmetry constraints appropriate for a domain and data source.
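To make that concrete, here is a minimal sketch of what a symmetry-constrained layer can look like (a Deep-Sets-style permutation-equivariant layer in PyTorch, chosen purely as an illustrative example rather than any specific architecture from that literature):

```python
# Minimal sketch: a permutation-equivariant layer in the Deep Sets style.
# The symmetry guarantee is baked into the architecture itself: permuting the
# n set elements of the input permutes the output rows in exactly the same way.
import torch
import torch.nn as nn


class PermEquivariantLinear(nn.Module):
    """For x of shape (batch, n, d_in): y_i = x_i @ W1 + mean_j(x_j) @ W2 + b."""

    def __init__(self, d_in: int, d_out: int):
        super().__init__()
        self.elementwise = nn.Linear(d_in, d_out, bias=True)   # acts on each element
        self.pooled = nn.Linear(d_in, d_out, bias=False)       # acts on an invariant summary

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        summary = x.mean(dim=1, keepdim=True)   # permutation-invariant pooling
        return self.elementwise(x) + self.pooled(summary)


# Quick check that the equivariance constraint actually holds:
layer = PermEquivariantLinear(d_in=8, d_out=4)
x = torch.randn(2, 5, 8)
perm = torch.randperm(5)
assert torch.allclose(layer(x)[:, perm], layer(x[:, perm]), atol=1e-6)
```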
As a researcher in that area, you have a very strong incentive to claim that a special sauce is necessary for intelligence, since providing special sauces is all you do. As such, my prior is to believe that these researchers don’t have any interesting objection to continued scaling and “normal” algorithmic improvements to lead to AGI and then superintelligence.
It might still be interesting to engage when the opportunity arises, but I wouldn’t put extra effort into making such a discussion happen.
Interesting!
I definitely see your point about how the incentives here are skewed. I would want to ask what you think of the claims about inductive biases and the difficulty of causal graph learning for transformers? A guess is that you could just add it on top of the base architecture as an MOA model with RL in it to solve some of the problems here, but it feels like people from the larger labs might not realise that at first?
Also, I wasn’t only talking about GDL; there are two or three other disciplines that also have their own reasons to believe that AGI will need other sorts of modelling capacity.
Some of the organisations taking explicit bets from other directions are:
https://www.symbolica.ai/
https://www.verses.ai/genius
Symbolica is more or less on the same train as GDL but from a category theory perspective; the TL;DR of their take is that combining various data types into one model requires other kinds of reasoning capacity, and that transformers aren’t expressive or flexible enough to support this.
For Verses, think ACS & Jan Kulveit-style Active Inference models, and the lack of planning with the self in mind that comes from an auto-encoder having no information about where the self-other boundary lies, compared to something that has an action-perception loop.
I might write something up on this if you think it might be useful.
Thanks for these further pointers! I won’t go into detail; I will just say that I take the bitter lesson very seriously and that I think most of the ideas you mention won’t be needed for superintelligence. Some intuitions for why I don’t take typical arguments about the limits of transformers very seriously:
If you hook up a transformer to itself with a reasoning scratchpad, then I think it can in principle represent any computation, beyond what would be possible in a single forward pass.
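As a minimal sketch of the loop I have in mind (where `next_token` is just a stand-in for a single forward pass of any autoregressive model, not a real API):

```python
# Sketch of the "transformer + scratchpad" loop: the model's own output is
# appended to its context and fed back in, so the composite system can run for
# arbitrarily many steps rather than being limited to one bounded-depth pass.
from typing import Callable, List


def run_with_scratchpad(
    next_token: Callable[[List[str]], str],  # one forward pass: context -> next token
    prompt: List[str],
    stop_token: str = "<answer_done>",
    max_steps: int = 1024,
) -> List[str]:
    scratchpad = list(prompt)
    for _ in range(max_steps):
        tok = next_token(scratchpad)   # single forward pass over the current context
        scratchpad.append(tok)         # externalised state the next pass can read
        if tok == stop_token:
            break
    return scratchpad
```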
On causality: Once we change to the agent-paradigm, transformers naturally get causal data since they will see how the “world responds” to their actions.
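A toy sketch of what that data looks like; `ToyEnv` and the random policy here are made up purely for illustration, and the point is just that every logged tuple records how the world responded to an action the agent itself chose, i.e. interventional rather than purely observational data:

```python
# Toy sketch: an agent loop produces (obs, action, next_obs, reward) tuples,
# where next_obs is the world's response to an action the agent chose.
import random


class ToyEnv:
    """A 1-D world whose state changes only in response to the agent's actions."""

    def reset(self) -> int:
        self.pos = 0
        return self.pos

    def step(self, action: int):
        self.pos += action                # the world's response to the intervention
        reward = -abs(self.pos - 10)      # arbitrary goal: get near position 10
        return self.pos, reward


def collect_interventional_data(env, policy, n_steps: int = 100):
    dataset = []
    obs = env.reset()
    for _ in range(n_steps):
        action = policy(obs)                     # the agent intervenes on the world
        next_obs, reward = env.step(action)      # observed consequence of that intervention
        dataset.append((obs, action, next_obs, reward))
        obs = next_obs
    return dataset


data = collect_interventional_data(ToyEnv(), policy=lambda obs: random.choice([-1, +1]))
```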
General background intuition: Humans developed general intelligence and a causal understanding of the world by evolution, without anyone designing us very deliberately.
Subbarao Kambhampati, Michael Bronstein, Petar Velickovic, Bruno Gavranovic or someone like Lancelot Da Costa
I don’t recognize any of these names. I’m guessing they are academics who are not actually involved with any of the frontier AI efforts, and who think for various technical reasons that AGI is not imminent?
edit: OK, I looked them up. Velickovic is at DeepMind; I didn’t see a connection to “Big AI” for any of the others, but they are all doing work that might matter to the people building AGI. Nonetheless, if their position is that current AI paradigms are going to plateau at a level short of human intelligence, I’d like to see the argument. AIs can still make mistakes that are surprising to a human mind: e.g. in one of my first conversations with the mighty Gemini 2.5, it confidently told me that it was actually Claude Opus 3. (I was talking to it in Google AI Studio, where it seems to be cut off from some system resources that would make it more grounded in reality.) But AI capabilities can also be so shockingly good that I wouldn’t be surprised if they took over tomorrow.