Thanks for these further pointers! I won’t go into detail; I’ll just say that I take the bitter lesson very seriously and that I think most of the ideas you mention won’t be needed for superintelligence. Some intuitions for why I don’t take the typical arguments about the limits of transformers very seriously:
If you hook up a transformer to itself with a reasoning scratchpad, then I think it can in principle represent any computation, not just what fits into a single forward pass.
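For concreteness, here is a minimal sketch of the scratchpad loop I have in mind, written against a hypothetical `model.generate(prompt, stop=...)` interface (the names are illustrative, not any particular library’s API):

```python
# Minimal sketch of a transformer-plus-scratchpad loop.
# `model.generate` is a hypothetical autoregressive interface; the point is only
# that by feeding the model's own output back into its context, the system can
# carry out multi-step computations that would never fit in one forward pass.

def solve_with_scratchpad(model, task: str, max_steps: int = 32) -> str:
    context = f"Task: {task}\nScratchpad:\n"
    for _ in range(max_steps):
        # One forward pass produces one reasoning step.
        step = model.generate(context, stop="\n")
        context += step + "\n"
        if step.startswith("Answer:"):
            return step.removeprefix("Answer:").strip()
    return "no answer within step budget"
```

The depth of the computation is then bounded by the outer loop, not by the depth of a single forward pass.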
On causality: Once we shift to the agent paradigm, transformers naturally get causal (interventional) data, since they see how the “world responds” to their own actions.
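Again just as an illustration, assuming a hypothetical gym-style environment with `reset()` and `step(action)` (illustrative names, not a specific library):

```python
# Sketch of how the agent paradigm yields interventional data.
# Each (state, action, next_state) triple records how the world responded to the
# agent's own intervention, which is exactly the kind of data that causal
# critiques say pure next-token prediction on passive text lacks.

def collect_interventional_data(env, policy, episodes: int = 10):
    dataset = []
    for _ in range(episodes):
        state = env.reset()
        done = False
        while not done:
            action = policy(state)               # the agent intervenes
            next_state, done = env.step(action)  # the world responds
            dataset.append((state, action, next_state))
            state = next_state
    return dataset
```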
General background intuition: Humans developed general intelligence and a causal understanding of the world through evolution, without anyone deliberately designing us.