Will Automating AI R&D not work for some reason, or will it not lead to vastly superhuman superintelligence within 2 years of “~100% automation” for some reason?
My current main guess is that it will more-or-less work, and then it will not lead to vastly superhuman superintelligence.
Specifically: I expect that the current LLM paradigm is sufficient to automate its own in-paradigm research, but that the paradigm itself is AGI-incomplete. This means it's possible to "skip to the end" by automating it to superhuman speeds, but what lies at that end won't be AGI.
Like, much of the current paradigm is "make the loss go down/the reward go up by trying various recombinations of slight variations on a bunch of techniques, constructing RL environments, and doing not-that-deep math research". That means the rewards are verifiable across the board, so there's "no reason" why RLVR plus something like AlphaEvolve won't work for automating it. But it's still possible to automate ~all of the AI research currently happening at the frontier labs and still fail to get to AGI.
(Though it’s possible that what lies at the end will be a powerful-enough non-AGI AI tool that it’ll make it very easy for the frontier labs to then use it to R&D an actual AGI, or take over the world, or whatever. This is a subtly different cluster of scenarios, though.)
Why do you say that? They seem functionally very similar, leaving aside that the second possibility suggests we're in for a terrifying fast takeoff, for the reasons outlined above about the lumpiness of innovations.