One intuition I think people have about AGI coming very soon is that if some loop is closed, or some synergy is realized, then that sets off the RSI / opens the floodgates. Now, I do fairly strongly think that’s how things work, but I think that many people have too low a bar for what they consider able to plausibly set off such a chain reaction. An intuition pump I’d offer: Consider incremental / cookie clicker games (such as the paperclips one, which I won’t link to because mild infohazard / timewaster). Such games basically consist in wireheading on the sense of “...woah I just made a breakthrough that unlocks synergy / feedback / recursive growth / exponential growth / unbounded growth / automated progress / …”. But what you eventually hopefully learn is that each breakthrough quickly saturates the value of what it provides, and you’re still stuck, basically, just at a slightly higher level / on a slightly different dimension. (Incidentally, this is my top empathetic guess for why some people think human intelligence is near the ceiling, or that FOOM won’t happen.) This is just an intuition pump and I think it will break down eventually, but not necessarily automatically the first time or the fifth time.
(Even if you use the js console, you still have to locate each new button, and in some cases work out the pressing pattern, separately. inb4 “an LLM could beat this autonomously” yeah ok fine but AGI research is harder)
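A toy sketch of the intuition pump above, not a model of anything real. It assumes (my assumption, purely for illustration) that each "breakthrough" raises a ceiling by a fixed multiplier and that value grows geometrically until it hits the current ceiling. The names `simulate`, `growth`, and `cap_multiplier`, and all the numbers, are made up.

```python
def simulate(steps, breakthrough_every, growth=1.5, cap_multiplier=10.0):
    """Return the value at each step of a saturating-breakthrough toy game."""
    value, cap = 1.0, 10.0  # hypothetical starting value and ceiling
    history = []
    for t in range(1, steps + 1):
        if t % breakthrough_every == 0:
            cap *= cap_multiplier  # a "breakthrough" raises the ceiling
        value = min(value * growth, cap)  # fast growth until the ceiling is hit
        history.append(value)
    return history


history = simulate(steps=60, breakthrough_every=20)

# Count the steps on which nothing changed, i.e. the player was stuck at a ceiling.
stalled = sum(1 for a, b in zip(history, history[1:]) if a == b)
print(f"final value: {history[-1]}, stalled on {stalled} of {len(history) - 1} steps")
```

With these made-up parameters most steps are spent stuck at a ceiling, even though each breakthrough feels like exponential growth for a few steps right after it lands.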
The red pill is that even humans are not an upper bound for how hard this can be: even a fully human-equivalent AI doesn’t yet close an RSI loop that goes FOOM, and it would still take a lot of time after that, even when humans no longer have anything to contribute. This is actually a popular view among people who say AGI remains a normal technology and just keeps scaling the economy, with maybe 20% growth per year rather than a doubling of industry every few days until the Sun is eaten, with a Sun-scale amount of probes soon en route to distant galaxies.
On the other hand, evolution doesn’t have a mind, so reaching even the human level is not necessary to close a loop that goes on to automatically reach human capabilities and then goes further; the only question is speed and feasibility. I think automated sample-efficient learning (that adapts to any consideration that comes up) is plausibly the last piece, with RLVR already sample-efficient (with respect to the data defining tasks) and able to do the cognitive heavy lifting, and pretraining already able to form a coherent picture of everything that’s been discovered so far.
Automation of routine AI R&D (merely carrying out all the plumbing of training data preparation and model training that is currently done by humans at AI companies, rather than inventing anything new at the object level of this process) is plausibly a straightforward way of getting there. RLVR-trained agents will plausibly be able to manage this soon, and it’s no doubt being explicitly attempted at every step where it’s feasible to attempt. The Sonnet-Opus-Mythos story suggests that the 100T param models of 2028-2029 might well suffice to manage every single routine step.
There still remains the possibility that the advancements from a closed loop initially remain slow. I expect cognitive self-sufficiency for AI quickly generates a hoard of conceptual inventions, something like the scientific literature that’s generated much faster, which pushes through any remaining hobblings that would otherwise promise to keep some of the other things slow for a while.
Interesting points… I don’t think it’s right to say that RLVR does all or even most of the cognitive heavy lifting; it does some of it but not the rest. I agree with your “plausibly”s, but we might put pretty different probabilities, IDK.
My suspicion would be that human-level (in the relevant dimensions) actually is special.
The chimp-human boundary goes from useless for going faster than evolution to eminently useful. But LLMs can talk and solve IMO problems, while chimps can’t, so I wouldn’t count on LLMs not already being beyond this boundary. LLMs merely need to somehow become an engine of a closed loop that works towards stronger cognitive capabilities, without necessarily themselves possessing such capabilities, or even broad human-level capabilities. Evolution is too slow to usefully do this within modern compute, but some LLM-juggling process could be much faster. And humans, when not part of the closed loop of human culture and civilization, remain as useless as chimps in reaching for superintelligence.
(RLVR is clearly deficient in the jaggedness of its results in practice, but that’s plausibly a problem of RLVR training data not being bitter-pilled. And conceptual invention might need many steps of using RLVR-trained reasoning to formulate new RLVR tasks for training the next step. So automation of generation of training data for RLVR, and of its application in training, might compensate for these issues well enough.)
(such as the paperclips one, which I won’t link to because mild infohazard / timewaster)
True.. Such a fun game though.. Maybe I can play cookie clicker for 1 hour.. and see how far I’m able to get in that short time… while waiting for my training runs to end..
I think that’s the fundamental question. Does LLMs’ ability to autonomously perform basic hyperparameter search get them far enough that they can perform architecture optimization? Does that get them far enough that they can pursue new paradigms for language modeling[1]? If it takes 1 intelligence to go from 1 to 2, but 2.5 intelligence to go from 2 to 3, then 2 is where you stop.
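A minimal sketch of that stopping condition. The `required` table is hypothetical: only the first two entries (1 to go from 1 to 2, 2.5 to go from 2 to 3) come from the comment above, and the contrast case is invented for illustration.

```python
def final_level(start, required):
    """Climb levels while the current level meets the bar for the next step.

    `required` maps a level k to the (hypothetical) intelligence needed to
    move from k to k + 1. Levels missing from the map are treated as unreachable.
    """
    level = start
    while level in required and level >= required[level]:
        level += 1
    return level


# Numbers from the comment: 1 intelligence gets you from 1 to 2,
# but 2.5 is needed to get from 2 to 3, so the loop stalls.
stalls = final_level(1, {1: 1.0, 2: 2.5})

# Invented contrast: if each bar sits at or below the current level,
# the loop keeps climbing until the table runs out.
continues = final_level(1, {1: 1.0, 2: 2.0, 3: 2.9})

print(stalls, continues)
```

The whole question in this framing is whether the required-intelligence curve stays at or below the current level at every step, which the sketch cannot tell you.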
The practical answer is that, from our perspective, “as smart as a human engineer across all relevant domains” gets us to “the best AI humans will ever be able to create” quite a bit quicker than we’d otherwise get there, and without the need for any further input from human engineers.
“as smart as a human engineer across all relevant domains”
I dispute that LLMs are like this; I think they and their training have a bunch of performance capabilities and not much ability to generate those de novo.
gets us to “the best AI humans will ever be able to create” quite a bit quicker than we’d otherwise get there,
Maybe; to some extent I’d expect this to hit various walls, though not sure; Amdahl’s law; and IDK how people get very confident of this.
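For reference, a small sketch of Amdahl’s law as it might apply here: if only a fraction of the R&D pipeline is sped up, the overall speedup is capped by the part that isn’t. The 90% automated fraction is an assumption picked for illustration, not a claim about any real pipeline.

```python
def amdahl_speedup(automated_fraction, factor):
    """Overall speedup when `automated_fraction` of the work runs `factor` times faster."""
    if not 0.0 <= automated_fraction <= 1.0:
        raise ValueError("automated_fraction must be between 0 and 1")
    if factor <= 0:
        raise ValueError("factor must be positive")
    return 1.0 / ((1.0 - automated_fraction) + automated_fraction / factor)


# Hypothetical: 90% of the work is automated; the remaining 10% is the wall.
for factor in (10, 100, 1_000_000):
    print(factor, round(amdahl_speedup(0.9, factor), 2))
```

However large `factor` gets, the result here stays below 1 / (1 - 0.9) = 10, which is the kind of wall the comment is pointing at.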
Just thinking out loud here, wondering how true this is, because of course incremental games are not quite the real world, and having unbelievable hours of ‘content’, often with stalling and offline time, is the norm. Things are quite complex, but if you buy that it’s “easy” for someone to make money in a guru-style way (which I get if people don’t, given how many get-rich-quick course scams there are), you probably believe more in RSI, because you believe “oh, you can use the money to easily automatically make more money”. The real world is of course complex, and most jobs require a lot of manual “prove you are human” effort in some indirect way, dealing with a lot of proprietary software.
In an incremental game you’re also stuck in a “log scale” sort of way. When you go from 10^10 to 10^12 it’s just numbers that change. But in some sort of proto-AGI system this could very well be seen as 100x of… something. That could represent ‘foom’ way more than it may appear on a log scale.
It is odd to think about, because we do seemingly have stuff like “100x in compute”; it just hasn’t seemed like the pieces have been put together for a kind of power-getting system, with computer use and command line use still seeming a bit of a prototype compared to where they could be. This “100x” could mean nothing or everything depending on what it represents: if it was “100x copies” for some botnet computer virus using a zero-day, that could be the most relevant thing, whereas even something like “100x money” may not be scalable, or may dead-end without a good way to use it (in the same way as an incremental, lol).
Where I have doubts about FOOM/RSI is that LLMs seem to me in many ways a fundamentally different type of intelligence than organic life.
Psychometrics shows that general intelligence improves human abilities across a broad range of domains. If you take this view and apply it to AI, it doesn’t quite work. I leverage AI very heavily at work, and sometimes it is phenomenal, often it is not, and occasionally it makes mistakes a grade schooler would not (I’m using Opus 4.6). The “intelligence” is very unevenly distributed and skewed towards verifiable domains.
I tend to see LLMs as a grab bag of heuristics and concepts. And I see general intelligence as effectively pattern matching both within a domain and across domains. RLVF enhances the base model’s ability to pattern match within a domain (programming) but doesn’t seem to extend evenly outside of it.
I tend to land with Steve Byrnes that this particular architecture is unlikely to scale to AGI (I use a definition of a system capable enough to serve as a drop-in replacement for all remote workers), although it could definitely replace a large percentage of them.
I do not hold these views with high confidence, however, and am always open to having my mind changed.
[1] I’m not suggesting that this is the exact trajectory.