I strongly agree with this post, but one question:
Assuming there exists a simple core of intelligence, that core is probably some kind of algorithm.
When LLMs learn to predict the next token of a very complex process (like computer code or human thinking), they fit very high-level patterns and learn many algorithms (addition, multiplication, matrix multiplication, and so on), as long as those algorithms predict the next token well in certain contexts.
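To make that concrete, here is a toy sketch (the task framing and all names are mine, purely for illustration, not any lab's actual pipeline) of how addition gets wrapped into next-token prediction: once the operand space is large, the only way to reliably predict the characters after the “=” is to actually compute the sum.

```python
import random

# Toy sketch: framing addition as next-token prediction (illustrative only).
def make_example() -> str:
    a, b = random.randrange(1000), random.randrange(1000)
    return f"{a}+{b}={a + b}"

corpus = [make_example() for _ in range(10_000)]

# Next-token training pairs: (prefix, next character). The pairs whose prefix
# ends at or after "=" can only be predicted reliably by computing the sum;
# with ~10^6 possible operand pairs, pure memorization does not generalize.
pairs = [(s[:i], s[i]) for s in corpus for i in range(1, len(s))]

example = next(p for p in pairs if p[0].endswith("="))
print(example)  # e.g. ('417+86=', '5') -- predicting this token requires addition
```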
Now maybe the simple core of intelligence is too complex an algorithm to be learned when predicting a single next token.
However, a long chain-of-thought can combine these relatively simple algorithms (each used to predict a single next token) in countless possible ways, forming far more advanced algorithms, with a lot of working memory. Reinforcement learning on the chain-of-thought can gradually discover the best of these advanced algorithms for solving a great variety of tasks (any task that is cheaply verifiable).
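As a gloss on what “RL on the chain-of-thought” means here, a minimal sketch (the task, the step set, and the REINFORCE-style update are all invented stand-ins, not a real training setup): the policy chooses among simple steps, a chain earns reward only if it reaches a cheaply verifiable answer, and successful chains get reinforced.

```python
import random

# Minimal sketch of RL over chains of simple steps with a verifiable reward.
STEPS = ["add_one", "double", "halve"]   # the "simple per-step algorithms"
TARGET = 10                              # cheaply verifiable goal

def apply_step(x: int, step: str) -> int:
    return {"add_one": x + 1, "double": x * 2, "halve": x // 2}[step]

weights = {s: 1.0 for s in STEPS}        # the policy's step preferences

def sample_chain(start: int = 3, max_len: int = 6):
    x, chain = start, []
    for _ in range(max_len):
        step = random.choices(STEPS, [weights[s] for s in STEPS])[0]
        chain.append(step)
        x = apply_step(x, step)
        if x == TARGET:                  # the cheap verifier
            break
    return chain, x

for _ in range(5000):
    chain, result = sample_chain()
    if result == TARGET:                 # reward only on verified success
        for step in chain:
            weights[step] += 0.05        # crude credit assignment, on purpose

print(weights)  # step combinations that reach the target get upweighted
```

The point of the toy is just the shape of the loop: no single step is smart, but blind reinforcement over whole chains gradually selects the multi-step algorithms that pass the verifier.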
Given that evolution used brute force to create the human brain, don’t you think it’s plausible for this RL loop to use brute force to rediscover the simple core of intelligence?
PS: This is just a thought, not a crux. It doesn’t conflict with your conclusions, since LLM AGI being a possibility doesn’t mean non-LLM AGI isn’t a possibility. And even if the simple core of intelligence were discovered by RL on LLMs, the consequences might be the same.
New large-scale learning algorithms can in principle be designed by (A) R&D (research taste, small-scale experiments, puzzling over the results, iterating, etc.), or (B) some blind search process. All the known large-scale learning algorithms in AI to date, from the earliest Perceptron to the modern Transformer, have been developed by (A), not (B). (Sometimes a few hyperparameters or whatever are set by blind search, but the bulk of the real design work in the learning algorithm has always come from intelligent R&D.) I expect that to remain the case: See Against evolution as an analogy for how humans will create AGI.
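For what it’s worth, here is roughly what the parenthetical’s “a few hyperparameters set by blind search” looks like in code (the objective below is a made-up stand-in for a real training run): the blind search picks scalars inside an algorithm humans already designed, which is very different from searching over learning algorithms themselves.

```python
import random

def train_and_evaluate(lr: float, batch_size: int) -> float:
    # Stand-in for launching a full training run and reading off a metric;
    # the quadratic bowl below is fabricated purely so the script runs.
    return -((lr - 3e-4) ** 2) - 1e-6 * (batch_size - 64) ** 2

# (B)-style blind search as it is actually used today: tuning a few scalars
# inside a human-designed learning algorithm, not designing the algorithm.
candidates = [
    (random.uniform(1e-5, 1e-2), random.choice([16, 32, 64, 128]))
    for _ in range(100)
]
best = max(candidates, key=lambda hp: train_and_evaluate(*hp))
print(best)  # the search sets numbers; SGD, attention, etc. came from (A)
```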
Or if you’re talking about (A), but you’re saying that LLMs will be the ones doing the intelligent R&D and puzzling over learning algorithm design, rather than humans, then … maybe kinda, see §1.4.4.
Sorry if I’m misunderstanding.
To be honest, I’m very unsure about all of this.
I agree that (B) has never happened. Another way of saying this is that “algorithms for discovering algorithms” have only ever been written by humans, and never directly discovered by another “algorithm for discovering algorithms.”
The LLM+RL “algorithm for discovering algorithms” is far less powerful than the simple core of intelligence, but far more powerful than any other “algorithm for discovering algorithms” we have ever had before: it has already discovered algorithms for solving IMO-level math problems.
Meanwhile, the simple core of intelligence may also be the easiest “algorithm for discovering algorithms” to discover (by another such algorithm): evolution found it, the entire algorithm fits inside the human genome, and the algorithm seems to be simple. The first time (B) happens may be the only time (B) happens (before superintelligence).
I think it’s plausible both that the simple core of intelligence is found by human researchers, and that it just emerges inside an LLM with much greater effective scale (from being both bigger and more efficient), subject to much greater amounts of chain-of-thought RL.