The power of LLMs comes almost entirely from imitation learning on human text. This leads to powerful capabilities quickly, but with a natural ceiling (i.e., existing human knowledge), beyond which it’s unclear how to make AI much better.
What do we make of RLVR on top of strong base models? Doesn’t this seem likely to learn genuinely new classes of problems currently unsolvable by humans? (I suppose it requires us to be able to write reward functions, but we have Lean and the economy and nature, all of which are glad to provide rewards even if we don’t know the solution ahead of time.)
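To make the point about verifiable rewards concrete, here is a minimal sketch (with a hypothetical toy task, not anything from the discussion above) of a reward function that can score a candidate answer without knowing the solution in advance. Factoring a number is one such task: verification is a single multiplication even when finding the factors is hard.

```python
# Sketch of a "verifiable reward" in the RLVR sense: the checker can
# grade a proposed answer without ever knowing the answer itself.
# Toy task (assumption for illustration): factor an integer n.

def verifiable_reward(n: int, candidate: tuple[int, int]) -> float:
    """Return 1.0 if candidate is a nontrivial factorization of n, else 0.0."""
    p, q = candidate
    if p > 1 and q > 1 and p * q == n:
        return 1.0
    return 0.0

# The reward function never needed to know the factors of 15 ahead of time:
print(verifiable_reward(15, (3, 5)))  # correct factorization earns reward
print(verifiable_reward(15, (2, 7)))  # incorrect proposal earns none
```

Lean proof checking, market profits, and physical experiments play the same role at scale: cheap-to-check signals that don't require the reward designer to already possess the solution.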
I talk about RLVR a bunch in the next post (but from an alignment rather than capabilities perspective).
I wasn’t bringing up imitation learning here to argue that LLMs will not scale to AGI (which I believe, but was not trying to justify in this post), but rather to explain a disanalogy between how LLM capabilities have grown over time and the alleged future scary paradigm.
If you like, you can replace that text with a weaker statement “Up through 2024, the power of LLMs has come almost entirely from imitation learning on human text…”. That would still work in the context of that paragraph. (For the record, I do think the stronger statement as written is also valid. We’ll find out one way or the other soon enough!)