I talk about RLVR a bunch in the next post (but from an alignment rather than capabilities perspective).
I wasn’t bringing up imitation learning here to argue that LLMs will not scale to AGI (which I believe, but was not trying to justify in this post), but rather to explain a disanalogy between how LLM capabilities have grown over time and how they would grow under the alleged future scary paradigm.
If you like, you can replace that text with the weaker statement: “Up through 2024, the power of LLMs has come almost entirely from imitation learning on human text…”. That would still work in the context of that paragraph. (For the record, I do think the stronger statement as written is also valid. We’ll find out one way or the other soon enough!)