All the impressive ML results so far have only worked either in a narrow subspace around the training data (e.g. LLMs, still mostly the case even with RL), or in very small worlds (e.g. pure-RL game-players). There has been ~zero progress on fluid/general intelligence. Therefore, extrapolating straight lines on graphs predicts ~zero progress on fluid/general intelligence by doing more of the same kind of thing. The induction on increasing ‘intelligence’ that lots of other people appeal to only works by inappropriate compression.
I largely agree with this, yeah. It would need some probability caveats; I put nontrivial probability, like O(1–5%), on various scenarios leading to AGI within 10 years—largely the sorts of things people talk about, and generally “maybe I’m just confused and GPT architecture / training plus RLVR and a bit more whatever basically implements a GI seed” or “maybe I’m totally confused about ‘GI seed’ being much of a thing or being ~necessary for world-ending AI”.
I also wouldn’t have quite so tight a categorization of sources of capabilities. Cf. https://www.lesswrong.com/posts/FG54euEAesRkSZuJN/ryan_greenblatt-s-shortform?commentId=QBca6vhdeKkjyNLKa
It’s still likely that we live in something like the 2011-Yudkowsky world as described in this tweet, with AGI to come from a lot of accumulation of insight. ML successes misleadingly make that world look falsified, if you aren’t tracking what they are and aren’t successes at.
Yeah, something like that. (I feel I have very little handle on how much insight is left, social dynamics around investment in conceptual “blue” capabilities research, etc.; hence very broad timelines. I also don’t much predict “there aren’t other major, impactful, discontinuous milestones before true world-ending AGI”; GPTs seem to be such a thing.)
Like, it’s evidence against the way of thinking that says understanding of intelligence is important. When you say (implicitly) ‘we probably need lots of AGI seedstuff’, I want to say ‘why isn’t the thought process you’re using to say that surprised, and downvoted, by how little stuff we needed to make LLMs?’.
It should probably be slightly directionally downvoted (though I’m not sure which preregistered hypotheses are doing better). But I think not very much, because I think that we did not observe “surprisingly obvious / easy / black-box idea generates lots of generally-shaped capabilities”. Partly that’s because the capabilities aren’t generally-distributed; e.g., gippities aren’t good at generating interesting novel concepts on par with humans, AFAIK. Partly that’s because there’s a great big screening-off explanation for the somewhat-generally-distributed capabilities that gippities do have: they got it from the data. I think we observed “surprisingly obvious / easy / black-box idea suddenly hoovers up lots of generally-shaped capabilities from the generally-shaped performances in the dataset (which we thus learned are surprisingly low-hanging fruit to distill from the data)”.
(I do have the sense that there are some things here that I’m not being clear about in my thinking, or at least in what I’ve written. One thing that I didn’t touch on, but that’s relevant, is that humans seem to exhibit this GI seedstuff, so it at least exists; whether it’s necessary to have that seedstuff to get various concrete consequences of AI is another question.)
gippities aren’t good at generating interesting novel concepts on par with humans, AFAIK
Sorry, this is a tangent from this comment thread, but an important one, I think:
LLMs aren’t good at generating interesting novel concepts on par with humans in deployment. But in deployment, we’ve turned off the learning, so of course they’re bad at inventing interesting novel concepts. A brilliant human with anterograde amnesia would also be quite bad at inventing interesting novel concepts.
It seems much more unclear whether LLMs develop interesting new concepts in training, while they’re still learning.
They probably generate all kinds of interesting intuitive / S1 concepts and fine distinctions that allow them to get so good at the next token prediction task, just as experts in a domain generally learn all kinds of specialized conceptual representations.
(Though, apparently, and unlike human experts, the models don’t thereby learn words for those concepts, or have the ability to introspect and put handles on their conceptual representations, any more than I can introspect into how my visual cortex works.)
More speculatively, an LLM agent might invent new explicit concepts for itself and learn to use them, in RLVF training, especially if different rollouts are allowed to communicate with each other via a shared scratch-pad or something. I don’t think we have seen anything like this, and I’m not particularly expecting it at current capability levels, but I don’t think we can rule it out.
When we say that LLMs don’t generate new concepts, we’re selling them short. The part of the whole LLM system that has something-like-fluid intelligence to come up with new concepts is the training process, which we basically never interact with (currently).
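As a purely illustrative sketch of the shared-scratch-pad setup speculated about above (everything here, including `propose_answer` and `reward`, is a made-up toy stand-in, not any real model or training API):

```python
# Hypothetical sketch: parallel rollouts sharing a scratch-pad during RL-style
# training, so a useful note written by one rollout can shape later rollouts.
# `propose_answer` and `reward` are toy stand-ins, not real model/RL APIs.
import random

scratch_pad: list[str] = []  # shared notes, visible to every rollout in later steps

def propose_answer(notes: list[str]) -> tuple[str, str]:
    """Toy policy: returns (answer, note_to_share), conditioned on the shared notes."""
    if any("the answer is even" in n for n in notes):
        return "42", ""                      # exploit a 'concept' another rollout wrote down
    if random.random() < 0.2:
        return "42", "the answer is even"    # occasionally stumble on a reusable insight
    return str(random.randint(0, 99)), ""

def reward(answer: str) -> float:
    return 1.0 if answer == "42" else 0.0    # verifiable reward for a toy task

for step in range(3):
    for _ in range(8):                       # several rollouts per training step
        answer, note = propose_answer(scratch_pad)
        if note and reward(answer) > 0:
            scratch_pad.append(note)         # keep only notes attached to rewarded rollouts
    print(f"step {step}: shared scratch-pad = {scratch_pad}")
```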
I think I would generally avoid saying that LLMs or current learning programs don’t generate new concepts simpliciter. Plausibly I did, but if so, I’d hopefully be able to claim that it was a typo or elision for space/clarity. What I said here was “good at generating interesting novel concepts on par with humans”. I know perfectly well that LLMs gain concepts (after a fashion) during training and have written about that. I would dispute them using / having concepts in the same relevant ways that humans have them though.
I’m confident that there’s lots of interesting content generally speaking contained in LLMs, gained through training, which is unknown to all humans. (The same could be said of other systems such as AlphaGo, and even old-style Stockfish during runtime if you admit that.)
(Though, apparently, and unlike human experts, the models don’t thereby learn words for those concepts, or have the ability to introspect and put handles on their conceptual representations, any more than I can introspect into how my visual cortex works.)
So like, yeah, they have something kinda related to human concepts in their full power, but not the thing itself. This fits with my claim that they don’t have much originary general intelligence; they have distilled GI from humans; some more distilled stuff that’s not exactly “knowledge from humans” but is kinda narrower (like, LLMs know word collocation frequencies like no human does); and some other stuff that’s not very general. I posit.
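To make the collocation-frequencies point concrete, here is a minimal sketch (the corpus and numbers are made up purely for illustration) of the kind of adjacent-word statistic a next-token predictor is implicitly pushed to absorb over its whole training data:

```python
# Toy illustration: the kind of adjacent-word (collocation) statistics a
# next-token predictor is implicitly forced to absorb over its training data.
from collections import Counter

corpus = (
    "the cat sat on the mat . "
    "the dog sat on the rug . "
    "the cat chased the dog ."
).split()

unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))   # counts of adjacent word pairs

def next_word_freq(current: str, nxt: str) -> float:
    """Empirical frequency of `nxt` following `current` in the toy corpus."""
    return bigrams[(current, nxt)] / unigrams[current]

print(next_word_freq("cat", "sat"))   # 0.5: "cat" is followed by "sat" half the time here
print(next_word_freq("the", "cat"))   # ~0.33
```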
Thanks! The distinction between “generating capabilities” and “hoovering up capabilities” is another small click for me.
Can you give an example of a thing that you’d be surprised if an AI did in the next, say, 1.5 years?
Kill everyone? I’d be pretty surprised, like 1 in 100 or 200 surprised or something like that.
Generating interesting novel concepts on par with humans? See https://www.lesswrong.com/posts/sTDfraZab47KiRMmT/views-on-when-agi-comes-and-on-strategy-to-reduce?commentId=dqbLkADbJQJi6bFtN
See also https://www.lesswrong.com/posts/sTDfraZab47KiRMmT/views-on-when-agi-comes-and-on-strategy-to-reduce?commentId=HSqkp2JZEmesubDHD#HSqkp2JZEmesubDHD
Now, would you list something impressive that you do expect an AI to do in the next 1.5 years (that I might not say)?
I’m more trying to operationalize “interesting novel concept.” (But, it does look like we had approximately this conversation before and I’ll try to reread first. I think basically you said “they generate a novel concept that hadn’t been generated before and also people go on to use that concept in industry/science”, does that sound right?)
Part of what brought me here was remembering you saying:
My guess is that all or very nearly all human children have all or nearly all the intelligence juice. We just, like, don’t appreciate how much a child is doing in constructing zer world.
And wanting an example of a thing that’s more like “what’s something that’d make you go ‘okay, this was in fact as smart as a four year old’” (and therefore either the end is nigh, or we’re about to learn that children in fact did not have nearly all the intelligence juice).
I’ll try to think about some bets for ~1 year from now.
I think basically you said “they generate a novel concept that hadn’t been generated before and also people go on to use that concept in industry/science”, does that sound right?
Yeah, basically. I’m trying to be concrete here, and just saying “their intellectual output could be judged like human intellectual output is judged”.
And wanting an example of a thing that’s more like “what’s something that’d make you go ‘okay, this was in fact as smart as a four year old’” (and therefore either the end is nigh, or we’re about to learn that children in fact did not have nearly all the intelligence juice).
It’s a good question but it’s hard because that stuff looks from the outside like mostly pretty easy tasks. The way in which it is not easy is the way in which it is not “a task”. I guess, “very sample efficient learning” would be a concrete thing that 4yos do.
Nativization of a pidgin into a creole language might be an example, especially given that it seems to be largely underwritten by the cognitive plasticity of the linguistic developmental window.
A creole is believed to arise when a pidgin, developed by adults for use as a second language, becomes the native and primary language of their children – a process known as nativization.
Given that Opus 4.6 fails on very basic Classical Greek exercises (evidence towards “jaggedness”/bad “OOD generalization” even on very simple (though knowledge-heavy) tasks), I would be very surprised if it managed to successfully do something as unusual/OOD as creolizing a pidgin. It might also be very difficult to train it to do so, as it’s a very open-ended thing, and thus it’s very unclear how to specify a reward, and I would guess there isn’t much data on the internet that could be used for training.