I wouldn’t exactly say that we have a method, more like a field-wide methodology that I expect will probably yield fully automated AI R&D within maybe 6 years and top-human-expert level fluid intelligence within maybe 9 years. (As in, my median to fully automated AI R&D is 6 years, while my median to top-human-expert level fluid intelligence is somewhat longer, because the possibility of a significant lag between these milestones drags my median out by some amount. Note that the median to B is in general not the same as the median to A plus the median from A to B, and that applies in this case.)
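(A toy Monte Carlo sketch of that last point, with made-up illustrative distributions rather than anyone’s actual forecast: when the lag from A to B is right-skewed, the median of the sum can land well away from the sum of the medians.)

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1_000_000

# Made-up right-skewed forecasts, purely illustrative:
years_to_A = rng.lognormal(mean=np.log(6.0), sigma=0.6, size=N)   # years to milestone A
lag_A_to_B = rng.lognormal(mean=np.log(1.5), sigma=1.2, size=N)   # additional lag from A to B

print(np.median(years_to_A))                          # ~6.0: median to A
print(np.median(years_to_A) + np.median(lag_A_to_B))  # sum of the medians, ~7.5
print(np.median(years_to_A + lag_A_to_B))             # median to B: generally differs
```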
(Technically, I actually think that we’ll probably achieve human level fluid intelligence within 8 years, with a large majority of this probability coming from something like the methodology of the current field, including the part of the methodology that involves getting weaker AIs to automate AI research in all kinds of ways while staying mostly focused on this methodology. I think some fraction of the mass comes from very different methodologies, but this is a smaller fraction. The probability of very different methodologies yielding powerful AI in the next ~10 years is boosted significantly by our passing through a particularly salient compute region (somewhat above human lifetime compute) and by greater attention on AI in general.)
As far as why I think this methodology will probably yield “human level fluid intelligence”, this is mostly downstream of a mix of qualitative and quantitative extrapolations combined with thinking that armies of automated AI researchers matching top human AI researchers and doing everything they do but better would probably find a way.
I agree with you that something like the crystallized/fluid distinction is relevant here, and that current LLMs seem to have more of the former. But I’m also confused about where the fluidity ever comes from on this model. Like, I buy that armies of automated researchers which are better at doing everything than top human researchers could probably find a way to figure out how to build “human level fluid intelligence,” but I am confused about how you get to that step in the first place. Why are they better than human researchers at everything when they are still mostly using crystallized intelligence?
(I get that my asking follow-up questions could be frustrating because you may not want to talk about this; I would just ask that, if you’d like to bow out, you give one or two sentences on why, coarsely, e.g. “the whole topic doesn’t seem valuable” or “I don’t want to discuss this publicly” or “you, Tsvi, won’t have a productive convo about this”, etc. I ask because I have a general feeling that there’s an unidentified blob of consensus around short timelines, which IMO is irrational and overconfident (meaning, it reached incorrect conclusions due to systematically, substantively flawed reasoning), and I’m not aware of a good way to get recourse (i.e. to ask “take me to your leader”), so I’d at least appreciate hearing epistemic metadata in order to orient to that situation.)
As far as why I think this methodology will probably yield “human level fluid intelligence”, this is mostly downstream of a mix of qualitative and quantitative extrapolations combined with thinking that armies of automated AI researchers matching top human AI researchers and doing everything they do but better would probably find a way.
What’s the relative weight of these considerations, grossly speaking? Is it more like 90% extrapolations and 10% hand-off or more like the reverse? For the 90% one, if there is one, what are one or two of the most compelling points? For example if it’s extrapolations, what are you extrapolating (and why do you think that extrapolates to high fluid intelligence)? If it’s the hand-off, why do you think that the current-human-research that would be handed off is a kind of research that would soon produce high fluid intelligence (given the apparent circularity of this specific line of reasoning)?
I’m going to reply kinda briefly and then I’m going to bow out with some chance I comment more later.
What’s the relative weight of these considerations, grossly speaking? Is it more like 90% extrapolations and 10% hand-off or more like the reverse?
Hmm, these are tied up a bit. Like, I’m doing the extrapolations to get to automating AI R&D to various extents and then trying to account for acceleration due to automation. But I’m also extrapolating out to human fluid intelligence somewhat, and if this extrapolation looked extremely far off, that would influence my view.
To give a rough (and potentially not that stable on reflection) answer:
(1) 50%: extrapolations towards automation of current AI R&D (both towards full automation and large speedups from very strong partial automation)
(2) 25%: armies of AIs that can automate AI R&D will be able to find a way to get to human level fluid intelligence
(3) 25%: extrapolations towards human level fluid intelligence
This is something like an importance attribution. E.g., if I updated toward a way lower qualitative extrapolation on (3), that would update me quite a bit, but probably not shift my 25th percentile by more than a factor of like 2-3.
For (1), I’m doing a mix of something like time horizon, some informal sense of how good AIs are at research tasks I give them and what I see from others, and some longer list of considerations. And part of this is factoring in earlier speedups moving us along the trajectory faster.
For the hand-off (what I would call (2) in my breakdown), I can see why the argument seems circular, but I don’t think it’s actually circular. My view is like “probably tons of AI R&D via stuff sorta like the current stuff we’re doing would yield this (due to a mix of my observations from ML, my extrapolations of current progress, and thinking of specific things you could do that I won’t mention here without more consideration)” and “probably you’ll be able to get AIs to do this without very high fluid intelligence, though I do think AIs at this point will match somewhat worse human engineers/scientists at moderate time horizon fluid intelligence (e.g. over the course of a few months) on the relevant tasks”.
(My views are also somewhat based on deferring to other people and the forecasting track records of other people.)
Why I’m not engaging much here:
I feel like I’ve personally engaged with your arguments a decent amount and it seems unlikely that further argumentation would change my perspective very much.
I’d guess that there aren’t interested third parties whose views I care a bunch about who would be updated at all by my discussion with you.
I have limited capacity for writing stuff and there is other stuff I want to write about that seems more important in the short term.
I think other formats are better for me than writing comments back and forth.
I’m moderately excited to argue with people about timelines, especially via formats other than writing comments back and forth, but most strongly if (1) I think it might update my views or (2) third parties I care about care about the other person’s views or perspective.
I’d guess that there aren’t interested third parties whose views I care a bunch about who would be updated at all by my discussion with you.
I’m not claiming that you do or should care a bunch about my views, but I am extremely interested in the questions “will LLM minds scale to superintelligence without any radical breakthroughs?” and “if not, can armies of very capable LLM-agents discover those breakthroughs?”.
If I had a stronger understanding here, it would probably influence what predictions I highlight to policymakers.
(e.g., I’m currently telling them that “the labs are planning to automate AI development; we’re already seeing AIs automating many simple AI research tasks [show them a demo of that]; given the METR trendline it’s likely that there will be full or near-full automation in 2027 to 2029; and (while it is hard to forecast with any precision) it is likely that there will be strategically superhuman AI agents weeks to months after that point.”
If I came to think that LLM-agents automating AI research is much less likely to lead rapidly to superintelligence than I currently think is plausible, I would probably not want to focus so much on the above talking points in initial 30-minute meetings, in which we have to be selective about which points to convey.)
Also, generally, if I had more mass on “continuing to scale LLM-agents will not lead to superintelligence, either directly or by enabling the discovery of critical breakthroughs”, this would back up into my general strategic models and strategic priorities.
the labs are planning to automate AI development; we’re already seeing AIs automating many simple AI research tasks [show them a demo of that];
This one I wouldn’t claim that someone else doesn’t already know, and it’s what I would assume anyway;
given the METR trendline it’s likely that there will be full or near-full automation in 2027 to 2029;
Pretty skeptical of “full” or “near-full”, in that I would still expect things to be largely bottlenecked on human judgement and not making much more progress (say, greater than 2x or something?) compared to currently; in other words, I expect Amdahl’s law to bite. There’s also a big ambiguity and/or assumption about what you’re automating (if you successfully automate a process which doesn’t invent AGI, then you’ve done something, but you haven’t set off an intelligence explosion or similar).
and (while it is hard to forecast with any precision) it is likely that there will be strategically superhuman AI agents weeks to months after that point.
This is where I think no one has any especially strong reason to think so, or at least, no one has told me so far even in private, let alone publicly (and therefore consensus views seem quite mistaken).
This is where I think no one has any especially strong reason to think so, or at least, no one has told me so far even in private, let alone publicly (and therefore consensus views seem quite mistaken).
I mean, in the absence of specific arguments and info, “there’s a double digit probability that this will lead to superintelligence” seems like a correct or reasonable prior? So, I find it hard to engage with the way your sentence is phrased, which feels like it’s trying to make a claim about the burden of proof.
It might be that there’s something fundamentally missing from LLM agents, such that without breakthroughs, they can’t grow into superintelligences. But that’s sure not obvious to me. A lot of problems that people said were fundamental have fallen to scale and RL.
In an attempt to be legible, I think there’s at least a 50% chance (barring big government interventions) that the current technological trajectory will lead to AIs that are strategically superhuman without any innovations that are a bigger deal than the original “Attention Is All You Need” paper. (Maybe you think that’s obviously crazy?)
And regardless, it sure looks like we’re on track to automate a very wide range of “routine” cognitive operations and AI research tasks. My guess is that it will soon become clear that almost everything that almost every human does is made of routine cognitive operations that don’t require deep fluid intelligence. This includes the researchers at the AI companies.
AI research as it is done now seems pretty empirical / “throw spaghetti at the wall”-ish. Progress is mostly not driven by groundbreaking genius ideas, à la Einstein, but rather by listing 20 obvious next things to try and trying them. It seems like LLM agents will be able to automate that.
And even if the LLM agents can’t do some crucial “deep thinking” step that’s necessary for getting to the next paradigm, automating all the routine AI development tasks will free up a lot of genius AI researcher attention, and allow them to build intuitions by operating in the domain a lot faster.
It seems pretty surprising to me if we automate all routine AI development tasks (and for that matter, almost all cognitive work) and then we just hang out there for more than 5 years. It seems like at that point, AI will be radically transforming the world, and ~a whole generation of geniuses + armies of millions of in-many-ways-superhuman LLM agents will be turning their attention to figuring out the missing secret sauce for true AGI.
It might be that there’s something fundamentally missing from LLM agents, such that without breakthroughs, they can’t grow into superintelligences. But that’s sure not obvious to me.
Can you expand on this, and in particular, how it gets you to be quite confident (relative to the prior where any given innovation isn’t confidently going to grow into superintelligence)?
In an attempt to be legible, I think there’s at least a 50% chance (barring big government interventions) that the current technological trajectory will lead to AIs that are strategically superhuman without any innovations that are a bigger deal than the original “Attention Is All You Need” paper. (Maybe you think that’s obviously crazy?)
I don’t think it’s obviously crazy at the outset, meaning, I could have seen it being the case that someone could know something that gets them to 50% on this. I am starting to think that it is obviously crazy, though, because no one can give an argument for this! Like, can you explain why you think this? In other words, I think it should be obviously crazy to someone who basically just updated off of a bunch of other people updating; you should be tracking when you’re doing that. In plenty of cases doing that update could be rational, but then in such cases usually it would also be rational to later update on the (strange) observation that no one can actually explain / argue for / defend the initial large consensus update.
And regardless, it sure looks like we’re on track to automate a very wide range of “routine” cognitive operations and AI research tasks. My guess is that it will soon become clear that almost everything that almost every human does is made of routine cognitive operations that don’t require deep fluid intelligence.
This includes the researchers at the AI companies.
One man’s modus ponens is another man’s modus tollens. The stuff that LLMs will soon automate probably isn’t the stuff that makes AGI. Or, go ahead and argue the opposite, but I will point you to the fact that your arguments had better not imply that current LLMs would easily and frequently be able to create many novel, useful, interesting concepts like humans do (narrator: his arguments would imply that).
And even if the LLM agents can’t do some crucial “deep thinking” step that’s necessary for getting to the next paradigm, automating all the routine AI development tasks will free up a lot of genius AI researcher attention, and allow them to build intuitions by operating in the domain a lot faster.
Again, modus ponens meet modus tollens and Amdahl. According to your argument, the Industrial Revolution (or the invention of computers, or the invention of compilers, or operating systems, or the internet, or Google) should imminently create AGI because it frees up a bunch of human capital. Now, that’s probably true! Just not on any specific timeline.
Can you expand on this, and in particular, how it gets you to be quite confident (relative to the prior where any given innovation isn’t confidently going to grow into superintelligence)?
This question seems to be insisting on a weird burden of proof, to me. I’m going to try to answer it straightforwardly, but I imagine my answers will feel frustrating, or missing the point, or something.
The AI agents we have now meet many to most of the criteria for AGI that many people put forward in previous decades. It’s not crazy to say that Claude 4.6 is an AGI, and insofar as it isn’t, it’s not very clear what’s missing.
Last I checked, GPT-5.something was the 6th best competitive programmer in the world. The AIs are winning gold in math Olympiad competitions. They’re clearly better than me, already, at almost all of technical thinking.
I don’t see a fundamental reason they shouldn’t be able to, e.g., design components of a nuclear reactor, or design a whole nuclear reactor, better than the best human nuclear engineers, except that their time horizon is too short to make much progress. But the AI time horizons have been doubling on a consistent exponential since GPT-3 at least.
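(To make that extrapolation concrete, here is a minimal back-of-envelope sketch; the starting horizon, the ~7-month doubling time (roughly the figure METR has reported), and the ~1-month target are illustrative assumptions, not a fitted model.)

```python
# Back-of-envelope: how long until the 50%-success time horizon reaches
# ~1 working month, if it keeps doubling? All parameters are assumptions.
horizon_hours = 1.0     # assumed current time horizon
doubling_months = 7.0   # assumed doubling time (roughly METR's reported figure)
target_hours = 167.0    # ~one working month of human task time

months = 0.0
while horizon_hours < target_hours:
    horizon_hours *= 2
    months += doubling_months

print(f"~{months / 12:.1f} years to a ~1-month horizon under these assumptions")
```

(Eight doublings, so ~4.7 years under these particular assumptions; the point is only how the conclusion scales with the assumed doubling time.)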
And notably, on the path to getting here, we went from language models that mostly output gibberish, to chatbots that are vastly more knowledgeable than any human, to agents that can solve open professional-level math problems and build software better than most humans and faster than almost every human, by applying very simple and obvious ideas.
“Huh, GPT-2 worked pretty well. What if we do the same thing, with a bigger model, on more data”. This worked.
“What if we have the model train on procedurally generated math and programming problems”. This worked.
“What if we build a simple scaffold that queries the model, to implement an agent.” This didn’t work at first, but a combination of making the models better, making better scaffolding, and training the model in the scaffolding got it to work.
I am not an AI researcher. But I had all of these ideas, years before the AI companies got them working. (In contrast, I would need a lot of technical expertise, and maybe more raw intelligence than I have, to invent, e.g., the attention mechanism.)
The amazing progress in AI over the past 5 years has been driven by taking very obvious ideas, and working out the engineering details to implement them.
I don’t have any strong reason to think that this trend of applying basically simple ideas to LLMs and getting increasingly impressive capabilities will break. There were lots and lots of things LLMs couldn’t do well, that people claimed were fundamental problems, that were solved naturally in the course of scaling.
There are still some gaps left. Notably, their time horizons are too short—but as noted, progress is being made continually on that. Also, LLMs can’t really invent new concepts, at least after training, which seems like a blocker for really doing science. Humans are still vastly more sample-efficient than AIs in training.
Do I know that all the gaps that are left in the LLM agents will be solved by the continual application of basically simple ideas and engineering schlep? No, obviously not. But I also don’t have any strong reason to think that they won’t be.
Eliezer was saying in 2021 that GPTs are only memorizing “shallow patterns” and so don’t embody the deep parts of cognition. It’s unclear if that’s a correct gloss. If it is true, it turns out you can get surprisingly technically competent by only memorizing and applying shallow patterns. It turns out the linguistic traces of human thought have a lot more of the true generators of human thought contained within them than I would have guessed.
And in any case, we’re doing RLVR now, and so the AIs can bootstrap from human concepts to learn from their own trial and error, just like AlphaGo. Maybe you can get all the way to AIs that are superhuman in ~every domain, by just designing a huge number of RL environments, and giving the agents the affordance to design their own RL environments to train on in response to novel circumstances, despite still having weak fluid intelligence.
Plus there are more-or-less obvious ideas for continual learning which seem like they would enable the AIs to develop new concepts.
Given all this, it feels like at least 30%(?) that everything left to do can be automated by the LLM agents of the next two years.
Again, modus ponens meet modus tollens and Amdahl. According to your argument, the Industrial Revolution (or the invention of computers, or the invention of compilers, or operating systems, or the internet, or Google) should imminently create AGI because it frees up a bunch of human capital. Now, that’s probably true! Just not on any specific timeline.
Well, the difference in this case is how close we already are (or seem to be). When the Industrial Revolution happened, or when computers or compilers were invented, there were still many scientific discoveries to be made and engineering challenges to solve between us and AGI.
Now, there are plausibly only engineering challenges, which we can automate with our existing AIs, and if not, probably only one or two scientific discoveries left.
I think it should be obviously crazy to someone who basically just updated off of a bunch of other people updating
This is unimportant, but I find the insinuation that everyone is updating on everyone else’s views kind of annoying. I’m not immune to the hype cycles, but I feel like I’m mostly updating on things that I’m seeing with my own eyes. It’s fine though—I can imagine how gaslighting it might feel if almost everyone around me was asserting or assuming some very important point, and none of them could manage to make an argument for that point.
This question seems to be insisting on a weird burden of proof, to me.
I’m not sure what’s weird about it, but yes, I think someone claiming to predict the future confidently as opposed to the more default background broad uncertainty would have the burden of proof.
The AI agents we have now meet many to most of the criteria for AGI that many people put forward in previous decades.
Do you think it has high fluid intelligence (assuming as best you can, arguendo, that this phrase maps to something meaningful + important)? If yes, why (given that you’d be disagreeing with a lot of other short timelines views)? If no, why talk about AGI that doesn’t include high fluid intelligence?
and insofar as it isn’t, it’s not very clear what’s missing.
I don’t see a fundamental reason they shouldn’t be able to
Do I know that all the gaps that are left in the LLM agents will be solved by the continual application of basically simple ideas and engineering schlep? No, obviously not. But I also don’t have any strong reason to think that they won’t be.
I think it’s true and good to be worried that the AI research community could adaptively and creatively surprise us by inventing AGI seedstuff in 1 year or 5 years. What I’m arguing against, or just trying to understand, is people (such as you) seeming to have very high confidence in this (like having a median of 5 years, i.e. 50% chance). That sure sounds like positive knowledge of us having almost all of fluid intelligence AGI seedstuff. No?
They’re clearly better than me, already, at almost all of technical thinking.
Except for the most important parts, such as orienting to a new domain / new question in a manner that produces successful understanding in the long run.
I don’t have any strong reason to think that this trend of applying basically simple ideas to LLMs and getting increasingly impressive capabilities will break.
Response 1: This type of reasoning would not have worked for all those other previous big breakthroughs (such as the invention of the universal computer, of the operating system, or of Google search).
Response 2: Consider the hypothesis that it went up fast because it used up available data. And, as you can see in the rest of the top-level thread (e.g. from Kokotajlo and Greenblatt), people in fact use this as an excuse (as it were) for LLMs performing badly on some things.
I am not an AI researcher. But I had all of these ideas, years before the AI companies got them working.
The amazing progress in AI over the past 5 years has been driven by taking very obvious ideas, and working out the engineering details to implement them.
Which suggests that they aren’t much of an idea. Which suggests that we don’t understand much about intelligence. I think you’re trying to say “turns out that fairly obvious things are most of intelligence”. And I’m trying to say “actually most of that was just unlocking what was already fairly shallowly available in the massive training corpus, so we did not get much evidence that we understood / can build the generators of that corpus or of fluid / general intelligence in general”.
If it is true, it turns out you can get surprisingly technically competent by only memorizing and applying shallow patterns. It turns out the linguistic traces of human thought have a lot more of the true generators of human thought contained within them than I would have guessed.
Wait can you expand on this? Why do you think true generators of human thought are contained in them / picked up by LLMs?
Now, there are plausibly only engineering challenges, which we can automate with our existing AIs, and if not, probably only one or two scientific discoveries left.
This comment felt like it made a better model of your views click. ISTM you think something like:
All the impressive ML results so far have only worked either in a narrow subspace around the training data (e.g. LLMs, still mostly the case even with RL), or in very small worlds (e.g. pure-RL game-players). There has been ~zero progress on fluid/general intelligence. Therefore, extrapolating straight lines on graphs predicts ~zero progress on fluid/general intelligence by doing more of the same kind of thing. The induction on increasing ‘intelligence’ that lots of other people appeal to only works by inappropriate compression.
It’s still likely that we live in something like the 2011-Yudkowsky world as described in this tweet, with AGI to come from a lot of accumulation of insight. ML successes misleadingly make that world look falsified, if you aren’t tracking what they are and aren’t successes at.
(Implied) The fact that [ML results so far required surprisingly little understanding-of-intelligence] is not significant evidence that [other-things-you-might-expect-to-require-understanding, e.g. fluid intelligence, will require less understanding]. If we’ve learned something about how little understanding-of-intelligence was needed to build things that succeed on some tasks, this still just doesn’t say much about AGI.
(Or maybe you don’t believe that ‘fact’ about ML results so far, idk.)
Intuitively-to-me, there should be a big inductive update on this level, even if induction on ‘intelligence’ doesn’t work.
Like, it’s evidence against the way of thinking that says understanding of intelligence is important. When you say (implicitly) ‘we probably need lots of AGI seedstuff’, I want to say ‘why isn’t the thought process you’re using to say that surprised, and downvoted, by how little stuff we needed to make LLMs?’.
All the impressive ML results so far have only worked either in a narrow subspace around the training data (e.g. LLMs, still mostly the case even with RL), or in very small worlds (e.g. pure-RL game-players). There has been ~zero progress on fluid/general intelligence. Therefore, extrapolating straight lines on graphs predicts ~zero progress on fluid/general intelligence by doing more of the same kind of thing. The induction on increasing ‘intelligence’ that lots of other people appeal to only works by inappropriate compression.
I largely agree with this, yeah. It would need some probability caveats; I put nontrivial probability, like O(1-5%), on various scenarios leading to AGI within 10 years: largely the sorts of things people talk about, and generally “maybe I’m just confused and GPT architecture / training plus RLVR and a bit more whatever basically implements a GI seed” or “maybe I’m totally confused about ‘GI seed’ being much of a thing or being ~necessary for world-ending AI”.
It’s still likely that we live in something like the 2011-Yudkowsky world as described in this tweet, with AGI to come from a lot of accumulation of insight. ML successes misleadingly make that world look falsified, if you aren’t tracking what they are and aren’t successes at.
Yeah, something like that. (I feel I have very little handle on how much insight is left, social dynamics around investment in conceptual “blue” capabilities research, etc.; hence very broad timelines. I also don’t much predict “there aren’t other major, impactful, discontinuous milestones before true world-ending AGI”; GPTs seem to be such a thing.)
Like, it’s evidence against the way of thinking that says understanding of intelligence is important. When you say (implicitly) ‘we probably need lots of AGI seedstuff’, I want to say ‘why isn’t the thought process you’re using to say that surprised, and downvoted, by how little stuff we needed to make LLMs?’.
It should probably be slightly directionally downvoted (though I’m not sure which preregistered hypotheses are doing better). But I think not very much, because I think that we did not observe “surprisingly obvious / easy / black-box idea generates lots of generally-shaped capabilities”. Partly that’s because the capabilities aren’t generally-distributed; e.g., gippities aren’t good at generating interesting novel concepts on par with humans, AFAIK. Partly that’s because there’s a great big screening-off explanation for the somewhat-generally-distributed capabilities that gippities do have: they got it from the data. I think we observed “surprisingly obvious / easy / black-box idea suddenly hoovers up lots of generally-shaped capabilities from the generally-shaped performances in the dataset (which we thus learned are surprisingly low-hanging fruit to distill from the data)”. (I do have the sense that there’s some things here that I’m not being clear about in my thinking, or at least in what I’ve written. One thing that I didn’t touch on, but that’s relevant, is that humans seem to exhibit this GI seedstuff, so it at least exists; whether it’s necessary to have that seedstuff to get various concrete consequences of AI is another question.)
gippities aren’t good at generating interesting novel concepts on par with humans, AFAIK
Sorry, this is a tangent from this comment thread, but an important one, I think:
LLMs aren’t good at generating interesting novel concepts on par with humans in deployment. But in deployment, we’ve turned off the learning, so of course they’re bad at inventing interesting novel concepts. A brilliant human with anterograde amnesia would also be quite bad at inventing interesting novel concepts.
It seems much more unclear if LLMs develop interesting new concepts in training, while they’re still learning.
They probably generate all kinds of interesting intuitive / S1 concepts and fine distinctions that allow them to get so good at the next token prediction task, just as experts in a domain generally learn all kinds of specialized conceptual representations.
(Though, apparently, and unlike human experts, the models don’t thereby learn words for those concepts, or have the ability to introspect and put handles on their conceptual representations, any more than I can introspect into how my visual cortex works.)
More speculatively, an LLM agent might invent new explicit concepts for itself and learn to use them, in RLVR training, especially if different rollouts are allowed to communicate with each other via a shared scratch-pad or something. I don’t think we have seen anything like this, and I’m not particularly expecting it at current capability levels, but I don’t think we can rule it out.
When we say that LLMs don’t generate new concepts, we’re selling them short. The part of the whole LLM system that has something-like-fluid intelligence to come up with new concepts is the training process, which we basically never interact with (currently).
I think I would generally avoid saying that LLMs or current learning programs don’t generate new concepts simpliciter. Plausibly I did, but if so, I’d hopefully be able to claim that it was a typo or elision for space/clarity. What I said here was “good at generating interesting novel concepts on par with humans”. I know perfectly well that LLMs gain concepts (after a fashion) during training and have written about that. I would dispute them using / having concepts in the same relevant ways that humans have them though.
I’m confident that there’s lots of interesting content generally speaking contained in LLMs, gained through training, which is unknown to all humans. (The same could be said of other systems such as AlphaGo, and even old-style Stockfish during runtime if you admit that.)
(Though, apparently, and unlike human experts, the models don’t thereby learn words for those concepts, or have the ability to introspect and put handles on their conceptual representations, any more than I can introspect into how my visual cortex works.)
So like, yeah, they have something kinda related to human concepts in their full power, but not. This fits with my claim that they don’t have much originary general intelligence; they have distilled GI from humans, some more distilled stuff that’s not exactly “knowledge from humans” but is kinda more narrow (like, LLMs know word collocation frequencies like no human does); and some other stuff that’s not very general. I posit.
I’m more trying to operationalize “interesting novel concept.” (But, it does look like we had approximately this conversation before and I’ll try to reread first. I think basically you said “they generate a novel concept that hadn’t been generated before and also people go on to use that concept in industry/science”, does that sound right?)
Part of what brought me here was remembering you saying:
My guess is that all or very nearly all human children have all or nearly all the intelligence juice. We just, like, don’t appreciate how much a child is doing in constructing zer world.
And wanting an example of a thing that’s more like “what’s something that’d make you go ‘okay, this was in fact as smart as a four year old’” (and therefore either the end is nigh, or we’re about to learn that children in fact did not have nearly all the intelligence juice).
I’ll try to think about some bets for ~1 year from now.
I think basically you said “they generate a novel concept that hadn’t been generated before and also people go on to use that concept in industry/science”, does that sound right?
Yeah, basically. I’m trying to be concrete here, and just saying “their intellectual output could be judged like human intellectual output is judged”.
And wanting an example of a thing that’s more like “what’s something that’d make you go ‘okay, this was in fact as smart as a four year old’” (and therefore either the end is nigh, or we’re about to learn that children in fact did not have nearly all the intelligence juice).
It’s a good question but it’s hard because that stuff looks from the outside like mostly pretty easy tasks. The way in which it is not easy is the way in which it is not “a task”. I guess “very sample-efficient learning” would be a concrete thing that 4yos do.
Nativization of a pidgin into a creole language might be an example, especially given that it seems to be largely underwritten by the cognitive plasticity of the linguistic developmental window.
A creole is believed to arise when a pidgin, developed by adults for use as a second language, becomes the native and primary language of their children – a process known as nativization.
Given that Opus 4.6 fails on very basic Classical Greek exercises (evidence towards “jaggedness”/bad “OOD generalization” even on very simple (though knowledge-heavy) tasks), I would be very surprised if it managed to successfully do something as unusual/OOD as creolizing a pidgin. It might also be very difficult to train it to do so, as it’s a very open-ended thing, and thus it’s very unclear how to specify a reward, and I would guess there isn’t much data on the internet that could be used for training.
‘why isn’t the thought process you’re using to say that surprised, and downvoted, by how little stuff we needed to make LLMs?’
After I read this comment, my hasty-guess-of-a-Tsvi-model replies: ‘the big surprise is that “solid performance on a wide range of technical tasks is not that connected to GI.” This surprise sufficiently explains the surprise of ~easily achieving that performance. Any ex ante expectation that those tasks required lots of understanding would/should have been mediated by expecting they required GI. Given that they don’t require GI, it’s [not surprising? / not relevantly surprising?] that they don’t require much understanding.’
[I wrote this response the day you sent the above. But it felt clear that we were missing each other, and I wanted to try to inhabit your view, in an attempt to make more effective progress. But I was too tired to do that well at the time, and so put this away to come back to later. But I have a day job in addition to the other projects that I’m trying to push on, and this fell by the wayside for two weeks.]
I’m not sure what’s weird about it, but yes, I think someone claiming to predict the future confidently as opposed to the more default background broad uncertainty would have the burden of proof.
Well, part of the debate here is what the prior ought to be.
It’s in some sense a confident prediction to assert that Moore’s law will continue, in 1995. But, broadly, the burden of proof is more on the side of the guy who thinks that the trend will break. Or at least, it’s not totally clear what the prior should be and where the burden of proof should lie.
But we should also update that this behavior surprisingly turns out to not require as much general intelligence as we thought.
Yes exactly. I’ve updated that these tasks require less general intelligence than I thought, and as a consequence, I’ve updated that tasks in general require less general intelligence than I thought.
Do you think it has high fluid intelligence (assuming as best you can, arguendo, that this phrase maps to something meaningful + important)? If yes, why (given that you’d be disagreeing with a lot of other short timelines views)? If no, why talk about AGI that doesn’t include high fluid intelligence?
No, I think Claude 4.6 has quite weak fluid intelligence. I previously described that as “the LLMs are actually not very intelligent at all, but it turns out that you can make up for moderately weak intelligence with a lot of knowledge.”
If no, why talk about AGI that doesn’t include high fluid intelligence?
Because high fluid intelligence (at least as we currently conceive of it) 1) is maybe not necessary, and 2) might come from the default trajectory of LLM-AI development.
Like, it seems like you can maybe get a strategically superhuman AI by relying on a lattice of more-or-less specialized superhuman skills (including superhuman engineering, and superhuman persuasion, and superhuman corporate strategy, and so on), without having much fluid intelligence.
To be clear, it also seems possible to me that we will make superhuman AI agents that don’t have this fluid intelligence special sauce. Those AIs will be adequate to automate almost all human labor, because almost all of human labor is more-or-less routine application of crystallized knowledge. We’ll be living in a radical new world of ~full automation, except for a small number of geniuses who are adding critical insight steps to the new cyborg-process of doing science.[1]
But I will be surprised if we hang out in that regime for very long before the combined might of humanity’s geniuses, augmented by armies of superhumanly capable routine engineers and enormous compute infrastructure for massive experiments, hits on a mechanism that replicates the human fluid intelligence special sauce.[2]
Maybe I’m wrong about how hard the problem of developing a mechanism that can do fluid intelligence is, or about whether it’s the kind of thing that can be accelerated by armies of superhuman engineers. But just eyeballing how ~every AI capability since the advent of deep learning came to be, it seems like it involved a lot of tinkering, and running empirical experiments to see what works, and optimizing metrics, and bitter-lesson-style scaling, not, e.g., Einstein-style genius conceptual breakthroughs. To my only somewhat informed eye, it looks like the way AI capabilities are developed is exactly the kind of thing that armies of superhuman engineers doing the routine-cognition part of research should be able to do.
We call it “grad-student descent”, as a way to emphasize how much it resembles a dumb search process. And there will be a lot more AI agents, running a lot faster, than there ever were grad students.
That sure sounds like positive knowledge of us having almost all of fluid intelligence AGI seedstuff. No?
No, I’m putting forward a disjunction:
Fluid intelligence isn’t necessary for Strategically Superhuman AI.
or
LLM based agents will develop fluid intelligence on the default technological trajectory, via the application of not-very-clever ideas.
or
There’s about one or two “breakthrough” ideas missing, that when combined with the existing LLM-agent techniques, will make LLM-agents that can do the fluid intelligence thing (or a substitute for the fluid intelligence thing). Having armies of LLM-agents that can automate engineering and experimentation seems like it should accelerate the discovery of those one or two breakthroughs.
Those last two legs of the disjunction are assuming that there are not many pieces left before fluid intelligence is solved, but not making much of a claim about how many pieces we already have. Like, depending on what one means by “pieces”, maybe we have 0 out of 1 (and we’re likely to get that one in the next five years), or maybe we have 95 out of 100 (and we’re likely to get the last five in the next five years).
They’re clearly better than me, already, at almost all of technical thinking.
Except for the most important parts, such as orienting to a new domain / new question in a manner that produces successful understanding in the long run.
I mean, that’s very true of current LLM-agents after they leave training. It’s also true (though less so, I think) of LLMs in training: they come away with a massive library of concepts that they can’t wield as deftly as a human.
But it’s also true of AlphaZero, in some sense, in that AlphaZero improves much less from each game it plays than a human does. But also, AlphaZero can play enough games, fast enough, to become superhuman at Go in a few hours.
Response 1: This type of reasoning would not have worked for all those other previous big breakthroughs (such as the invention of the universal computer, of the operating system, or of Google search).
Maybe. But this does seem to be what works in Deep Learning, even if not in other CS subfields.
Response 2: Consider the hypothesis that it went up fast because it used up available data.
How does this relate to the fact that AIs are now getting better by training on procedurally generated problems instead of human data?
Are you suggesting that RLVR is only eliciting capabilities that are already in the base model, rather than instilling new capabilities?
Wait can you expand on this? Why do you think true generators of human thought are contained in them / picked up by LLMs?
Because GPT-4 can do more reasoning than I would have naively guessed, under the hypothesis “GPT-3 is only memorizing shallow patterns, not the real, deep patterns of cognition.”
What makes you think we’re close?
That the AI agents are already able to do, or are a few METR doublings from being able to do, almost all of the mental work that humans do, weighted by “time spent doing that work.”
. . .
But overall, it seems like we’re obviously talking past each other, or something. Maybe I can try to articulate your view as I understand it and you can offer corrections?
It sounds like you’re saying something like...
Look, the important and dangerous thing about AGI is that it can do the cognitive operations of science / discovery / inventing and operating in new fields, at a superhuman level. The danger lies with AI that is able to make fundamental discoveries the way a scientist does (and then apply / wield those discoveries). An AI that isn’t really able to make fundamental discoveries is just not that dangerous.
LLMs and LLM-agents can do a lot of seemingly impressive stuff, but they’re really dramatically bad at orienting to new domains or making discoveries like that.
They’re something like an Eliza-bot, in that Eliza-bot could use simple mechanisms to generate conversational outputs that appear like a conversation. Someone talking with Eliza might be astonished, and think with only a little improvement, the next generation would be able to converse as completely as a human can. But that’s an illusion: the simple mechanisms that Eliza is exploiting are basically not adequate to produce anything like a real conversation.
Similarly, the LLM-agents are able to do some portion of technical work that humans do, but they’re basically not doing the interesting parts. And the interesting parts are almost all of the problem. That the LLM agents are able to do some technical work is very little evidence of how much additional conceptual work needs to happen to solve the hard and interesting parts of AGI.
Someone who’s impressed with o3 or Claude Mythos, and thinks that there’s very little left to add before we get to AIs that can automate all or almost all of scientific progress, is making an error analogous to someone who thinks that there’s very little left to add to Eliza to get AGI, because it’s so close to intelligent behavior.
As a side note, this world comes along with all kinds of new dangers that it’s not clear we’re equipped to deal with, which mostly fall under the headings of “misuse” and “concentration of power”. I’m not sure how high the risk is, but if we fumble this, we could totally lose the game.
This would be a very scary world, because if LLM-agents already have all the pieces to be a strategically superhuman agent, except for one, and we’ve built out huge compute infrastructure for running them, we’re in for a very hard takeoff once someone builds the first “real” AGI.
@TsviBT here’s my distilled paraphrase of your view, perhaps mostly in my own conceptual vocabulary. Let me know how close this is.
The process of discovering how to make an AGI with fluid intelligence depends heavily and crucially on strong fluid intelligence. It’s a central example of a research task that requires insight and developing new, deep, technical concepts, not just pattern-matching and reasoning by analogy to similar-seeming past problems.
In recent years the AI industry has made some progress on automating a wide range of (what Eli calls) “routine cognitive operations” or “crystallized intelligence”. We can now make AIs that are good at using pattern-matching to solve problems that are similar to problems that humans have solved a lot. This is very different from the ability to solve genuinely new problems (“fluid intelligence”).
But everyone seems to be eliding this distinction between fluid intelligence and crystallized intelligence!
Some people don’t seem to notice the difference at all, and think that the crystallized intelligence of the AIs is the same kind of thing as the deep fluid intelligence.
Other people at least give lip service to the difference, and then say ~“well, the LLMs, with their great crystallized intelligence, will invent mechanisms for fluid intelligence.”
Insofar as inventing an AI with fluid intelligence is itself a project that is loaded on fluid intelligence, the development of these AIs with strong crystallized intelligence is nearly irrelevant to the question of when AI with fluid intelligence will arrive.
This is a kind of obvious point, but everyone around me seems to be operating on a strategic model that apparently ignores it!
I have some response, but first, is that about right, as an expression of your view?
It’s a central example of a research task that requires insight and developing new, deep, technical concepts, not just pattern-matching and reasoning by analogy to similar-seeming past problems.
(I wouldn’t use these words probably, but it’s a fine gesture in the direction.)
But everyone seems to be eliding this distinction between fluid intelligence and crystallized intelligence!
[Acknowledging that you acknowledged the ontology skew] Meh. I don’t think “fluid” vs. “crystallized” is all that important / clear / useful a distinction. It kinda sorta gets at the things, and other people bring it up so I use it. IDK if other people elide that distinction. I’d talk more about general intelligence, though that packs in additional stuff; I’m using something spiritually like Yudkowsky’s definition; something like “cross-domain optimization power divided by inputs”. Also gestured here: https://www.lesswrong.com/posts/sTDfraZab47KiRMmT/views-on-when-agi-comes-and-on-strategy-to-reduce#AGI
Some people don’t seem to notice the difference at all, and think that the crystallized intelligence of the AIs is the same kind of thing as the deep fluid intelligence.
IDK if this is true.
This is a kind of obvious point, but everyone around me seems to be operating on a strategic model that apparently ignores it!
No, I don’t think there’s one specific point I think everyone’s missing, or everyone with short timelines is missing. If there were, I suppose it would be more like “notice that you’re deferring way more than you’re noticing, and that others are too, and that this is creating very bad epistemic tidal forces” or something, but that’s kinda hard to discuss productively. I keep repeating that I don’t understand how you / others got to be so confident in short timelines in part so that it’s clear that IDK what you’re missing even on my perspective. I try to ask people to lay out their reasoning more in order to bring out background assumptions that I can disagree with better, which background assumptions support the [past x years of LLM observations] --> [confident short timelines] update. But that doesn’t work that well / is frustrating.
For example, I think that besides the “crystallized int could make fluid int” thing, you’re also saying “we have fluid int or are close to it probably”. I’m asking about that and wanting to argue against that (that is, argue against the reasoning I’ve heard so far as not supporting the stated confidence). IDK how to phrase that as a positive statement for a position summary, because I would normally view it as weird / the wrong order of operations to start negating arguments (whatever your specific arguments are) without hearing them first, but I suppose I could preliminarily say “I’m guessing that you made a wrong update from observations to timelines based on a misconstrual of what matters about intelligence and what is the causal structure between cognitive faculties and cognitive performances”, as described here
https://www.lesswrong.com/posts/FG54euEAesRkSZuJN/ryan_greenblatt-s-shortform?commentId=QBca6vhdeKkjyNLKa
and here
https://www.lesswrong.com/posts/FG54euEAesRkSZuJN/ryan_greenblatt-s-shortform?commentId=DDaz5zcETcuyuy5Xx . But again, that feels awkward to guess about given that I don’t understand your reasons for the update.
you’re also saying “we have fluid int or are close to it probably”.
I think I’m saying “crystallized intelligence can, to a large extent, substitute for fluid intelligence”. This is true, to an extent, of humans, but it’s much more true of AIs, because they can have so much more crystallized intelligence than any human could hope to attain.
This is relevant to modeling if LLM agents will transform the world and to modeling if LLM agents will rapidly give way to something that’s much more capable.
In particular, I (unconfidently) dispute that developing an AI with fluid intelligence is a research project that is itself heavily and crucially loaded on fluid intelligence.[1]
On my view, it is pretty likely that huge amounts of superhuman crystallized intelligence can find fluid-intelligence-emulating mechanisms (possibly with a necessary ingredient of a relatively small amount of genius fluid intelligence that even huge amounts of crystallized intelligence can’t substitute for, or possibly even without that input).
In that sense, we’re “close to” solving fluid intelligence, even if there are decades of subjective research and iteration time between here and there.
I do additionally suspect that mechanisms to implement fluid intelligence are just not that hard to invent and/or scale to, starting from the AI tech of 2026. Like, it seems somewhat likely to me that various dumb ideas would just totally work, or that doing the same stuff we’ve already been doing, but more so, will totally work. However, I’m much less confident about this point, and perhaps you can teach me some things that would quickly cause me to change my mind.
I want to check if this comment is clarifying, or if it feels like me repeating things that I’ve already said.
Is there a way you could restate this in terms of one or more propositions that are fairly load-bearing for your confident short timelines? Is it this?
it is pretty likely that huge amounts of superhuman crystallized intelligence can find fluid-intelligence-emulating mechanisms
I’m struggling to get clear on the logical structure of your beliefs. Cf. your comments
Like, it seems like you can maybe get a strategically superhuman AI by relying on a lattice of more-or-less specialized superhuman skills (including superhuman engineering, and superhuman persuasion, and superhuman corporate strategy, and so on), without having much fluid intelligence.
To be clear, it also seems possible to me that we will make superhuman AI agents that don’t have this fluid intelligence special sauce. Those AIs will be adequate to automate almost all human labor, because almost all of human labor is more-or-less routine application of crystallized knowledge. We’ll be living in a radical new world of ~full automation, except for a small number of geniuses who are adding critical insight steps to the new cyborg-process of doing science.[1]
But I will be surprised if we hang out in that regime for very long before the combined might of humanity’s geniuses, augmented by armies of superhumanly capable routine engineers and enormous compute infrastructure for massive experiments, hits on a mechanism that replicates the human fluid intelligence special sauce.[2]
and
I’m putting forward a disjunction:
Fluid intelligence isn’t necessary for Strategically Superhuman AI.
or
LLM based agents will develop fluid intelligence on the default technological trajectory, via the application of not-very-clever ideas.
or
There’s about one or two “breakthrough” ideas missing, that when combined with the existing LLM-agent techniques, will make LLM-agents that can do the fluid intelligence thing (or a substitute for the fluid intelligence thing). Having armies of LLM-agents that can automate engineering and experimentation seems like it should accelerate the discovery of those one or two breakthroughs.
Those last two legs of the disjunction are assuming that there are not many pieces left before fluid intelligence is solved, but not making much of a claim about how many pieces we already have. Like, depending on what one means by “pieces”, maybe we have 0 out of 1 (and we’re likely to get that one in the next five years), or maybe we have 95 out of 100 (and we’re likely to get the last five in the next five years).
and
Like, if I thought that we were conceptually / technically far from AIs that can automate the process of scientific discovery, I would much less expect a FOOM in the next 10 years (though we would still have an emergency, because automating science isn’t a necessary capability for destabilizing or ending the world via any of a number of different pathways).
And
I think I’m saying “crystallized intelligence can, to a large extent, substitute for fluid intelligence”.
Can you please clarify what the main big thingies are that you expect to happen fairly confidently within 10 years? Can you clarify what drives your confident beliefs in those thingies happening? E.g. are you confident in some X happening due to being confident that we have fluid intelligence already or are close to it, or is that not the case?
Maybe you could make up words for the main scenarios and main capabilities / faculties you’re thinking of?
To mods (@Raemon): this would be an example of a place where a “have an LLM summarize the exchanges in this thread between these two users, with links / quotes” button could be helpful.
(Also, it seems that the option to temporarily switch to LW editor was disabled, so I cannot tag people?)
I’ve been reading this thread and feeling motivated to at least make it easier to export the conversation to clipboard so you can more easily paste into LLM (in my case so I can argue with the AI about what obvious objections you’d have to what I’d say next).
I’m a bit confused about what’s going on with the markdown editor. I’ve asked other mods.
I don’t think this result plausibly holds up very well on certain important classes of problems, or it might hold but only for absurdly large k. In particular, I think GPT-5.1 is based on a base model that’s pretty similar in quality to GPT-4o (possibly somewhat better, possibly literally the same base model, I forget), while it surely requires k >> 1000 for the GPT-4o base model to solve the hardest math problems that GPT-5.1 solves with >30k reasoning tokens.
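(For concreteness about what a claim like “k >> 1000” means: below is a minimal sketch of the standard unbiased pass@k estimator from OpenAI’s Codex paper. The numbers in the usage line are illustrative, not figures from this thread.)

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples,
    drawn from n attempts of which c were correct, solves the problem."""
    if n - c < k:
        return 1.0  # every size-k draw must include a correct attempt
    return 1.0 - comb(n - c, k) / comb(n, k)

# E.g. a problem the base model solves 1 time in 10,000 attempts:
print(pass_at_k(10_000, 1, 1_000))  # ~0.1 even at k = 1000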
But I will be surprised if we hang out in that regime for very long before the combined might of humanity’s geniuses, augmented by their armies of superhumanly capable routine engineers and enormous computer infrastructure for massive experiments, hits on a mechanism that replicates the human fluid intelligence special sauce.[2]
Would you be able to coarsely quantify this? Like, let’s say OpenThinky has 500 genius human engineers, has 100x as much compute as today, has impunity, and has lots of CodeBrain5 coding agent things. Or whatever parameters you want to set. Is your median time to actually FOOM/takeover more like 1 year, 4 years, or 10 years?
(To be a bit more transparent: I had not previously considered “a hundred thousand technical PhDs all try really hard to crack AGI”; I haven’t heard that argument/scenario before and I don’t know what to think about that and it would substantially shorten my timelines. I don’t yet understand why that seems likely / what scenario about that seems likely to you.)
Background: almost nothing that most humans do actually requires fluid intelligence. Most people, most of the time, are executing routine cognitive operations. And most of the people who are using their fluid intelligence on the job could do just as well or better if they had a massive memory of case studies to extrapolate from instead of attempting novel reasoning.
Most of earth’s geniuses currently spend most of their time doing routine cognitive operations—pattern matching from their prior experience to solve problems, often in the context of automatable tasks like implementing experiments or solving engineering problems. When those classes of task are automated, it will free up the capacity of the geniuses.
At this point, most of all the work in the world will be automated or in the process of being automated. Science and tech development will be going faster than ever in human history. It will be obvious to the whole world that AI is a really big deal.
Also, it will be obvious to many people that there’s something missing: the AIs are doing more and better design and engineering, faster, than human civilization ever did, and they’re accelerating the science, but they’re not doing the science. There will be enormous financial and strategic incentives to crack that.
Ok, thanks. I think I’ll probably have to chew on this scenario to say much of use. (I mean, I’ve thought about related things, but haven’t asked myself about this scenario.) My initial reaction is skepticism which I think comes from a combo of
LLMs are somewhat less useful than you seem to think
Humans apply somewhat more GI than you seem to think
The most important stuff would still be bottlenecked on human GI and would be hard to accelerate; you don’t just simply “free up” the humans in a super liquid, fungible way
If this were happening, some pretty strong political forces would be at play, including hopefully / kinda probably (??) a strong push to stop the spiral
But I’m not super confident about any of that. It’s strategically relevant but ATM I don’t have much novel perspective to offer, and it seems to need some other expertise (e.g. a good understanding of politics, of science and tech research, and similar).
Ok, I see. Now, regarding your disjunction earlier of (in my words)
A. (NAA (nonAGI AI) takeover) You can get strategic takeover AI without AGI
B. (AGI soon easy) Gippity+ will soon be AGI by adding a bit more ~mundane human research juice
C. (AGI soon hard) Gippity+++ will soon be AGI by adding a couple big insights
First, to clarify, I think the discourse on this thread is that I asked you about
“why talk about nonAGI AI”
and you said
“because 2) maybe we already have AGI basically (scenario B, AGI soon easy), and because 1) you could get transformative AI with current nonAGI AI (scenario A, NAA takeover)”,
and now we are discussing what NAA looks like and how timelines look in NAA takeover world.
Now I’m wondering, what are your very approximate relative probabilities of these things? E.g. is one of them 90% of the source of your confidence (I mean, 90% of your prob mass) in FOOM within 10 years? If they are roughly equal, I would raise my eyebrow and say “that seems kinda strange, unless there’s a shared factor such as you thinking that actually we basically have ~AGI in current systems; if so, could you clarify that shared factor”.
As stated, these don’t have to sum to 1. B and C are mutually exclusive but A can be true even if B or C are also true.
(I also object a bit to calling “strong fluid intelligence” “AGI.”
Part of what’s at stake is how far can you get with basically just specialized knowledge and the ability to train new specialized knowledge. It would be surprising to me, but not out of the question, that there’s almost nothing that such an AI can’t do that an AI with more fluid intelligence can do. But I only object a bit.)
Ass numbers:
A: 80%
B: 40%
C: 30%
If they are roughly equal, I would raise my eyebrow and say “that seems kinda strange, unless there’s a shared factor such as you thinking that actually we basically have ~AGI in current systems; if so, could you clarify that shared factor”.
I mean that’s kind of fair. But I in fact don’t have a lot of precise ability to distinguish between “one key idea is missing” and “only engineering schlep is missing”. Those worlds look very similar, to me, and so get similar amounts of mass.
I suppose the question is whether this sort of thing does FOOM / takeover. Are you saying you can make up for weak intelligence with knowledge (gleaned from human text) well enough to do that?
I suppose the question is whether this sort of thing does FOOM / takeover. Are you saying you can make up for weak intelligence with knowledge (gleaned from human text) well enough to do that?
More like you can make a weak intelligence with lots of specialized knowledge and skills, mostly gleaned from RL (though starting from the superhuman breadth of baseline knowledge that GPT-4 had), that can outcompete humans in acquiring power and/or FOOM.
Yes exactly. I’ve updated that these tasks require less general intelligence than I thought, and as a consequence, I’ve updated that tasks in general require less general intelligence than I thought.
Um ok, but, do you get that I’m also saying that you should update that there’s some type of task which you are observing to have a different relative need for general intelligence compared to other tasks?
I think you’re saying “that coding fell to methods like these is both evidence that methods like these are more powerful than we might have guessed, and also evidence that coding required less general intelligence than we might have guessed.”
I’m right now trying to inhabit this point, and try to really grok it.
I guess it could be the case that the kind of intelligence that you need to engineer software and the kind that you need to develop novel algorithms are almost completely disjoint and unrelated. You can basically solve “make an AI that can make software”, and not have even scratched the surface of “make an AI that can make new algorithms / new interesting math concepts”.
(Is this what you think?)
It would surprise me if this were true, because it seems like there’s a lot of overlap in the mental operations between those two kinds of work.
This is a good question, thank you. (It’s an important topic which I won’t fully treat here.)
I guess it could be the case that the kind of intelligence that you need to engineer software and the kind that you need to develop novel algorithms are almost completely disjoint and unrelated
One very-cartoon model, which I guess you know but to lay it out:
Let’s say we have GI (general intelligence, like a human) and PI (performance intelligence). GI causes PI. Other things can cause PI. For example, board game AIs have PI through something spiritually equivalent to brute force search.
Humans are born with HGI. During their lifetime, humans gain HPI, which is fairly meager in some sense. That proceeds by a combo of HGI → HPI as well as copying a bunch of HPI from other humans. All humans apply quite a lot of HGI as kids. Humans who then reach the frontier of some field (which includes for example being a floor manager at a retail store) will largely apply their HPI, but somewhat apply their HGI.
Gippities are trained using fragments of GI but probably not full GI. It’s unclear how much is missing, and it’s hard to get good evidence about that. Gippities have a lot of PI, way more than any one human. They get that almost entirely by copying from HPI represented in training data. They get some additional PI from various sources (RLVR, and just the pretraining itself (e.g. gippities know a bunch of things about the distribution of human text that no human knows), and from online reasoning (though of course there’s a memory problem, but that’s inessential)). That additional PI has a different and less-general distribution from HPI, because HPI comes from HGI, which is GI. Thus gippities have a confusing summation of general PI copied from HPI, plus other PI.
In humans, if you have HPI, then you have HGI. But it’s possible to get generally-distributed PI without having GI by copying HPI, and that’s what gippities do. It’s also possible to have originary PI that is not generated by GI at all, which gippities also have. Thus, confusingly, gippities have lots of PI, some of it originary and some of it generally-distributed, but plausibly / probably not caused by artificial GI.
On a psychologizing note, which I hope to offer just as a hypothesis-piece to maybe track if you weren’t already (I think you’re pretty likely to be already aware of this, but from my perspective there’s a significant chance that you don’t think of it frequently enough): There’s a strong default to overly interpret things (behaviors, say) with a presumption of a human-shaped background mental context. E.g. how people ask “does the LLM believe X”, even though that question probably doesn’t straightforwardly translate from humans to LLMs at all and would lead to incorrect inferences about what behaviors would be concomitant. Cf. https://www.lesswrong.com/posts/L2h9nAtPqEFK6atSJ/an-anthropomorphic-ai-dilemma#Gemini_modeling_with_alien_contexts_is_hard
In particular, when people imagine changes to gippity-based systems, such as “unhobbling” by adding tool access, they imagine that what gets opened up for the gippity is similar to what would be opened up for a human if you newly made that same change (e.g. gave a human that tool). I think this drives some “we’re close to having AGI” intuitions, and I think it’s mistaken.
Modulo that I think more of the capabilities are coming from the RLVR, rather than from copying humans, than you seem to think.[1]
They get that almost entirely by copying from HPI represented in training data. They get some additional PI from various sources (RLVR, and just the pretraining itself (e.g. gippities know a bunch of things about the distribution of human text that no human knows), and from online reasoning (though of course there’s a memory problem, but that’s inessential)).
Why are you emphasizing the pretraining instead of the RL?
It’s in some sense a confident prediction to assert that Moore’s law will continue, in 1995. But, broadly, the burden of proof is more on the side of the guy who thinks that the trend will break. Or at least, it’s not totally clear what the prior should be and where the burden of proof should lie.
Well, in this example, there is an end to Moore’s law specifically, and you can even approximately call it based on physics (atomic scale for transistors), and the guy who believes in LGU (line go up) narrowly for Moore’s law specifically would be being silly.
Let me refine my statement about burden of proof (recapitulating something from DMs). I think that 10 or 20 years ago, if you or I had had opinions about when AGI comes, we would have correctly had very broad/uncertain timelines. Is that true? Assuming that’s true: Right now, I have significantly shorter timelines than I shoulda/woulda had 10 years ago; I would have (if I had had worked out timelines) said it seemed quite unlikely, like <2% or something, to get AGI within 5 or 10 years. Now I say more like 5% or something like that.
I take it that you have quite a lot more probability on AGI within 10 years, though I just now realized I don’t actually know your specific beliefs. Could you link to writing you’ve done about your timelines beliefs? For example, would I be correct in assuming that your median is <10 years? Assuming that’s the case:
You must have had quite a large update. Your beliefs went from a broad spread out thing, to a quite sharp (relatively speaking) distribution. That update would look like something along the lines of:
By [arguments XYZ], the great majority* of prob-weighted hypotheses that say “AGI >10 years away” have been eliminated by the recent observations.
For example, if previously you had 80% probability on “AGI in >15 years”, and then 5 years later you have 50% probability on “AGI in <10 years”, then you must have lost at least 60% prob-weighted hypotheses from your original distribution.
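(To spell out one way to get that arithmetic—a minimal sketch, assuming the update works purely by eliminating long-timelines hypotheses and renormalizing the rest: with prior mass $p \le 0.2$ on the soon bucket, eliminating mass $m$ from the long bucket gives

$$P(\text{AGI in} < 10\text{y}) = \frac{p}{1-m} = 0.5 \quad\Longrightarrow\quad m = 1 - 2p \ge 0.6,$$

i.e. at least 60% of the original prob-weighted hypotheses must have been eliminated.)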
I’m trying to understand that update. When I ask people about that update, they give various statements, but somehow I come away having no idea how they made their update to get to a confident (sharp) distribution. I mean, I can write out sentences like “gippity++ automates AI R&D”, and I mean, qualitatively I agree that this is plausible and is extremely alarming (and should be stopped ASAP etc. etc.); but I have no idea how people got so confident in that (relative to the earlier broad distribution).
I feel like I understand the question you’re asking:
~”If you previously had a very spread-out prediction, and now you have a relatively more narrow prediction, between then and now, you must have made a pretty large Bayesian update—you saw some evidence with a quite lopsided odds ratio.
If you made such an update, you should be able to point to the evidence, and explain why you think the odds ratio is so lopsided. Please do that!”
(Is that about right?)
But, I don’t get why the evidence / arguments that people are offering aren’t clarifying for you.
Like, I keep trying to point to the same basic IMO pretty straightforward considerations, and you keep saying things like “somehow I come away having no idea how they made their update to get to a confident (sharp) distribution”. [1]
I’m not sure what kind of thing you’re asking for that’s different from the kinds of things that I’m already saying. Do you want more of a quantitative model? Do we just need to get further into the argument tree?
I further get that, from your perspective, you’re saying something like “dude, give me actual evidence and arguments”, and I’m somehow being a dunce about that. But I don’t get what exactly you’re asking for.
Which to be clear, is socially and epistemically valid, on your part. Please continue to loudly say “I don’t get why everyone thinks this”, for as long as that’s true. I want to do the opposite of shaming you for not getting it.
I don’t believe I can pass your ITT, but I will try to draw a sample from my model of you (which is almost entirely a much more blobulous model of generally-short-timelines people, so sorry for the resulting lumpification), in the form of a dialogue. [After writing it, this isn’t yet that much of a great attempt at understanding you, sorry; maybe it’s still a helpful summary of what I think the discourse state is.]
Shorty: Previously there were a whole bunch of tasks that we would have considered the sole domain of real thinking / minds / agents, such as advanced math and science, complex fast functional coding, language understanding, etc. Now we’ve observed systems that, using fairly simple principles, learn to do all that and more. There are various weaknesses around the edges, especially for very hard tasks, some hard to specify tasks and values-based tasks, and various things that seem to be fixable with a bit more engineering (e.g. agent scaffolding, tool access, other unhobbling). It’s true that for very advanced things, e.g. expert questions in advanced fields, LLMs are not very useful or accurate (or maybe useful mainly in narrow ways such as narrow search, verification, engineering legwork, or shallow brainstorming). But most humans would be even more useless in such contexts. It’s hard to point to anything of much relevance that LLMs can’t do, that many or most humans can do. There’s some good reason to not be utterly confident that we’ve gotten the lion’s share of what’s relevant about intelligence, but the overall update to a pretty high confidence in that is justified by the very surprising observations I just gestured at.
Sumwat Longery: It turns out, surprisingly to ~everyone, that solid performance on a wide range of technical tasks is not that connected to GI. Even you agree that gippity performances don’t exhibit much GI and are mainly the result of distilling performances present in the training data. This would seem to demand not a sharp update to “we have GI” but rather a search for better understanding of the distinction between GI and technical performance.
Thread 1 (not very Eli?):
Shorty: Well what if there is no such thing as deep general intelligence stuff?
Sumwat Longery: [head barely not exploding] Um ok but have you noticed that LLMs have many OOMs less sample efficiency than humans, as one example?
Shorty: Well we’ll just throw more compute at it.
Sumwat Longery: Ok, but so, in saying that, you’re agreeing that there is such a thing as general intelligence (which grants sample efficiency, for example), and we haven’t gotten it, right?
Shorty: For some reason I’m going to pretend you didn’t say that.
Thread 2 (more Eli?):
Shorty: Well actually, even if we didn’t get GI yet, people will use LLM-based systems to do ~automated AI R&D, greatly accelerating AI research, probably in a compounding way.
Sumwat Longery: Ok, but then, the research you’re automating is research that hasn’t produced the insights needed for AGI. (Also I’m skeptical that you’re even automating the important parts of that research; in which case Amdahl gets you. But it makes sense to expect some nontrivial speedup from that, like 1.5x or 2x or something. This point is somewhat overlapping with the previous point, or in other words, it offers another explanation of the previous point: The most important bottlenecks on AGI capabilities research are hard, probably won’t be automated soon, and haven’t had all that much progress made on them.)
Shorty: [I’m not sure what to put here because I’m bad at listening sometimes haha, but if it’s Eli then:] These AI R&D efforts may be greatly further accelerated by the broader economic productivity of current systems, which would bring in resources, talent, and sociopolitical power.
Sumwat Longery: Oh ok. So your short AGI timelines are largely coming not from a belief that we’re conceptually / technically very close to having solved AGI, but rather from a belief that we’ve crossed a threshold of compounding accretion of human efforts towards solving AGI?
Even you agree that gippity performances don’t exhibit much GI and are mainly the result of distilling performances present in the training data.
Um no. At least if “training data” is meant to refer to the text corpuses used in pre-training. I think the problem-solving capabilities are mostly coming from the RLVR.
Sumwat Longery: Oh ok. So your short AGI timelines are largely coming not from a belief that we’re conceptually / technically very close to having solved AGI, but rather from a belief that we’ve crossed a threshold of compounding accretion of human efforts towards solving AGI?
I would not endorse this.
Like, if I thought that we were conceptually / technically far from AIs that can automate the process of scientific discovery, I would much less expect a FOOM in the next 10 years (though we would still have an emergency, because automating science isn’t a necessary capability for destabilizing or ending the world via any of a number of different pathways).
This would seem to demand not a sharp update to “we have GI” but rather a search for better understanding of the distinction between GI and technical performance.
I would be excited about attempts at clarification here! (Modulo that they seem potentially very infohazardous.)
if I thought that we were conceptually / technically far from AIs that can automate the process of scientific discovery, I would much less expect a FOOM in the next 10 ye
Ok great. Can you clarify why you think this? Previously you wrote, in response to “What makes you think we’re close?”:
That the AI agents are already able to do, or are a few METR doublings from being able to do, almost all of the mental work that humans do, weighted by “time spent doing that work.”
Can you clarify / expand? What makes you think the METR results imply we’re close to having algorithmic ideas sufficient to automate scientific discovery?
the current technological trajectory will lead to AIs that are strategically superhuman without any innovations that are a bigger deal than the original “Attention Is All You Need” paper.
One thing which is awkward about this operationalization is that it’s not clear how big an innovation “attention is all you need” is. Like, we already had multi-head attention before that paper, and ablating out the other components from the transformer isn’t that galaxy-brained a thing to try (though of course the authors executed well on it).
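(For reference, the paper’s core operation is small enough to state in one line—scaled dot-product attention:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^\top}{\sqrt{d_k}}\right)V$$

with the multi-head version just running this $h$ times over learned projections and concatenating. Most of the rest of the architecture—residual connections, layernorm, MLPs—predates the paper.)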
I think other formats are better for me than writing comments back and forth.
Curious what formats. (E.g. I’m happy to have a call to be published on YouTube; we could do a private discussion, though that’s not my preference.)
Just for the public record:
I feel like I’ve personally engaged with your arguments a decent amount
I don’t agree with this (e.g. I doubt either Greenblatt or I could give a high-quality ITT of the other’s view, and I expect that one or both of us would update substantively if we understood each other).
Is it your experience that if someone comes to understand your view, they update significantly towards it, even if they had an elaborate-ish short timeline view beforehand?
Yes, but this is an extremely weak signal, in that it’s a small number (n=2 to 5 or something, depending who you count) and it’s pretty selected for “people who talk to me a lot or people who actively tell me that I updated them somewhat”, and I have a high bar for understanding (for both parties symmetrically). Abram might be someone who somewhat understands my view but only slightly updated towards it?
So what makes you think anyone has a method for creating computer programs with “human level fluid intelligence”?
I wouldn’t exactly say that we have a method, more like a field wide methodology that I expect will probably yield fully automated AI R&D within maybe 6 years and will probably yield top-human-expert level fluid intelligence within maybe 9 years. (As in, my median to fully automated AI R&D is 6 years while my median to top-human-expert level fluid intelligence is somewhat longer due to the possibility of a significant lag between these milestones dragging my median out by some amount. Note that the median to B is in general not the same as median A + median from A to B and this applies in this case.)
(Technically, I actually think that we’ll probably achieve human level fluid intelligence within 8 years in some way where a large majority of this probability comes from something like the methodology of the current field, including the part of the methodology that involves getting weaker AIs to automate AI research in all kinds of ways but most focused on this methodology. I think some fraction of the mass comes from very different methodologies, but this is a smaller fraction. The probability of very different methodologies yielding powerful AI in the next ~10 years is boosted significantly due to going through a particularly salient compute region—somewhat above human lifetime compute—and higher attention on AI in general.)
As far as why I think this methodology will probably yield “human level fluid intelligence”, this is mostly downstream of a mix of qualitative and quantitative extrapolations combined with thinking that armies of automated AI researchers matching top human AI researchers and doing everything they do but better would probably find a way.
I agree with you that something like the crystalized/fluid distinction is relevant here, and that current LLMs seem to have more of the former. But I’m also confused about where the fluidity ever comes from on this model. Like, I buy that armies of automated researchers which are better at doing everything than top human researchers could probably find a way to figure out how to build “human level fluid intelligence,” but I am confused about how you get to that step in the first place. Why are they better than human researchers at everything when they are still mostly using crystallized intelligence?
(I get that me asking followup questions could be frustrating because you may not want to talk about this; I would just ask that if you’d like to bow out, if you could, give one or two sentences on why, coarsely—e.g. “the whole topic doesn’t seem valuable” or “I don’t want to discuss this publicly” or “you Tsvi won’t have a productive convo about this” etc.; I ask because I have a general feeling that there’s an unidentified blob of consensus around short timelines, which IMO is irrational and overconfident (meaning, it reached incorrect conclusions due to systematically substantively flawed reasoning), and I’m not aware of a good way to get recourse (i.e. to ask “take me to your leader”), so I’d at least appreciate hearing epistemic metadata in order to orient to that situation.)
What’s the relative weight of these considerations, grossly speaking? Is it more like 90% extrapolations and 10% hand-off or more like the reverse? For the 90% one, if there is one, what are one or two of the most compelling points? For example if it’s extrapolations, what are you extrapolating (and why do you think that extrapolates to high fluid intelligence)? If it’s the hand-off, why do you think that the current-human-research that would be handed off is a kind of research that would soon produce high fluid intelligence (given the apparent circularity of this specific line of reasoning)?
I’m going to reply kinda briefly and then I’m going to bow out with some chance I comment more later.
Hmm, these are tied up a bit. Like I’m doing the extrapolations to get to automating AI R&D to various extents and then trying to account for acceleration due to automation. But I’m also extrapolating out to human fluid intelligence somewhat, and if this extrapolation looked extremely far out, that would influence my view.
To give a rough (and potentially not-that-stable-on-reflection) answer:
(1) 50% extrapolations towards automation of current AI R&D (both towards full automation and large speed-ups from very strong partial automation)
(2) 25% armies of AIs that can automate AI R&D will be able to find a way to get to human level fluid intelligence
(3) 25% extrapolations towards human level fluid intelligence
This is something like an importance attribution. E.g., if I updated toward a way lower qualitative extrapolation on (3), that would update me quite a bit, but probably not shift my 25th percentile by more than a factor of like 2-3.
For (1), I’m doing a mix of something like time horizon, some informal sense of how good AIs are at research tasks I give them and what I see from others, and some longer list of considerations. And part of this is factoring in earlier speed ups moving us along the trajectory faster.
For the hand-off (what I would call (2) in my breakdown), I can see why the argument seems circular, but I don’t think it’s actually circular. My view is like “probably tons of AI R&D via stuff sorta like the current stuff we’re doing would yield this (due to a mix of my observations from ML, my extrapolations of current progress, and thinking of specific things you could do that I won’t mention here without more consideration)” and “probably you’ll be able to get AIs to do this without very high fluid intelligence, though I do think AIs at this point will match somewhat-worse human engineers/scientists at moderate-time-horizon fluid intelligence (e.g. over the course of a few months) on the relevant tasks”.
(My views are also somewhat based on deferring to other people and the forecasting track record of other people.)
Why I’m not engaging much here:
I feel like I’ve personally engaged with your arguments a decent amount and it seems unlikely that further argumentation would change my perspective very much.
I’d guess that there aren’t interested third parties for whom I care a bunch about their views for whom my discussion with you would update them at all.
I have limited capacity for writing stuff and there is other stuff I want to write about that seems more important in the short term.
I think other formats are better for me than writing comments back and forth.
I’m moderately excited to argue with people about timelines, especially via formats other than writing comments back and forth, but most strongly if (1) I think it might update my views (2) third parties I care about care about the other person’s views or perspective.
I’m not claiming that you do or should care a bunch about my views, but I am extremely interested in the questions “will LLM minds scale to superintelligence without any radical breakthroughs?” and “if not, can armies of very capable LLM-agents discover those breakthroughs?”.
If I had a stronger understanding here, it would probably influence what predictions I highlight to policymakers.
(eg, I’m currently telling them that “the labs are planning to automate AI development; we’re already seeing AIs automating many simple AI research tasks [show them a demo of that]; given the METR trendline it’s likely that there will be full or near-full automation in 2027 to 2029; and (while it is hard to forecast with any precision) it is likely that there will be strategically superhuman AI agents weeks to months after that point.”
If I came to think that LLM-agents automating AI research is much less likely to lead rapidly to superintelligence than I currently think is plausible, I would probably not want to focus so much on the above talking points in initial 30-minute meetings, in which we have to be selective about which points to convey.)
Also, generally, if I had more mass on “continuing to scale LLM-agents will not lead to superintelligence, either directly or by enabling the discovery of critical breakthroughs”, this would back up into my general strategic models and strategic priorities.
Just for the record:
This one I wouldn’t dispute; I can’t claim to know something here that someone else doesn’t, and it’s what I would assume anyway;
Pretty skeptical of “full” or “near-full”, in that I would still expect things to be largely bottlenecked on human judgement and not making much (say, greater than 2x or something?) progress compared to currently; in other words, I expect Amdahl. There’s also a big ambiguity and/or assumption about what you’re automating (if you successfully automate a process which doesn’t invent AGI, then you’ve done something but you haven’t set off an intelligence explosion or similar).
This is where I think no one has any especially strong reason to think so, or at least, no one has told me so far even in private, let alone publicly (and therefore consensus views seem quite mistaken).
I mean, in the absence of specific arguments and info, “there’s a double digit probability that this will lead to superintelligence” seems like a correct or reasonable prior? So, I find it hard to engage with the way your sentence is phrased, which feels like it’s trying to make a claim about the burden of proof.
It might be that there’s something fundamentally missing from LLM agents, such that without breakthroughs, they can’t grow into superintelligences. But that’s sure not obvious to me. A lot of problems that people said were fundamental have fallen to scale and RL.
In an attempt to be legible, I think there’s at least a 50% chance (barring big government interventions) that the current technological trajectory will lead to AIs that are strategically superhuman without any innovations that are a bigger deal than the original “Attention Is All You Need” paper. (Maybe you think that’s obviously crazy?)
And regardless, it sure looks like we’re on track to automate a very wide range of “routine” cognitive operations and AI research tasks. My guess is that it will soon become clear that almost everything that almost every human does is made of routine cognitive operations that don’t require deep fluid intelligence. This includes the researchers at the AI companies.
AI research as it is done now seems pretty empirical / “throw spaghetti at the wall”-ish. Progress is mostly not driven by groundbreaking genius ideas, à la Einstein, but rather by listing 20 obvious next things to try and trying them. It seems like LLM agents will be able to automate that.
And even if the LLM agents can’t do some crucial “deep thinking” step that’s necessary for getting to the next paradigm, automating all the routine AI development tasks will free up a lot of genius AI researcher attention, and allow them to build intuitions by operating in the domain a lot faster.
It seems pretty surprising to me if we automate all routine AI development tasks (and for that matter, almost all cognitive work) and then we just hang out there for more than 5 years. It seems like at that point, AI will be radically transforming the world, and ~a whole generation of geniuses + armies of millions of in-many-ways-superhuman LLM agents will be turning their attention to figuring out the missing secret sauce for true AGI.
Do you think I’m missing something here?
Can you expand on this, and in particular, how it gets you to be quite confident (relative to the prior where any given innovation isn’t confidently going to grow into superintelligence)?
I don’t think it’s obviously crazy at the outset, meaning, I could have seen it being the case that someone could know something that gets them to 50% on this. I am starting to think that it is obviously crazy, though, because no one can give an argument for this! Like, can you explain why you think this? In other words, I think it should be obviously crazy to someone who basically just updated off of a bunch of other people updating; you should be tracking when you’re doing that. In plenty of cases doing that update could be rational, but then in such cases usually it would also be rational to later update on the (strange) observation that no one can actually explain / argue for / defend the initial large consensus update.
https://www.lesswrong.com/posts/FG54euEAesRkSZuJN/ryan_greenblatt-s-shortform?commentId=upZ8KPxneNxnJHd8D
One man’s modus ponens is another man’s modus tollens. The stuff that LLMs will soon automate probably isn’t the stuff that makes AGI. Or, go ahead and argue the opposite, but I will point you to the fact that your arguments better not imply that current LLMs would easily and frequently be able to create many novel useful interesting concepts like humans do (narrator: his arguments would imply that).
Again, modus ponens meet modus tollens and Amdahl. According to your argument, the Industrial Revolution (or the invention of computers, or the invention of compilers, or operating systems, or the internet, or Google) should imminently create AGI because it frees up a bunch of human capital. Now, that’s probably true! Just not on any specific timeline.
This question seems to be insisting on a weird burden of proof, to me. I’m going to try to answer it straightforwardly, but I imagine my answers will feel frustrating, or missing the point, or something.
The AI agents we have now meet many to most of the criteria for AGI that many people put forward in previous decades. It’s not crazy to say that Claude 4.6 is an AGI, and insofar as it isn’t, it’s not very clear what’s missing.
Last I checked, GPT-5.something was the 6th best competitive programmer in the world. The AIs are winning gold in math Olympiad competitions. They’re clearly better than me, already, at almost all technical thinking.
I don’t see a fundamental reason they shouldn’t be able to eg design components of a nuclear reactor, or design a whole nuclear reactor, better than the best human nuclear engineers, except that their time horizon is too short to make much progress. But the AI time horizons have been doubling on a consistent exponential since GPT-3 at least.
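(To make the trend-extrapolation concrete—a toy projection, not Eli’s actual model; the starting horizon, start date, and ~7-month doubling time below are illustrative stand-ins for the METR figures:)

```python
import math
from datetime import date, timedelta

def horizon_reaches(h0_min: float, start: date,
                    doubling_days: float, target_min: float) -> date:
    """Date when the task time horizon reaches target_min, assuming
    clean exponential growth (the load-bearing assumption)."""
    doublings = math.log2(target_min / h0_min)
    return start + timedelta(days=doublings * doubling_days)

# ~2-hour horizon in early 2025, doubling every ~7 months (~213 days),
# asking when horizons hit a working month (~160 hours):
print(horizon_reaches(120, date(2025, 3, 1), 213, 160 * 60))  # late 2028
```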
And notably, on the path to getting here, we went from language models that mostly output gibberish, to chatbots that are vastly more knowledgeable than any human, to agents that can solve open professional-level math problems and build software better than most humans and faster than almost every human, by applying very simple and obvious ideas.
“Huh, GPT-2 worked pretty well. What if we do the same thing, with a bigger model, on more data”. This worked.
“What if we have the model train on procedurally generated math and programming problems”. This worked.
“What if we build a simple scaffold that queries the model, to implement an agent.” This didn’t work at first, but a combination of making the models better, making better scaffolding, and training the model in the scaffolding got it to work.
I am not an AI researcher. But I had all of these ideas, years before the AI companies got them working. (In contrast, I would need a lot of technical expertise, and maybe more raw intelligence than I have, to invent eg the attention mechanism.)
The amazing progress in AI over the past 5 years has been driven by taking very obvious ideas, and working out the engineering details to implement them.
I don’t have any strong reason to think that this trend of applying basically simple ideas to LLMs and getting increasingly impressive capabilities will break. There were lots and lots of things LLMs couldn’t do well, that people claimed were fundamental problems, that were solved naturally in the course of scaling.
There are still some gaps left. Notably, their time horizons are too short—but as noted, progress is being made continually on that. Also, LLMs can’t really invent new concepts, at least after training, which seems like a blocker for really doing science. Humans are still vastly more sample-efficient than AIs in training.
Do I know that all the gaps that are left in the LLM agents will be solved by the continual application of basically simple ideas and engineering schlep? No, obviously not. But I also don’t have any strong reason to think that they won’t be.
Eliezer was saying in 2021 that GPTs are only memorizing “shallow patterns” and so don’t embody the deep parts of cognition. It’s unclear if that’s a correct gloss. If it is true, it turns out you can get surprisingly technically competent by only memorizing and applying shallow patterns. It turns out the linguistic traces of human thought have a lot more of the true generators of human thought contained within them, than I would have guessed.
And in any case, we’re doing RLVR now, and so the AIs can bootstrap from human concepts to learn from their own trial and error, just like AlphaGo. Maybe you can get all the way to AIs that are superhuman in ~every domain, by just designing a huge number of RL environments, and giving the agents the affordance to design their own RL environments to train on in response to novel circumstances, despite still having weak fluid intelligence.
Plus there are more-or-less obvious ideas for continual learning which seem like they would enable the AIs to develop new concepts.
Given all this, it feels like at least 30%(?) that everything left to do can be automated by the LLM agents of the next two years.
Well the difference in this case is how close we already are (or seem to be). When the industrial revolution happened, or when computers or compilers were invented, there were still many scientific discoveries to be made and engineering challenges to solve, between us and AGI.
Now, there are plausibly only engineering challenges, which we can automate with our existing AIs, and if not, probably only one or two scientific discoveries left.
This is unimportant, but I find the insinuation that everyone is updating on everyone else’s views kind of annoying. I’m not immune to the hype cycles, but I feel like I’m mostly updating on things that I’m seeing with my own eyes. It’s fine though—I can imagine how gaslighting it might feel if almost everyone around me was asserting or assuming some very important point, and none of them could manage to make an argument for that point.
I’m not sure what’s weird about it, but yes, I think someone claiming to predict the future confidently as opposed to the more default background broad uncertainty would have the burden of proof.
Yes, but one has to update one’s beliefs about everything (or more feasibly, about the most relevant things), not just about one thing. There is a missing update here: those people had bad models, and in fact those tasks are, shockingly, not only achievable through general intelligence. See https://www.lesswrong.com/posts/sTDfraZab47KiRMmT/views-on-when-agi-comes-and-on-strategy-to-reduce#Things_that_might_actually_work:~:text=There is a-,missing update,-. We see impressive
Do you think it has high fluid intelligence (assuming as best you can, arguendo, that this phrase maps to something meaningful + important)? If yes, why (given that you’d be disagreeing with a lot of other short timelines views)? If no, why talk about AGI that doesn’t include high fluid intelligence?
Just because I, a non-engineer, cannot explain to you in detail why this pile of steel beams will not stand up as a bridge, does not mean it is a bridge. See https://www.lesswrong.com/posts/sTDfraZab47KiRMmT/views-on-when-agi-comes-and-on-strategy-to-reduce#The__no_blockers__intuition
I think it’s true and good to be worried that the AI research community could adaptively, creatively surprise us with inventing AGI seedstuff in 1 year or 5 years. What I’m arguing against, or just trying to understand, is people (such as you) seeming to have very high confidence in this (like having a median of 5 years, i.e. 50% chance). That sure sounds like positive knowledge of us having almost all of fluid intelligence AGI seedstuff. No?
Except for the most important parts, such as orienting to a new domain / new question in a manner that produces successful understanding in the long run.
Response 1: This type of reasoning does not work for all those other previous big breakthroughs (such as the invention of the universal computer, of the operating system, or of Google search).
Response 2: Consider the hypothesis that it went up fast because it used up available data. And, as you can see in the rest of the top-level thread (e.g. from Kokotajlo and Greenblatt), in fact people use this as an excuse (as it were) for LLMs performing badly on some things. See https://www.lesswrong.com/posts/sTDfraZab47KiRMmT/views-on-when-agi-comes-and-on-strategy-to-reduce#Things_that_might_actually_work:~:text=There is a-,missing update,-. We see impressive
Which suggests that they aren’t much of an idea. Which suggests that we don’t understand much about intelligence. I think you’re trying to say “turns out that fairly obvious things are most of intelligence”. And I’m trying to say “actually most of that was just unlocking what was already fairly shallowly available in the massive training corpus, so we did not get much evidence that we understood / can build the generators of that corpus or of fluid / general intelligence in general”.
Wait can you expand on this? Why do you think true generators of human thought are contained in them / picked up by LLMs?
What makes you think we’re close?
This comment felt like it made a better model of your views click. ISTM you think something like:
All the impressive ML results so far have only worked either in a narrow subspace around the training data (e.g. LLMs, still mostly the case even with RL), or in very small worlds (e.g. pure-RL game-players). There has been ~zero progress on fluid/general intelligence. Therefore, extrapolating straight lines on graphs predicts ~zero progress on fluid/general intelligence by doing more of the same kind of thing. The induction on increasing ‘intelligence’ that lots of other people appeal to only works by inappropriate compression.
It’s still likely that we live in something like the 2011-Yudkowsky world as described in this tweet, with AGI to come from a lot of accumulation of insight. ML successes misleadingly make that world look falsified, if you aren’t tracking what they are and aren’t successes at.
(Implied) The fact that [ML results so far required surprisingly little understanding-of-intelligence] is not significant evidence that [other-things-you-might-expect-to-require-understanding, e.g. fluid intelligence, will require less understanding]. If we’ve learned something about how little understanding-of-intelligence was needed to build things that succeed on some tasks, this still just doesn’t say much about AGI.
(Or maybe you don’t believe that ‘fact’ about ML results so far, idk.)
Intuitively-to-me, there should be a big inductive update on this level, even if induction on ‘intelligence’ doesn’t work.
Like, it’s evidence against the way of thinking that says understanding of intelligence is important. When you say (implicitly) ‘we probably need lots of AGI seedstuff’, I want to say ‘why isn’t the thought process you’re using to say that surprised, and downvoted, by how little stuff we needed to make LLMs?’.
I largely agree with this, yeah. It would need some probability caveats; I put nontrivial probability, like O(1-5%), on various scenarios leading to AGI within 10 years—largely the sorts of things people talk about, and generally “maybe I’m just confused and GPT architecture / training plus RLVR and a bit more whatever basically implements a GI seed” or “maybe I’m totally confused about ‘GI seed’ being much of a thing or being ~necessary for world-ending AI”.
I also wouldn’t have quite so tight a categorization of sources of capabilities. Cf. https://www.lesswrong.com/posts/FG54euEAesRkSZuJN/ryan_greenblatt-s-shortform?commentId=QBca6vhdeKkjyNLKa
Yeah, something like that. (I feel I have very little handle on how much insight is left, social dynamics around investment in conceptual “blue” capabilities research, etc.; hence very broad timelines. I also don’t much predict “there aren’t other major, impactful, discontinuous milestones before true world-ending AGI”; GPTs seem to be such a thing.)
It should probably be slightly directionally downvoted (though I’m not sure which preregistered hypotheses are doing better). But I think not very much, because I think that we did not observe “surprisingly obvious / easy / black-box idea generates lots of generally-shaped capabilities”. Partly that’s because the capabilities aren’t generally-distributed; e.g., gippities aren’t good at generating interesting novel concepts on par with humans, AFAIK. Partly that’s because there’s a great big screening-off explanation for the somewhat-generally-distributed capabilities that gippities do have: they got it from the data. I think we observed “surprisingly obvious / easy / black-box idea suddenly hoovers up lots of generally-shaped capabilities from the generally-shaped performances in the dataset (which we thus learned are surprisingly low-hanging fruit to distill from the data)”. (I do have the sense that there’s some things here that I’m not being clear about in my thinking, or at least in what I’ve written. One thing that I didn’t touch on, but that’s relevant, is that humans seem to exhibit this GI seedstuff, so it at least exists; whether it’s necessary to have that seedstuff to get various concrete consequences of AI is another question.)
Sorry, this is a tangent from this comment thread, but an important one, I think:
LLMs aren’t good at generating interesting novel concepts on par with humans in deployment. But in deployment, we’ve turned off the learning, so of course they’re bad at inventing interesting novel concepts. A brilliant human with anterograde amnesia would also be quite bad at inventing interesting novel concepts.
It seems much more unclear if LLMs develop interesting new concepts in training, while they’re still learning.
They probably generate all kinds of interesting intuitive / S1 concepts and fine distinctions that allow them to get so good at the next token prediction task, just as experts in a domain generally learn all kinds of specialized conceptual representations.
(Though, apparently, and unlike human experts, the models don’t thereby learn words for those concepts, or have the ability to introspect and put handles on their conceptual representations, any more than I can introspect into how my visual cortex works.)
More speculatively, an LLM agent might invent new explicit concepts for itself and learn to use them, in RLVR training, especially if different rollouts are allowed to communicate with each other via a shared scratch-pad or something. I don’t think we have seen anything like this, and I’m not particularly expecting it at current capability levels, but I don’t think we can rule it out.
When we say that LLMs don’t generate new concepts, we’re selling them short. The part of the whole LLM system that has something-like-fluid intelligence to come up with new concepts is the training process, which we basically never interact with (currently).
I think I would generally avoid saying that LLMs or current learning programs don’t generate new concepts simpliciter. Plausibly I did, but if so, I’d hopefully be able to claim that it was a typo or elision for space/clarity. What I said here was “good at generating interesting novel concepts on par with humans”. I know perfectly well that LLMs gain concepts (after a fashion) during training and have written about that. I would dispute them using / having concepts in the same relevant ways that humans have them though.
I’m confident that there’s lots of interesting content generally speaking contained in LLMs, gained through training, which is unknown to all humans. (The same could be said of other systems such as AlphaGo, and even old-style Stockfish during runtime if you admit that.)
So like, yeah, they have something kinda related to human concepts, but not in their full power. This fits with my claim that they don’t have much originary general intelligence; they have distilled GI from humans; some more distilled stuff that’s not exactly “knowledge from humans” but is kinda more narrow (like, LLMs know word collocation frequencies like no human does); and some other stuff that’s not very general. I posit.
Thanks! The distinction between “generating capabilities” and “hoovering up capabilities” is another small click for me.
Can you give an example of a thing that you’d be surprised if an AI did in the next, say, 1.5 years?
Kill everyone? I’d be pretty surprised, like 1 in 100 or 200 surprised or something like that.
Generating interesting novel concepts on par with humans? See https://www.lesswrong.com/posts/sTDfraZab47KiRMmT/views-on-when-agi-comes-and-on-strategy-to-reduce?commentId=dqbLkADbJQJi6bFtN
See also https://www.lesswrong.com/posts/sTDfraZab47KiRMmT/views-on-when-agi-comes-and-on-strategy-to-reduce?commentId=HSqkp2JZEmesubDHD#HSqkp2JZEmesubDHD
Now, would you list something impressive that you do expect an AI to do in the next 1.5 years (that I might not say)?
I’m more trying to operationalize “interesting novel concept.” (But, it does look like we had approximately this conversation before and I’ll try to reread first. I think basically you said “they generate a novel concept that hadn’t been generated before and also people go on to use that concept in industry/science”, does that sound right?)
Part of what brought me here was remembering you saying:
And wanting an example of a thing that’s more like “what’s something that’d make you go ‘okay, this was in fact as smart as a four year old’” (and therefore either the end is nigh, or we’re about to learn that children in fact did not have nearly all the intelligence juice).
I’ll try to think about some bets for ~1 year from now.
Yeah, basically. I’m trying to be concrete here, and just saying “their intellectual output could be judged like human intellectual output is judged”.
It’s a good question but it’s hard because that stuff looks from the outside like mostly pretty easy tasks. The way in which it is not easy is the way in which it is not “a task”. I guess, “very sample efficient learning” would be a concrete thing that 4yos do.
Nativization of a pidgin into a creole language might be an example, especially given that it seems to be largely underwritten by the cognitive plasticity of the linguistic developmental window.
Given that Opus 4.6 fails on very basic Classical Greek exercises (evidence towards “jaggedness”/bad “OOD generalization” even on very simple (though knowledge-heavy) tasks), I would be very surprised if it managed to successfully do something as unusual/OOD as creolizing a pidgin. It might also be very difficult to train it to do so, as it’s a very open-ended thing, and thus it’s very unclear how to specify a reward, and I would guess there isn’t much data on the internet that could be used for training.
After I read this comment, my hasty-guess-of-a-Tsvi-model replies: ‘the big surprise is that “solid performance on a wide range of technical tasks is not that connected to GI.” This surprise sufficiently explains the surprise of ~easily achieving that performance. Any ex ante expectation that those tasks required lots of understanding would/should have been mediated by expecting they required GI. Given that they don’t require GI, it’s [not surprising? / not relevantly surprising?] that they don’t require much understanding.’
[Sorry for the long delay.
I wrote this response the day you sent the above. But it felt clear that we were missing each other, and I wanted to try to inhabit your view, in an attempt to make more effective progress. But I was too tired to do that well at the time, and so put this away to come back to later. But I have a day job in addition to the other projects that I’m trying to push on, and this fell by the wayside for two weeks.]
Well part of the debate here is what the prior ought to be.
It’s in some sense a confident prediction to assert that Moore’s law will continue, in 1995. But, broadly, the burden of proof is more on the side of the guy who thinks that the trend will break. Or at least, it’s not totally clear what the prior should be and where the burden of proof should lie.
From your linked post:
Yes exactly. I’ve updated that these tasks require less general intelligence than I thought, and as a consequence, I’ve updated that tasks in general require less general intelligence than I thought.
No, I think Claude 4.6 has quite weak fluid intelligence. I previously described that as “the LLMs are actually not very intelligent at all, but it turns out that you can make up for moderately weak intelligence with a lot of knowledge.”
Because high fluid intelligence (at least as we currently conceive of it) 1) is maybe not necessary, and 2) might come from the default trajectory of LLM-AI development.
Like, it seems like you can maybe get a strategically superhuman AI by relying on a lattice of more-or-less specialized superhuman skills (including superhuman engineering, and superhuman persuasion, and superhuman corporate strategy, and so on), without having much fluid intelligence.
To be clear, it also seems possible to me that we will make superhuman AI agents that don’t have this fluid intelligence special sauce. Those AIs will be adequate to automate almost all human labor, because almost all of human labor is more-or-less routine application of crystalized knowledge. We’ll be living in a radical new world of ~full automation, except for a small number of geniuses who are adding critical insight steps to the new cyborg-process of doing science.[1]
But I will be surprised if we hang out in that regime for very long before the combined might of humanity’s geniuses, augmented by their armies of superhumanly capable routine engineers and enormous computer infrastructure for massive experiments, hits on a mechanism that replicates the human fluid intelligence special sauce.[2]
Maybe I’m wrong about how hard the problem of developing a mechanism that can do fluid intelligence is, or about whether it’s the kind of thing that can be accelerated by armies of superhuman engineers. But just eyeballing how ~every AI capability since the advent of deep learning came to be, it seems like it involved a lot of tinkering, running empirical experiments to see what works, optimizing metrics, and bitter-lesson-style scaling, not, e.g., Einstein-style genius conceptual breakthroughs. To my only somewhat informed eye, it looks like the way AI capabilities are developed is exactly the kind of thing that armies of superhuman engineers doing the routine-cognition part of research should be able to do.
We call it “grad-student descent”, as a way to emphasize how much it resembles a dumb search process. And there will be a lot more AI agents, running a lot faster, than there ever were grad students.
No, I’m putting forward a disjunction:
Fluid intelligence isn’t necessary for Strategically Superhuman AI.
or
LLM based agents will develop fluid intelligence on the default technological trajectory, via the application of not-very-clever ideas.
or
There’s about one or two “breakthrough” ideas missing, that when combined with the existing LLM-agent techniques, will make LLM-agents that can do the fluid intelligence thing (or a substitute for the fluid intelligence thing). Having armies of LLM-agents that can automate engineering and experimentation seems like it should accelerate the discovery of those one or two breakthroughs.
Those last two legs of the disjunction are assuming that there are not many pieces left before fluid intelligence is solved, but not making much of a claim about how many pieces we already have. Like, depending on what one means by “pieces”, maybe we have 0 out of 1 (and we’re likely to get that one in the next five years), or maybe we have 95 out of 100 (and we’re likely to get the last five in the next five years).
I mean, that’s very true of current LLM-agents after they leave training. It’s also true (though less so, I think) of LLMs in training—they come away with a massive library of concepts that they can’t wield as deftly as a human.
But it’s also true of AlphaZero, in some sense, in that AlphaZero improves much less from each game it plays than a human does. But AlphaZero can also play enough games, fast enough, to become superhuman at Go in a few hours.
Maybe. But this does seem to be what works in Deep Learning, even if not in other CS subfields.
How does this relate to the fact that AIs are now getting better by training on procedurally generated problems instead of human data?
Are you suggesting that RLVR is only eliciting capabilities that are already in the base model, rather than instilling new capabilities?
Because GPT-4 can do more reasoning than I would have naively guessed, under the hypothesis “GPT-3 is only memorizing shallow patterns, not the real, deep patterns of cognition.”
That the AI agents are already able to do, or are a few METR doublings from being able to do, almost all of the mental work that humans do, weighted by “time spent doing that work.”
. . .
But overall, it seems like we’re obviously talking past each other, or something. Maybe I can try to articulate your view as I understand it and you can offer corrections?
It sounds like you’re saying something like...
Look, the important and dangerous thing about AGI is that it can do the cognitive operations of science / discovery / inventing and operating in new fields, at a superhuman level. The danger lies with AI that is able to make fundamental discoveries the way a scientist does (and then apply / wield those discoveries). An AI that isn’t really able to make fundamental discoveries is just not that dangerous.
LLMs and LLM-agents can do a lot of seemingly impressive stuff, but they’re really dramatically bad at orienting to new domains or making discoveries like that.
They’re something like an Eliza-bot, in that Eliza-bot could use simple mechanisms to generate conversational outputs that appear like a conversation. Someone talking with Eliza might be astonished, and think that with only a little improvement, the next generation would be able to converse as completely as a human can. But that’s an illusion: the simple mechanisms that Eliza is exploiting are basically not adequate to produce anything like a real conversation.
Similarly, the LLM-agents are able to do some portion of technical work that humans do, but they’re basically not doing the interesting parts. And the interesting parts are almost all of the problem. That the LLM agents are able to do some technical work is very little evidence of how much additional conceptual work needs to happen to solve the hard and interesting parts of AGI.
Someone who’s impressed with o3 or Claude Mythos, and thinks that there’s very little left to add before we get to AIs that can automate all or almost all of scientific progress is making an error analogous to someone who thinks that there’s very little left to add to Eliza to get AGI, because it’s so close to intelligent behavior.
As a side note, this world comes along with all kinds of new dangers that it’s not clear that we’re equipped to deal with, that mostly fall under the headings of “misuse” and “concentration of power”. I’m not sure how high the risk is, but if we fumble this, we could totally lose the game.
This would be a very scary world, because if LLM-agents already have all the pieces to be a strategically superhuman agent, except for one, and we’ve built out huge compute infrastructure for running them, we’re in for a very hard takeoff once someone builds the first “real” AGI.
@TsviBT here’s my distilled paraphrase of your view, perhaps mostly in my own conceptual vocabulary. Let me know how close this is.
I have some response, but first, is that about right, as an expression of your view?
Thanks.
(I wouldn’t use these words probably, but it’s a fine gesture in the direction.)
[Acknowledging that you acknowledged the ontology skew] Meh. I don’t think “fluid” vs. “crystallized” is all that important / clear / useful a distinction. It kinda sorta gets at the things, and other people bring it up so I use it. IDK if other people elide that distinction. I’d talk more about general intelligence, though that packs in additional stuff; I’m using something spiritually like Yudkowsky’s definition; something like “cross-domain optimization power divided by inputs”. Also gestured here: https://www.lesswrong.com/posts/sTDfraZab47KiRMmT/views-on-when-agi-comes-and-on-strategy-to-reduce#AGI
IDK if this is true.
No, I don’t think there’s one specific point I think everyone’s missing, or everyone with short timelines is missing. If there were, I suppose it would be more like “notice that you’re deferring way more than you’re noticing, and that others are too, and that this is creating very bad epistemic tidal forces” or something, but that’s kinda hard to discuss productively. I keep repeating that I don’t understand how you / others got to be so confident in short timelines in part so that it’s clear that IDK what you’re missing even on my perspective. I try to ask people to lay out their reasoning more in order to bring out background assumptions that I can disagree with better, which background assumptions support the [past x years of LLM observations] --> [confident short timelines] update. But that doesn’t work that well / is frustrating.
For example, I think that besides the “crystallized int could make fluid int” thing, you’re also saying “we have fluid int or are close to it, probably”. I’m asking about that and wanting to argue against it (that is, argue that the reasoning I’ve heard so far doesn’t support the stated confidence). IDK how to phrase that as a positive statement for a position summary, because I would normally view it as weird / the wrong order of operations to start negating arguments (whatever your specific arguments are) without hearing them first, but I suppose I could preliminarily say “I’m guessing that you made a wrong update from observations to timelines, based on a misconstrual of what matters about intelligence and what the causal structure is between cognitive faculties and cognitive performances”, as described here https://www.lesswrong.com/posts/FG54euEAesRkSZuJN/ryan_greenblatt-s-shortform?commentId=QBca6vhdeKkjyNLKa and here https://www.lesswrong.com/posts/FG54euEAesRkSZuJN/ryan_greenblatt-s-shortform?commentId=DDaz5zcETcuyuy5Xx . But again, that feels awkward to guess about given that I don’t understand your reasons for the update.
I think I’m saying “crystalized intelligence can, to a large extent, substitute for fluid intelligence”. This is true, to an extent, of humans, but it’s much more true of AIs, because they can have so much more crystalized intelligence than any human could hope to attain.
This is relevant to modeling whether LLM agents will transform the world, and to modeling whether LLM agents will rapidly give way to something that’s much more capable.
In particular, I (unconfidently) dispute that developing an AI with fluid intelligence is a research project that is itself heavily and crucially loaded on fluid intelligence.[1]
On my view, it is pretty likely that huge amounts of superhuman crystalized intelligence can find fluid-intelligence-emulating mechanisms (possibly with a necessary ingredient of a relatively small amount of genius fluid intelligence that even huge amounts of crystalized intelligence can’t substitute for, or possibly even without that input).
In that sense, we’re “close to” solving fluid intelligence, even if there’s decades of subjective research and iteration time between here and there.
I do additionally suspect that mechanisms to implement fluid intelligence are just not that hard to invent and/or scale to, starting from the AI tech of 2026. Like, it seems somewhat likely to me that various dumb ideas would just totally work, or that doing the same stuff we’ve already been doing, but more so, will totally work. However, I’m much less confident about this point, and perhaps you can teach me some things that would quickly cause me to change my mind.
I want to check if this comment is clarifying, or if it feels like me repeating things that I’ve already said.
Though you flagged that those aren’t the words that you would use, so
Is there a way you could restate this in terms of one or more propositions that are fairly load-bearing for your confident short timelines? Is it this?
I’m struggling to get clear on the logical structure of your beliefs. Cf. your comments
and
and
and
Can you please clarify: what are the main big thingies that you expect to happen fairly confidently within 10 years? Can you clarify what drives your confident beliefs in those thingies happening? E.g., are you confident in some X happening due to being confident that we have fluid intelligence already or are close, or is that not the case?
Maybe you could make up words for the main scenarios and main capabilities / faculties you’re thinking of?
To mods (@Raemon): this would be an example of a place where a “have an LLM summarize the exchanges in this thread between these two users, with links / quotes” button could be helpful.
(Also, it seems that the option to temporarily switch to LW editor was disabled, so I cannot tag people?)
I’ve been reading this thread and feeling motivated to at least make it easier to export the conversation to clipboard so you can more easily paste it into an LLM (in my case so I can argue with the AI about what obvious objections you’d have to what I’d say next).
I’m a bit confused about what’s going on with the markdown editor. I’ve asked other mods.
(actually I asked Gemini 3.1 Pro Preview for a summary of a copy-pasted version and it wasn’t good)
As evidenced by base models outperforming RLVR’d models on pass@k for large k.
I don’t think this result holds up very well on certain important classes of problems, or it might hold but only for absurdly large k. In particular, I think GPT-5.1 is based on a base model that’s pretty similar in quality to GPT-4o’s (possibly somewhat better, possibly literally the same base model, I forget), while it surely requires k >> 1000 for the GPT-4o base model to solve the hardest math problems that GPT-5.1 solves using >30k reasoning tokens.
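For concreteness: pass@k here is the probability that at least one of k sampled solutions is correct, standardly computed with the unbiased estimator from the Codex paper (Chen et al., 2021). A minimal sketch (the function name and the example numbers are mine, just for illustration):

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased estimator of pass@k (Chen et al., 2021): given n sampled
    solutions of which c are correct, estimate the probability that at
    least one of k randomly chosen samples is correct."""
    if n - c < k:
        # Every size-k subset must contain at least one correct sample.
        return 1.0
    # 1 - C(n-c, k) / C(n, k), computed as a numerically stable product.
    return 1.0 - float(np.prod(1.0 - k / np.arange(n - c + 1, n + 1)))

# Even a model that only solves a problem 2% of the time (4/200 samples)
# looks strong at large k:
print(pass_at_k(n=200, c=4, k=100))  # ~0.94
```

This is why base-model-vs-RLVR comparisons can flip at large k: a base model that only rarely samples a correct solution can still dominate pass@k once k is big enough.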
Would you be able to coarsely quantify this? Like, let’s say OpenThinky has 500 genius human engineers, 100x as much compute as today, impunity, and lots of CodeBrain5 coding-agent things. Or whatever parameters you want to set. Is your median time to actually FOOM/takeover more like 1 year, 4 years, or 10 years?
What do you mean by genius human engineers?
Part of my model is that like 25% of the math and CS phds on earth, and especially the ones that win Nobel prizes, will be working on this problem.
I don’t know. I think my median is FOOM in 2 years? This is an ass-number though. I don’t feel super confident.
I’m like 90% probability that it happens within 10 years, and 95% probability that it happens within 35 years?
(To be a bit more transparent: I had not previously considered “a hundred thousand technical PhDs all try really hard to crack AGI”; I haven’t heard that argument/scenario before and I don’t know what to think about that and it would substantially shorten my timelines. I don’t yet understand why that seems likely / what scenario about that seems likely to you.)
Why I think that might happen:
Background: almost nothing that most humans do actually requires fluid intelligence. Most people, most of the time, are executing routine cognitive operations. And most of the people who are using their fluid intelligence on the job could do just as well or better if they had a massive memory of case studies to extrapolate from instead of attempting novel reasoning.
Most of earth’s geniuses currently spend most of their time doing routine cognitive operations—pattern matching from their prior experience to solve problems, often in the context of automatable tasks like implementing experiments or solving engineering problems. When those classes of task are automated, it will free up the capacity of the geniuses.
At this point, most of all the work in the world will be automated or in the process of being automated. Science and tech development will be going faster than ever in human history. It will be obvious to the whole world that AI is a really big deal.
Also, it will be obvious to many people that there’s something missing: the AIs are doing more and better design and engineering, faster, than human civilization ever did, and they’re accelerating the science, but they’re not doing the science. There will be enormous financial and strategic incentives to crack that.
Ok, thanks. I think I’ll probably have to chew on this scenario to say much of use. (I mean, I’ve thought about related things, but haven’t asked myself about this scenario.) My initial reaction is skepticism which I think comes from a combo of
LLMs are somewhat less useful than you seem to think
Humans apply somewhat more GI than you seem to think
The most important stuff would still be bottlenecked on human GI and would be hard to accelerate; you don’t just simply “free up” the humans in a super liquid, fungible way
If this were happening, some pretty strong political forces would be at play, including hopefully / kinda probably (??) a strong push to stop the spiral
But I’m not super confident about any of that. It’s strategically relevant but ATM I don’t have much novel perspective to offer, and it seems to need some other expertise (e.g. a good understanding of politics, of science and tech research, and similar).
Ok, I see. Now, regarding your disjunction earlier of (in my words)
A. (NAA (nonAGI AI) takeover) You can get strategic takeover AI without AGI
B. (AGI soon easy) Gippity+ will soon be AGI by adding a bit more ~mundane human research juice
C. (AGI soon hard) Gippity+++ will soon be AGI by adding a couple big insights
First, to clarify, I think the discourse on this thread is that I asked you about
and you said
and now we are discussing what NAA looks like and how timelines look in NAA takeover world.
Now I’m wondering, what are your very approximate relative probabilities of these things? E.g. is one of them 90% of the source of your confidence (I mean, 90% of your prob mass) in FOOM within 10 years? If they are roughly equal, I would raise my eyebrow and say “that seems kinda strange, unless there’s a shared factor such as you thinking that actually we basically have ~AGI in current systems; if so, could you clarify that shared factor”.
As stated, these don’t have to sum to 1. B and C are mutually exclusive but A can be true even if B or C are also true.
(I also object a bit to calling “strong fluid intelligence” “AGI.”
Part of what’s at stake is how far can you get with basically just specialized knowledge and the ability to train new specialized knowledge. It would be surprising to me, but not out of the question, that there’s almost nothing that such an AI can’t do that an AI with more fluid intelligence can do. But I only object a bit.)
Ass numbers:
A: 80%
B: 40%
C: 30%
I mean, that’s kind of fair. But I in fact don’t have a lot of precise ability to distinguish between “one key idea is missing” and “only engineering schlep is missing”. Those worlds look very similar to me, and so get similar amounts of mass.
We’re on the same page that you can make up for it with a lot of knowledge for some swath of performances. For other performances (games, RLVR, https://tsvibt.github.io/theory/pages/bl_24_07_25_09_52_56_652909.html ), you can make up for it with brute compute.
I suppose the question is whether this sort of thing does FOOM / takeover. Are you saying you can make up for weak intelligence with knowledge (gleaned from human text) well enough to do that?
More like you can make a weak intelligence with lots of specialized knowledge and skills, mostly gleaned from RL (though starting from the superhuman breadth of baseline knowledge that GPT-4 had), that can outcompete humans in acquiring power and/or FOOM.
(Splitting out for better threading)
Um, ok, but do you get that I’m also saying that you should also update that there’s some type of task which you’re observing to have a different relative need for general intelligence compared to other tasks?
I get that you’re saying something like that.
I think you’re saying “that coding fell to methods like these is both evidence that methods like these are more powerful than we might have guessed, and also evidence that coding required less general intelligence than we might have guessed.”
Yes. (With the tangential but crucial caveat that “coding” as a category simply does not work in this context to support the relevant inferences. Cf. https://www.lesswrong.com/posts/5tqFT3bcTekvico4d/do-confident-short-timelines-make-sense?commentId=C8dWvwdeTFyWnnahi )
I’m right now trying to inhabit this point, to really grok it.
I guess it could be the case that the kind of intelligence that you need to engineer software and the kind that you need to develop novel algorithms are almost completely disjoint and unrelated. You can basically solve “make an AI that can make software”, and not have even scratched the surface of “make an AI that can make new algorithms / new interesting math concepts”.
(Is this what you think?)
It would surprise me if this were true, because it seems like there’s a lot of overlap in the mental operations between those two kinds of work.
This is a good question, thank you. (It’s an important topic which I won’t fully treat here.)
One very-cartoon model, which I guess you know but to lay it out:
On a psychologizing note, which I hope to offer just as a hypothesis-piece to maybe track if you weren’t already (I think you’re pretty likely to be already aware of this, but from my perspective there’s a significant chance that you don’t think of it frequently enough): There’s a strong default to overly interpret things (behaviors, say) with a presumption of a human-shaped background mental context. E.g. how people ask “does the LLM believe X”, even though that question probably doesn’t straightforwardly translate from humans to LLMs at all and would lead to incorrect inferences about what behaviors would be concomitant. Cf. https://www.lesswrong.com/posts/L2h9nAtPqEFK6atSJ/an-anthropomorphic-ai-dilemma#Gemini_modeling_with_alien_contexts_is_hard
In particular, when people imagine changes to gippity-based systems, such as “unhobbling” by adding tool access, they imagine that what gets opened up for the gippity is similar to what would be opened up for a human if you newly made that same change (e.g. gave a human that tool). I think this drives some “we’re close to having AGI” intuitions, and I think it’s mistaken.
I like and basically endorse your cartoon model!
Modulo: I think more of the capabilities come from the RLVR (rather than from copying humans) than you seem to think.[1]
Why are you emphasizing the pretraining instead of the RL?
Though Zack dropped a paper in this thread which looks relevant to that question.
Well, in this example, there is an end to Moore’s law specifically, and you can even approximately call it based on physics (atomic scale for transistors), and the guy who believes in LGU (line go up) narrowly for Moore’s law specifically would be being silly.
Let me refine my statement about burden of proof (recapitulating something from DMs). I think that 10 or 20 years ago, if you or I had had opinions about when AGI comes, we would have correctly had very broad/uncertain timelines. Is that true? Assuming that’s true: right now, I have significantly shorter timelines than I shoulda/woulda had 10 years ago; back then I would have (if I had had worked-out timelines) said it seemed quite unlikely, like <2% or something, to get AGI within 5 or 10 years. Now I say more like 5% or something like that.
I take it that you have quite a lot more probability on AGI within 10 years, though I just now realized I don’t actually know your specific beliefs. Could you link to writing you’ve done about your timelines beliefs? For example, would I be correct in assuming that your median is <10 years? Assuming that’s the case:
You must have had quite a large update. Your beliefs went from a broad spread out thing, to a quite sharp (relatively speaking) distribution. That update would look like something along the lines of:
For example, if previously you had 80% probability on “AGI in >15 years”, and then 5 years later you have 50% probability on “AGI in <10 years”, then you must have lost at least 60% prob-weighted hypotheses from your original distribution.
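To spell out that arithmetic (a minimal sketch under one reading of the numbers; not necessarily the exact calculation behind the figure above): “AGI in <10 years” now is “AGI in <15 years” relative to the original distribution, so the odds moved at least this much:

$$\text{prior odds}(\leq 15\text{y} : >15\text{y}) = 0.2 : 0.8, \qquad \text{posterior odds} \geq 0.5 : 0.5 .$$

If the $\leq 15$y hypotheses keep their weight, the $>15$y bucket must fall to some $x$ with $x/(0.2 + x) \leq 0.5$, i.e. $x \leq 0.2$: a loss of at least $1 - 0.2/0.8 = 75\%$ of its prob-weighted hypotheses. So “at least 60%” is, if anything, a conservative bound.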
I’m trying to understand that update. When I ask people about that update, they give various statements, but somehow I come away having no idea how they made their update to get to a confident (sharp) distribution. I mean, I can write out sentences like “gippity++ automates AI R&D”, and I mean, qualitatively I agree that this is plausible and is extremely alarming (and should be stopped ASAP etc. etc.); but I have no idea how people got so confident in that (relative to the earlier broad distribution).
I feel like I understand the question you’re asking:
~”If you previously had a very spread-out prediction, and now you have a relatively more narrow prediction, between then and now, you must have made a pretty large Bayesian update—you saw some evidence with a quite lopsided odds ratio.
If you made such an update, you should be able to point to the evidence, and explain why you think the odds ratio is so lopsided. Please do that!”
(Is that about right?)
But, I don’t get why the evidence / arguments that people are offering isn’t clarifying for you.
Like, I keep trying to point to the same basic IMO pretty straightforward considerations, and you keep saying things like “somehow I come away having no idea how they made their update to get to a confident (sharp) distribution”. [1]
I’m not sure what kind of thing you’re asking for that’s different from the kinds of things that I’m already saying. Do you want more of a quantitative model? Do we just need to get further into the argument tree?
I further get that, from your perspective, you’re saying something like “dude, give me actual evidence and arguments”, and I’m somehow being a dunce about that. But I don’t get what exactly you’re asking for.
Which to be clear, is socially and epistemically valid, on your part. Please continue to loudly say “I don’t get why everyone thinks this”, for as long as that’s true. I want to do the opposite of shaming you for not getting it.
I don’t believe I can pass your ITT, but I will try to draw a sample from my model of you (which is almost entirely a much more blobulous model of generally-short-timelines people, so sorry for the resulting lumpification), in the form of a dialogue. [After writing it: this isn’t yet that great an attempt at understanding you, sorry; maybe it’s still a helpful summary of what I think the discourse state is.]
Thread 1 (not very Eli?):
Thread 2 (more Eli?):
Um, no. At least if “training data” is meant to refer to the text corpora used in pre-training. I think the problem-solving capabilities are mostly coming from the RLVR.
I would not endorse this.
Like, if I thought that we were conceptually / technically far from AIs that can automate the process of scientific discovery, I would much less expect a FOOM in the next 10 years (though we would still have an emergency, because automating science isn’t a necessary capability for destabilizing or ending the world via any of a number of different pathways).
I would be excited about attempts at clarification here! (Modulo that they seem potentially very infohazardous.)
Ok great. Can you clarify why you think this? Previously you wrote, in response to “What makes you think we’re close?”:
Can you clarify / expand? What makes you think the METR results imply we’re close to having algorithmic ideas sufficient to automate scientific discovery?
One thing which is awkward about this operationalization is that it’s not clear how big an innovation “attention is all you need” is. Like, we already had multi-head attention before that paper, and ablating out the other components from the transformer isn’t that galaxy-brained a thing to try (though of course the authors executed well on it).
I was thinking something like that when I wrote it.
Do you have a better suggested operationalization?
Thanks!
Curious what formats. (E.g. I’m happy to have a call to be published on YouTube; we could do a private discussion, though that’s not my preference.)
Just for the public record:
I don’t agree with this (e.g. I doubt either Greenblatt or I could give a high-quality ITT of the other’s view, and I expect that one or both of us would update substantively if we understood each other).
Is it your experience that if someone comes to understand your view, they update significantly towards it, even if they had an elaborate-ish short timeline view beforehand?
Yes, but this is an extremely weak signal, in that it’s a small number (n=2 to 5 or something, depending on who you count) and it’s pretty selected for “people who talk to me a lot or people who actively tell me that I updated them somewhat”, and I have a high bar for understanding (for both parties symmetrically). Abram might be someone who somewhat understands my view but only slightly updated towards it?