@Eliezer Yudkowsky tweets:

> @julianboolean_: the biggest lesson I’ve learned from the last few years is that the “tiny gap between village idiot and Einstein” chart was completely wrong
I agree that I underestimated this distance, at least partially out of youthful idealism.
That said, one of the few places where my peers managed to put forth a clear contrary bet was on this case. And I did happen to win that bet. This was less than 7% of the distance in AI’s 75-year journey! And arguably the village-idiot level was only reached as of 4o or o1.
I was very interested to see this tweet. I have thought of that “Village Idiot and Einstein” claim as the most obvious example of a way that Eliezer and co were super wrong about how AI would go, and they’ve AFAIK totally failed to publicly reckon with it as it’s become increasingly obvious that they were wrong over the last eight years.
It’s helpful to see Eliezer clarify what he thinks of this point. I would love to see more from him on this—why he got this wrong, how updating changes his opinion about the rest of the problem, what he thinks now about time between different levels of intelligence.
> I have thought of that “Village Idiot and Einstein” claim as the most obvious example of a way that Eliezer and co were super wrong about how AI would go, and they’ve AFAIK totally failed to publicly reckon with it as it’s become increasingly obvious that they were wrong over the last eight years
I’m confused—what evidence do you mean? As I understood it, the point of the village idiot/Einstein post was that the size of the relative differences in intelligence we were familiar with—e.g., between humans, or between humans and other organisms—tells us little about the absolute size possible in principle. Has some recent evidence updated you about that, or did you interpret the post as making a different point?
(To be clear I also feel confused by Eliezer’s tweet, for the same reason).
Ugh, I think you’re totally right and I was being sloppy; I totally unreasonably interpreted Eliezer as saying that he was wrong about how long/how hard/how expensive it would be to get between capability levels. (But maybe Eliezer misinterpreted himself the same way? His subsequent tweets are consistent with this interpretation.)
I totally agree with Eliezer’s point in that post, though I do wish that he had been clearer about what exactly he was saying.
Makes sense. But on this question too I’m confused—has some evidence in the last 8 years updated you about the old takeoff speed debates? Or are you referring to claims Eliezer made about pre-takeoff rates of progress? From what I recall, the takeoff debates were mostly focused on the rate of progress we’d see given AI much more advanced than anything we have. For example, Paul Christiano operationalized slow takeoff like so:
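> There will be a complete 4 year interval in which world output doubles, before the first 1 year interval in which world output doubles.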
Given that we have yet to see any such doublings, nor even any discernible impact on world GDP, it seems to me that takeoff (in this sense, at least) has not yet started, and hence that we have not yet had much chance to observe evidence that it will be slow?
The common theme here is that the capabilities frontier is more jagged than expected. So the way in which people modeled takeoff in the pre-LLM era was too simplistic.
Takeoff used to be seen as equivalent to the time between AGI and ASI.
In reality we got programmes which are not AGI, but do have capabilities that most in the past would have assumed to entail AGI.
So, we have pretty-general intelligence that’s better than most humans in some areas, and is amplifying programming and mathematics productivity. So, I think takeoff has begun, but it’s under quite different conditions than people used to model.
> So, I think takeoff has begun, but it’s under quite different conditions than people used to model.
I don’t think they are quite different. Christiano’s argument was largely about the societal impact, i.e. that transformative AI would arrive in an already-pretty-transformed world:
> I believe that before we have incredibly powerful AI, we will have AI which is merely very powerful. This won’t be enough to create 100% GDP growth, but it will be enough to lead to (say) 50% GDP growth. I think the likely gap between these events is years rather than months or decades.
>
> In particular, this means that incredibly powerful AI will emerge in a world where crazy stuff is already happening (and probably everyone is already freaking out). If true, I think it’s an important fact about the strategic situation.
I claim the world is clearly not yet pretty-transformed, in this sense. So insofar as you think takeoff has already begun, or expect short (e.g. AI 2027-ish) timelines—I personally expect neither, to be clear—I do think this takeoff is centrally of the sort Christiano would call “fast.”
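To make the quantities in Christiano’s operationalization concrete, here’s a quick arithmetic sketch. It assumes simple constant exponential growth; the ~3% figure for recent world GDP growth is a rough assumption, not a measurement:

```python
import math

def doubling_time_years(annual_growth_rate: float) -> float:
    """Years for world output to double under constant exponential growth."""
    return math.log(2) / math.log(1 + annual_growth_rate)

# Rough, assumed growth rates -- for illustration only.
scenarios = [
    ("recent world GDP growth (~3%/yr)", 0.03),
    ("'merely very powerful AI' regime (~50%/yr)", 0.50),
    ("'incredibly powerful AI' regime (~100%/yr)", 1.00),
]
for label, rate in scenarios:
    print(f"{label}: output doubles in ~{doubling_time_years(rate):.1f} years")

# A 4-year doubling of world output (the kind of doubling referenced above)
# corresponds to sustained growth of roughly:
print(f"growth rate implying a 4-year doubling: ~{2 ** (1 / 4) - 1:.0%}/yr")
```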
I think you accurately interpreted me as saying I was wrong about how long it would take to get from the “apparently a village idiot” level to “apparently Einstein” level! I hadn’t thought either of us were talking about the vastness of the space above, in re what I was mistaken about. You do not need to walk anything back afaict!
Have you stated anywhere what makes you think “apparently a village idiot” is a sensible description of current learning programs, as they inform us regarding the question of whether or not we currently have something that is capable via generators sufficiently similar to [the generators of humanity’s world-affecting capability] that we can reasonably induce that these systems are somewhat likely to kill everyone soon?
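The following illustration from 2015 by Tim Urban seems like a decent summary of how people interpreted this and other statements.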
This comic by Tim Urban is interesting, but I remember when I first read it, it seemed wrong.
In his framework, I think ASI can only be quantitatively more powerful than human intelligence, not qualitatively.
The reason is simple: humans are already Turing complete. Anything a machine can do, it can only be faster execution of something a human could already do.
I don’t think it has much bearing on the wider discussion of AI/AI-risk; I haven’t heard anybody else suggest that the distinction between quantitative and qualitative superiority had any bearing on AI risk.
I don’t think it matters much for practical purposes. It could be that some problems are theoretically solvable by human intelligence but we realistically lack the time to do so in the age of the universe, or that they just can’t be solved by us, and either way an ASI that solves them in a day leaves us in the dust. The reason why becomes secondary at that point.
I feel like one problem with solving problems intelligently is that it’s rarely as easy as tackling a tedious task in small bits—you need an intuition to see the whole path in a sort of coarse light, and then refine on each individual step. So there’s a fast algorithm that goes “I know I can do this, I don’t know how yet” and then we slowly unpack the relevant bits. And I think there might be a qualitative effect to e.g. being able to hold more steps in memory simultaneously or such.
Isn’t this too soon to claim that this was some big mistake? Up until December 2024 the best available LLM barely reasoned. Everyone and their dog was saying that LLMs are fundamentally incapable of reasoning. Just eight months later two separate LLM-based systems got Gold on the IMO (one of which is now available, albeit in a weaker form). We aren’t at the level of Einstein yet, but we could be within a couple years. Would this not be a very short period of time to go from models incapable of reasoning to models which are beyond human comprehension? Would this image not then be seen as having aged very well?
Here’s Yudkowsky, in the Hanson-Yudkowsky debate:

> I think that, at some point in the development of Artificial Intelligence, we are likely to see a fast, local increase in capability—“AI go FOOM.” Just to be clear on the claim, “fast” means on a timescale of weeks or hours rather than years or decades; and “FOOM” means way the hell smarter than anything else around, capable of delivering in short time periods technological advancements that would take humans decades, probably including full-scale molecular nanotechnology.
So yeah, a few years does seem a ton slower than what he was talking about, at least here.
Here’s Scott Alexander, who describes hard takeoff as a one-month thing:
> If AI saunters lazily from infrahuman to human to superhuman, then we’ll probably end up with a lot of more-or-less equally advanced AIs that we can tweak and fine-tune until they cooperate well with us. In this situation, we have to worry about who controls those AIs, and it is here that OpenAI’s model [open sourcing AI] makes the most sense.
>
> But Bostrom et al worry that AI won’t work like this at all. Instead there could be a “hard takeoff”, a subjective discontinuity in the function mapping AI research progress to intelligence as measured in ability-to-get-things-done. If on January 1 you have a toy AI as smart as a cow, and on February 1 it’s proved the Riemann hypothesis and started building a ring around the sun, that was a hard takeoff.
In general, I think people who have only entered the conversation recently tend to miss just how fast a takeoff people were actually talking about.
It really depends what you mean by a small amount of time. On a cosmic scale, ten years is indeed short. But I definitely interpreted Eliezer back then (for example, while I worked at MIRI) as making a way stronger claim than this; that we’d e.g. within a few days/weeks/months go from AI that was almost totally incapable of intellectual work to AI that can overpower humanity. And I think you need to believe that much stronger claim in order for a lot of the predictions about the future that MIRI-sphere people were making back then to make sense. I wish we had all been clearer at the time about what specifically everyone was predicting.
I’d be excited for people (with the aid of LLMs) to go back and grade how various past predictions from MIRI folks are doing, plus ideally others who disagreed. I just read back through part of https://www.lesswrong.com/posts/vwLxd6hhFvPbvKmBH/yudkowsky-and-christiano-discuss-takeoff-speeds and my quick take is that Paul looks mildly better than Eliezer, due to predicting larger impacts/revenue/investment pre-AGI (which we appear to be on track for and to some extent are already seeing) and predicting a more smooth increase in coding abilities. But it’s hard to say, in part because Eliezer mostly didn’t want to make confident predictions. Also, I think Paul was wrong about Nvidia, but that felt like an aside.

Edit: oh, also there’s the IMO bet; I didn’t get to that part on my partial re-read. That one goes to Eliezer.

IEM and the Yudkowsky-Hanson debate also seem like potentially useful sources to look through, as well as things that I’m probably forgetting or unaware of.
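The part of this graph that has aged the least well is that the y-axis is labeled “intelligence” and it’s becoming harder to see that as a real value.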
If by intelligence you mean “we made some tests and made sure they are legible enough that people like them as benchmarks, and lo and behold, learning programs (LPs) continue to perform some amount better on them as time passes”, ok, but that’s a dumb way to use that word. If by intelligence you mean “we have something that is capable via generators sufficiently similar to [the generators of humanity’s world-affecting capability] that we can reasonably induce that these systems are somewhat likely to kill everyone”, then I challenge you to provide the evidence / reasoning that apparently makes you confident that LP25 is at a ~human (village idiot) level of intelligence.
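Cf. https://www.lesswrong.com/posts/5tqFT3bcTekvico4d/do-confident-short-timelines-make-sense

Here is Eliezer’s post on this topic from 17 years ago for anyone interested: https://www.lesswrong.com/posts/3Jpchgy53D2gB5qdk/my-childhood-role-model

Anna Salamon’s comment and Eliezer’s reply to it are particularly relevant.

Thanks heaps for pulling this up! I totally agree with Eliezer’s point there.

[Epistemic status: unconfident]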
So...I actually think that it technically wasn’t wrong, though the implications that we derived at the time were wrong because reality was more complicated than our simple model.
Roughly, it seems like mental performance depends on at least two factors: “intelligence” and “knowledge”. It turns out that, at least in some regimes, there’s an exchange rate at which you can make up for mediocre intelligence with massive amounts of knowledge.
My understanding is that this is what’s happening even with the reasoning models. They have a ton of knowledge, including a ton of procedural knowledge about how to solve problems, which is masking the ways in which they’re not very smart.[1]
One way to operationalize how dumb the models are is the number of bits/tokens/inputs/something that are necessary to learn a concept or achieve some performance level on a task. Amortizing over the whole training process / development process, humans are still much more sample efficient learners than foundation models.
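As a rough way to make “sample efficiency, amortized over the whole training process” concrete, here’s a back-of-envelope sketch. The 20,000-words-per-day, 20-year, and 10^13-token figures are loose order-of-magnitude assumptions, not measurements:

```python
# Back-of-envelope comparison of lifetime "training data" for a human vs. an LLM.
# Every figure here is a loose order-of-magnitude assumption, for illustration only.

HUMAN_WORDS_PER_DAY = 20_000        # assumed: speech heard plus text read
HUMAN_YEARS = 20
human_tokens = HUMAN_WORDS_PER_DAY * 365 * HUMAN_YEARS

LLM_PRETRAINING_TOKENS = 1e13       # assumed order of magnitude for a modern frontier model

print(f"human linguistic input by age {HUMAN_YEARS}: ~{human_tokens:.1e} tokens")
print(f"assumed LLM pretraining corpus: ~{LLM_PRETRAINING_TOKENS:.0e} tokens")
print(f"ratio: ~{LLM_PRETRAINING_TOKENS / human_tokens:,.0f}x more data for the model")
```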
Basically, we’ve found a hack where we can get a kind-of-smart thing to learn a massive amount, which is enough to make it competitive with humans in a bunch of domains. Overall performance is less sensitive to increases in knowledge than to increases in intelligence. This means that we’re traversing the human range of ability much more slowly than I anticipated we would, based on the 2010s LessWrong arguments.
But that doesn’t mean that, for instance, comparatively tiny changes in human brains make the difference between idiots and geniuses.
I’m interested in evidence that bears on this model. Is there evidence that I’m unaware of that’s suggestive that foundation models are smarter than I think, or not relying on knowledge as much as I think?
Is that sentence dumb? Maybe when I’m saying things like that, it should prompt me to refactor my concept of intelligence. Maybe intelligence basically is procedural knowledge of how to solve problems and factoring knowledge and intelligence separately is dumb?
> Is that sentence dumb? Maybe when I’m saying things like that, it should prompt me to refactor my concept of intelligence.
I don’t think it’s dumb. But I do think you’re correct that it’s extremely dubious—that we should definitely refactor the concept of intelligence.
Specifically: There’s a default LW-esque frame of some kind of a “core” of intelligence as “general problem solving” apart from any specific bit of knowledge, but I think that—if you manage to turn this belief into a hypothesis rather than a frame—there’s a ton of evidence against this thesis. You could basically look at the last ~3 years of ML progress as just continuing little bits of evidence against it, month after month after month.
I’m not gonna argue this in a comment, because this is a big thing, but here are some notes around this thesis if you want to tug on the thread.
Comparative psychology finds human infants are characterized by overimitation relative to Chimpanzees, more than by any general problem-solving skill. (That’s a link to a popsci source but there’s a ton of stuff on this.) That is, the skills humans excel at vs. Chimps + Bonobos in experiments are social and allow the quick copying and imitating of others: overimitation, social learning, understanding others as having intentions, etc. The evidence for this is pretty overwhelming, imo.

Take a look at how hard far transfer learning is to get in humans.
Ask what Nobel disease seems to say about the general-domain-transfer specificity of human brilliance. Look into scientists with pretty dumb opinions, even when they aren’t getting older. What do people say about the transferability of taste? What does that imply?
How do humans do on even very simple tasks that require reversing heuristics?
Etc etc. Big issue, this is not a complete take, etc. But in general I think LW has an unexamined notion of “intelligence” that feels like it has coherence because of social elaboration, but whose actual predictive validity is very questionable.
All this seems relevant, but there’s still the fact that a human’s Elo at Go or chess will improve much more from playing 1000 games (and no more) than an AI’s will from playing 1000 games. That’s suggestive of property learning, or reflection, or conceptualization, or generalization, or something, that the AIs seem to lack, but can compensate for with brute force.
So for the case of our current RL game-playing AIs not learning much from 1000 games—sure, the actual game-playing AIs we have built don’t learn games as efficiently as humans do, in the sense of “from as little data.” But:
Learning from as little data as possible hasn’t actually been a research target, because self-play data is so insanely cheap. So it’s hard to conclude that our current setup for AIs is seriously lacking, because there hasn’t been serious effort to push along this axis.
To point out some areas we could be pushing on, but aren’t: game-playing networks are usually something like ~100x smaller than LLMs, which are themselves ~10-100x smaller than human brains (very approximate numbers; see the rough arithmetic at the end of this comment). We know from numerous works that data efficiency scales with network size, so even if Adam over matmul is 100% as efficient as human brain matter, we’d still expect our current RL setups to do amazingly poorly on data-efficiency simply because of network size, even leaving aside further issues about lack of hyperparameter search and research effort.
Given this, while the data-efficiency gap is of course a consideration, it seems far from a conclusive one.
Edit: Or more broadly, again—different concepts of “intelligence” will tend to have different areas where they seem to have more predictive use, and different areas where they seem to have more epicycles. The areas above are the kind of thing that—if one made them central to one’s notion of intelligence rather than peripheral—would probably leave you with something different than the LW notion. But again—they certainly do not compel one to do that refactor! It probably wouldn’t make sense to try to do the refactor unless you just keep getting the feeling “this is really awkward / seems off / doesn’t seem to be getting at some really important stuff” while using the non-refactored notion.

…and whose predictive validity in humans doesn’t transfer well across cognitive architectures, e.g. reverse digit span.
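To put the size gap from the list above in one place, here’s a tiny arithmetic sketch that just compounds the “very approximate numbers” quoted in that comment; the ratios are assumptions taken from the comment, not measurements:

```python
# Compounding the rough size ratios quoted above (assumptions, not measurements):
# game-playing nets ~100x smaller than LLMs; LLMs ~10-100x smaller than human brains.

game_net_vs_llm = 100
llm_vs_brain_low, llm_vs_brain_high = 10, 100

low = game_net_vs_llm * llm_vs_brain_low
high = game_net_vs_llm * llm_vs_brain_high
print(f"implied size gap, game-playing net vs. human brain: ~{low:,}x to ~{high:,}x")
```

Even granting identical per-parameter learning efficiency, that would put the game-playing networks we actually train three to four orders of magnitude below the brain on the size axis the comment points at.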
> That is, the skills humans excel at vs. Chimps + Bonobos in experiments are social and allow the quick copying and imitating of others: overimitation, social learning, understanding others as having intentions, etc.
Yes, indeed, they copy the actions and play them through their own minds as a method of play, to continue extracting nonobvious concepts. Or at least that is my interpretation. Are you claiming that they are merely copying??
This is very much my gut feeling, too. LLMs have a much greater knowledge base than humans do, and some of them can “think” faster. But humans are still better at many things, including raw problem-solving skills. (Though LLMs’ problem-solving skills have improved a breathtaking amount in the 12 months since o1-preview shipped. Seriously, folks. The goalpost-moving is giving me vertigo.)
This uneven capabilities profile means that LLMs are still well below the so-called “village idiot” in many important ways, and have already soared past Einstein in others. This averages out to “kinda competent on short time horizons if you don’t squint too hard.”
But even if the difference between “the village idiot” and “smarter than Einstein” involved another AI winter, two major theoretical breakthroughs, and another 10 years, I would still consider that damn close to a vertical curve.
I don’t know that they were wrong about that claim. Or, it depends on what we interpret as the claim. “AI would do the thing in this chart” proved false[1], but I don’t think this necessarily implies that “there’s a vast distance between a village idiot and Einstein in intelligence levels”.
Rather, what we’re observing may just be a property of the specific approach to AI represented by LLMs. It is not quite “imitation learning”, but it shares some core properties of imitation learning. LLMs skyrocketed to human-ish level because they’re trained to emulate humans via human-generated data. Improvements then slowed to a (relative) crawl because it became a data-quality problem. It’s not that there’s a vast distance between stupid and smart humans, such that moving from a random initialization to “dumb human” is as hard as moving from a “dumb human” to a “smart human”. It’s just that, for humans, assembling an “imitate a dumb human” dataset is easy (scrape the internet), whereas transforming it into an “imitate a smart human” dataset is very hard. (And then RL is just strictly worse at compute-efficiency and generality, etc.)
(Edit: Yeah, that roughly seems to be Eliezer’s model too, see this thread.)
If that’s the case, Eliezer and co.’s failure wasn’t in modeling the underlying dynamics of intelligence incorrectly, but in failing to predict and talk about the foibles of an ~imitation-learning paradigm. That seems fairly minor.
Also: did that chart actually get disproven? To believe so, we have to assume current LLMs are at the “dumb human” levels, and that what’s currently happening is a slow crawl to “smart human” and beyond. But if LLMs are not AGI-complete, if their underlying algorithms (rather than externally visible behaviors) qualitatively differ from what humans do, this gives us little information on the speed with which an AGI-complete AI would move from a “dumb human” to a “smart human”. Indeed, I still expect pretty much that chart to happen once we get to actual AGI; see here.
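Or did it? See below.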
You seem to think that imitation resulted in LLMs quickly saturating on an S-curve, but relevant metrics (e.g. time horizon) seem like they smoothly advance without a clear reduction in slope from the regime where pretraining was rapidly being scaled up (e.g. up to and through GPT-4) to after (in fact, the slope seems somewhat higher).
Presumably you think some qualitative notion of intelligence (which is hard to measure) has slowed down?
My view is that basically everything is progressing relatively smoothly and there isn’t anything which is clearly stalled in a robust way.
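One way to make “no clear reduction in slope” checkable is to fit log(time horizon) against date separately for the two regimes and compare the implied doubling times. A minimal sketch, where the data points are made-up placeholders rather than real measurements:

```python
import numpy as np

# Hypothetical (made-up) points: (years since 2019, 50%-success time horizon in minutes).
pretraining_era = [(0.0, 0.1), (1.5, 0.4), (3.0, 2.0), (4.0, 6.0)]      # up to ~GPT-4
post_era        = [(4.0, 6.0), (4.8, 15.0), (5.5, 40.0), (6.3, 110.0)]  # after

def doubling_time_years(points):
    """Fit log2(horizon) ~ a * t + b and return 1/a, the implied doubling time."""
    t = np.array([p[0] for p in points])
    log_h = np.log2([p[1] for p in points])
    slope, _intercept = np.polyfit(t, log_h, 1)
    return 1.0 / slope

print(f"pretraining-era doubling time:  ~{doubling_time_years(pretraining_era):.2f} years")
print(f"post-pretraining doubling time: ~{doubling_time_years(post_era):.2f} years")
```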
That’s not the relevant metric. The process of training involves a model skyrocketing in capabilities, from a random initialization to a human-ish level (or the surface appearance of it, at least). There’s a simple trick – pretraining – which can push a model’s intelligence from zero to that level.
Advancing past this point then slows down to a crawl: each incremental advance requires new incremental research derived by humans, rather than just turning a compute crank.
(Indeed, IIRC a model’s loss curves across training do look like S-curves? Edit: On looking it up, nope, I think.)
The FOOM scenario, on the other hand, assumes a paradigm that grows from random initialization to human level to superintelligence all in one go, as part of the same training loop, without a phase change from “get it to human level incredibly fast, over months” to “painstakingly and manually improve the paradigm past the human level, over years/decades”.
Relevant metrics of performance are roughly linear in log-compute when compute is utilized effectively in the current paradigm for training frontier models.
From my perspective it looks like performance has been steadily advancing as you scale up compute and other resources.
(This isn’t to say that pretraining hasn’t had lower returns recently, but you made a stronger claim.)
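A small sketch of what “roughly linear in log-compute” implies: under that assumption, equal increments on the metric cost equal multiplicative factors of compute. The slope value below is an arbitrary illustrative assumption:

```python
# Assume performance ~ a + b * log10(compute), as described above.
# Then raising the metric by `gain` points costs a fixed *factor* of compute.

def compute_multiplier_for_gain(gain: float, points_per_10x: float) -> float:
    return 10 ** (gain / points_per_10x)

b = 5.0  # illustrative assumption: +5 points on the metric per 10x compute
for gain in (5, 10, 20):
    print(f"+{gain} points -> ~{compute_multiplier_for_gain(gain, b):,.0f}x compute")
```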
I think one of the (many) reasons people have historically tended to miscommunicate/talk past each other so much about AI timelines, is that the perceived suddenness of growth rates depends heavily on your choice of time span. (As Eliezer puts it, “Any process is continuous if you zoom in close enough.”)
It sounds to me like you guys (Thane and Ryan) agree about the growth rate of the training process, but are assessing its perceived suddenness/continuousness relative to different time spans?
A key reason, independent of LLMs, is that we see vast ranges of human performance, and Eliezer’s claim that, because humans have similar brain architectures, there’s very little effort needed to become the best human who ever lived, is wrong (admittedly, this is a claim that the post was always wrong and that we just failed to notice it, myself included).
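The range of human ability is wide, actually.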
In terms of general intelligence, including long-horizon agency, reliability, etc., do we think AIs are yet, for example, as autonomously good as the worst professionals? My instinct is no for many professions, even though the AIs might be better at the majority of sub-tasks and are very helpful as collaborators rather than fully replacing someone. But I’m uncertain; it might depend on the operationalization and the profession, and for some professions the answer seems clearly yes.[1][2] It also seems harder to reason about the literally least capable professional than about something like the 10th percentile.
If the answer is no and we’re looking at the ability to fully autonomously replace humans, this would mean the village idiot → Einstein claim might technically not be falsified. The spirit of the claim might be, though, e.g. in terms of its claimed implications.

There’s also a question of whether we should include physical abilities; if so, the answer would clearly be no for those professions or tasks.
One profession for which it seems likely that the AIs are better than the least capable humans is therapy. Also teaching/tutoring. In general this seems true for professions that can be done via remote work and don’t require heavy computer use or long-horizon agency.
What specifically do you think is obviously wrong about the village idiot <-> Einstein gap? This post from 2008 which uses the original chart makes some valid points that hold up well today, and rebuts some real misconceptions that were common at the time.
The original chart doesn’t have any kind of labels or axes, but here are two ways you could plausibly view it as “wrong” in light of recent developments with LLMs:
Duration: the chart could be read as a claim that the gap between the development of village idiot and Einstein-level AI in wall-clock time would be more like hours or days rather than months or years.
Size and dimensionality of mind-space below the superintelligence level. The chart could be read as a claim that the size of mindspace between village idiot and Einstein is relatively small, so it’s surprising to Eliezer-200x that there are lots of current AIs landing in between them, and staying there for a while.
I think it’s debatable how much Eliezer was actually making the stronger versions of the claims above circa 2008, and it also remains to be seen how wrong they actually are when applied to actual superintelligence, instead of whatever you want to call the AI models of today.
OTOH, here are a couple of ways that the village idiot <-> Einstein post looks prescient:
Qualitative differences between the current best AI models and second-to-third tier models are small. Most AI models today are roughly similar to each other in terms of overall architecture and training regime, but there are various tweaks and special sauce that e.g. Opus and GPT-5 have that Llama 4 doesn’t. So you have something like Llama 4 : GPT-5 :: Village idiot : Einstein, which is predicted by:
> Maybe Einstein has some minor genetic differences from the village idiot, engine tweaks. But the brain-design-distance between Einstein and the village idiot is nothing remotely like the brain-design-distance between the village idiot and a chimpanzee. A chimp couldn’t tell the difference between Einstein and the village idiot, and our descendants may not see much of a difference either.
(and something like a 4B parameter open-weights model is analogous to the chimpanzee)
Whereas I expect that e.g. Robin Hanson in 2008 would have been quite surprised by the similarity and non-specialization among different models of today.
Implications for scaling. Here’s a claim on which I think the Eliezer-200x Einstein chart makes a prediction that is likely to outperform other mental models of 2008, as well as various contemporary predictions based on scaling “laws” or things like METR task time horizon graphs:
“The rough number of resources, in terms of GPUs, energy, wall-clock time, lines of Python code, etc., needed to train and run the best models today (e.g. o4, GPT-5) is sufficient (or more than sufficient) to train and run a superintelligence (without superhuman / AI-driven levels of optimization / engineering / insight).”
My read of task-time-horizon and scaling law-based models of AI progress is that they more strongly predict that further AI progress will basically require more GPUs. It might be that the first Einstein+ level AGI is in fact developed mostly through scaling, but these models of progress are also more surprised than Eliezer-2008 when it turns out that (ordinary, human-developed) algorithmic improvements and optimizations allow for the training of e.g. a GPT-4-level model with many fewer resources than it took to train the original GPT-4 just a few years ago.
I find myself puzzled by Eliezer’s tweet. I had always taken the point of the diagram to be the vastness of the space above Einstein compared with the distance between Einstein and the village idiot. I do not see how recent developments in AI affect that. AI has (in Eliezer’s view) barely reached the level of the village idiot. Nothing in the diagram bears on how long it will take to equal Einstein. That is anyway a matter of the future, and Eliezer has often remarked on how many predictions of long timelines to some achievement turned out to be achieved within months, or already had been when the prediction was made. I wonder what Eliezer’s predicted time to Einstein is, given no slowdown.