Most of my posts and comments are about AI and alignment. Posts I’m most proud of, which also provide a good introduction to my worldview:
Without a trajectory change, the development of AGI is likely to go badly
Steering systems, and a follow-up on corrigibility.
I also created Forum Karma, and wrote a longer self-introduction here.
PMs and private feedback are always welcome.
NOTE: I am not Max Harms, author of Crystal Society. I’d prefer for now that my LW postings not be attached to my full name when people Google me for other reasons, but you can PM me here or on Discord (m4xed) if you want to know who I am.
What specifically do you think is obviously wrong about the village idiot <-> Einstein gap? This post from 2008, which uses the original chart, makes some valid points that hold up well today and rebuts some real misconceptions that were common at the time.
The original chart doesn’t have any kind of labels or axes, but here are two ways you could plausibly view it as “wrong” in light of recent developments with LLMs:
Duration: the chart could be read as a claim that the gap in wall-clock time between the development of village-idiot-level AI and Einstein-level AI would be more like hours or days than months or years.
Size and dimensionality of mind-space below the superintelligence level: the chart could be read as a claim that the region of mind-space between village idiot and Einstein is relatively small, so it’s surprising to Eliezer-200x that there are lots of current AIs landing in between them and staying there for a while.
I think it’s debatable how much Eliezer was actually making the stronger versions of the claims above circa 2008, and it also remains to be seen how wrong they actually are when applied to actual superintelligence, as opposed to whatever you want to call the AI models of today.
OTOH, here are a couple of ways that the village idiot <-> Einstein post looks prescient:
Qualitative differences between the current best AI models and second-to-third-tier models are small. Most AI models today are roughly similar to each other in overall architecture and training regime, but there are various tweaks and special sauce that e.g. Opus and GPT-5 have and Llama 4 doesn’t. So you have something like Llama 4 : GPT-5 :: village idiot : Einstein, which is predicted by:
(and something like a 4B parameter open-weights model is analogous to the chimpanzee)
Whereas I expect that e.g. Robin Hanson in 2008 would have been quite surprised by the similarity and non-specialization among today’s different models.
Implications for scaling. Here’s a claim on which I think the Eliezer-200x Einstein chart makes a prediction likely to outperform other mental models from 2008, as well as various contemporary predictions based on scaling “laws” or METR-style task-time-horizon graphs:
“The rough amount of resources, in terms of GPUs, energy, wall-clock time, lines of Python code, etc., needed to train and run the best models today (e.g. o4, GPT-5) is sufficient (or more than sufficient) to train and run a superintelligence (without superhuman / AI-driven levels of optimization / engineering / insight).”
My read of task-time-horizon and scaling-law-based models of AI progress is that they predict more strongly that further progress will basically require more GPUs. It might be that the first Einstein+-level AGI is in fact developed mostly through scaling, but these models of progress are also more surprised than Eliezer-2008 when it turns out that (ordinary, human-developed) algorithmic improvements and optimizations allow training e.g. a GPT-4-level model with far fewer resources than it took to train the original GPT-4 just a few years ago.
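For concreteness, here’s a minimal sketch of the kind of scaling-law-based model I have in mind (a Chinchilla-style loss curve; the functional form is the standard one from Hoffmann et al. 2022, but treat this as illustrative rather than a claim about any particular fitted values):

$$L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}, \qquad C \approx 6ND$$

where $N$ is parameter count, $D$ is training tokens, $C$ is training compute, and $E, A, B, \alpha, \beta$ are fitted constants. Under this picture, loss improvements come mainly from scaling up $N$ and $D$ (and hence $C$), which is why such models read naturally as “progress ≈ more GPUs”; large ordinary algorithmic improvements show up as shifts in the constants or exponents, and so come as more of a surprise to them than to the Eliezer-2008 chart.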