Yudkowsky vs Hanson on FOOM: Whose Predictions Were Better?

TLDR

Starting in 2008, Robin Hanson and Eliezer Yudkowsky debated the likelihood of FOOM: a rapid and localized increase in some AI’s intelligence that occurs because an AI recursively improves itself.

As Yudkowsky summarizes his position:

I think that, at some point in the development of Artificial Intelligence, we are likely to see a fast, local increase in capability—“AI go FOOM.” Just to be clear on the claim, “fast” means on a timescale of weeks or hours rather than years or decades; and “FOOM” means way the hell smarter than anything else around, capable of delivering in short time periods technological advancements that would take humans decades, probably including full-scale molecular nanotechnology. (FOOM, 235)

Over the course of this debate, both Hanson and Yudkowsky made a number of incidental predictions about things which could occur before the advent of artificial superintelligence—or for which we could at the very least receive strong evidence before artificial superintelligence.

On the object level, my conclusion is that when you examine these predictions, Hanson probably does a little better than Yudkowsky, although depending on how you weigh different topics, I could see arguments ranging from “they do about the same” to “Hanson does much better.”

On one meta level, my conclusion is that Hanson’s view—that we should try to use abstractions that have proven prior predictive power—looks like a pretty good policy.

On another meta level, my conclusion—springing to a great degree from how painful seeking clear predictions in 700 pages of words has been—is that if anyone says “I have a great track record” without pointing to specific predictions that they made, you should probably ignore them, or maybe point out their lack of epistemic virtue if you have the energy to spare for doing that kind of criticism productively.

Intro

There are a number of difficulties involved in evaluating a public figure’s track record. We want to avoid cherry-picking sets of particularly good or bad predictions. And we want to have some baseline to compare them to.

We can mitigate both of these difficulties (although not, alas, eliminate them) by choosing one document to evaluate: “The Hanson-Yudkowsky Foom Debate”. (All future page numbers refer to this PDF.) Note that the PDF includes (1) the debate-via-blogposts which took place on OvercomingBias, (2) an actual in-person debate that took place at Jane Street in 2011, and (3) further summary materials from Hanson (further blogposts) and Yudkowsky (“Intelligence Explosion Microeconomics”). This spans a period from 2008 to 2013.

I do not intend this to be a complete review of everything in these arguments.

The discussion spans the time from the big bang until hypothetical far future galactic civilizations. My review is a little more constrained: I am only going to look at predictions for which I think we’ve received strong evidence in the 15 or so years since the debate started.

Note also that the context of this debate was quite different than it would be if it happened today.

At the time of the debate, both Hanson and Yudkowsky believed that machine intelligence would be extremely important, but that the time of its arrival was uncertain. They thought that it would probably arrive this century, but neither had the very certain, short timelines which are common today.

At this point Yudkowsky was interested in actually creating a recursively self-improving artificial intelligence, a “seed AI.” For instance, in 2006 the Singularity Institute (what MIRI was called before it was renamed) had a website explicitly stating that they sought funding to create recursively self-improving AI. During the Jane Street debate Yudkowsky humorously describes the Singularity Institute as the “Institute for Carefully Programmed Intelligence Explosion.”

So this context is quite different than today.

I think that if I make a major mistake in the below, it’s probably that I missed some major statements from Hanson or Yudkowsky, rather than drastically mis-calling the items that I did include. Such a mistake could happen in part because I have tried to be conservative, and mostly included predictions which seem to have multiple affirmations in the text. But I definitely did skim-read parts of the debate that seemed irrelevant to predictions, such as the parts about the origin of life or about the morality of taking over the world with a superhuman AI. So I could very well have missed something.

Feel free to mention such missed predictions in the comments, although please quote and cite page numbers. Rereading this has confirmed my belief that the recollected mythology of positions advanced during this debate is… somewhat different than what people’s actual positions were.

Predictions—Relatively Easy To Call

In this section, I’m going to include predictions which appear to me relatively straightforward.

I don’t think many people who read the FOOM debate for the first time now would dispute them. Although they are nevertheless disputable if you try really hard, like everything.

I’ll phrase each prediction so that Yudkowsky takes the positive, and Hanson the negative.

“Cyc is not a Promising Approach to Machine Intelligence”

Cyc was (and is) an effort to build an artificial intelligence by building a vast database of logically-connected facts about the world by hand. So a belief like “Bob works as an engineer” is represented by relating the entity <Bob> to <engineer> with <works-as>, in this database. These facts would then be entered into an “inference engine,” which can reason about them in long chains of valid proofs. Right now, CycCorp claims Cyc has a knowledge base with 25 million axioms, 40,000 predicates, and so on. Its creator, Douglas Lenat, moved on to Cyc from Eurisko because he decided AI needed a large base of knowledge to work correctly.
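To make the general idea concrete, here is a minimal sketch of a hand-entered fact base plus a tiny inference loop. It is a toy under stated assumptions: the `forward_chain` function, the `subclass-of` relation, and the rule format are all hypothetical illustrations, and none of this is Cyc's actual representation language or API.

```python
# Toy illustration only: NOT Cyc's real data model or inference engine.
# Facts are (subject, relation, object) triples entered by hand.
facts = {
    ("Bob", "works-as", "engineer"),
    ("engineer", "subclass-of", "professional"),
    ("professional", "subclass-of", "worker"),
}

# Hypothetical rule format: (r1, r2, r3) means
# "if (x, r1, y) and (y, r2, z) are known, conclude (x, r3, z)".
rules = [
    ("works-as", "subclass-of", "works-as"),
    ("subclass-of", "subclass-of", "subclass-of"),
]


def forward_chain(facts, rules):
    """Apply every rule repeatedly until no new facts can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for r1, r2, r3 in rules:
            # Join pairs of known facts that share a middle term.
            new = {
                (x, r3, z)
                for (x, ra, y1) in known if ra == r1
                for (y2, rb, z) in known if rb == r2 and y1 == y2
            }
            if not new <= known:
                known |= new
                changed = True
    return known


derived = forward_chain(facts, rules)
for fact in sorted(derived - facts):
    print(fact)
```

Everything the loop "knows" had to be typed in as a triple or a rule, which is the bet the approach makes: that enough hand-entered content, plus valid chaining, adds up to understanding.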

Hanson thinks that this is a promising approach, stating:

The lesson Lenat took from Eurisko is that architecture is overrated; AIs learn slowly now mainly because they know so little. So we need to explicitly code knowledge by hand until we have enough to build systems effective at asking questions, reading, and learning for themselves. Prior AI researchers were too comfortable starting every project over from scratch; they needed to join to create larger integrated knowledge bases.… They had to start somewhere, and in my opinion they have now collected a knowledge base with a truly spectacular size, scope, and integration. Other architectures may well work better, but if knowing lots is anywhere near as important as Lenat thinks, I’d expect serious AI attempts to import Cyc’s knowledge, translating it into a new representation. (FOOM, 226)

On the other hand, Yudkowsky thinks Cyc has approximately zero chance of working well:

Knowledge isn’t being able to repeat back English statements. This is true even of humans. It’s a hundred times more true of AIs, even if you turn the words into tokens and put the tokens in tree structures… A basic exercise to perform with any supposed AI is to replace all the English names with random gensyms and see what the AI can still do, if anything. Deep Blue remains invariant under this exercise. Cyc, maybe, could count—it may have a genuine understanding of the word “four”—and could check certain uncomplicatedly structured axiom sets for logical consistency, although not, of course, anything on the order of say Peano arithmetic. The rest of Cyc is bogus. If it knows about anything, it only knows about certain relatively small and simple mathematical objects, certainly nothing about the real world. (FOOM, 228)

Yudkowsky seems obviously right from where we stand now; Cyc was not promising.

The most advanced modern AI systems have zero need to import knowledge from Cyc, and Cyc’s abilities pale beside those of modern LLMs. By 2011, Hanson concedes at least somewhat to Yudkowsky’s position and states that Cyc might not have enough information, or that its information might be in the wrong format (FOOM, 496).

Aside: What counts as being right

I think that Yudkowsky is obviously right here.

But if you wished, you could say that Hanson’s position that “Cyc is promising” has not been entirely falsified. CycCorp still appears to have customers. Their product functions. They advertise the conclusions of Cyc as auditable, in a way that the conclusions of DL are not, and this is true. Functionally, Cyc is surpassed by machine learning for basically everything—but you could say that in the future the approach might possibly turn things around. It’s a logically coherent thing to say.

Nevertheless, I’m comfortable saying that Cyc is the wrong approach, and that Yudkowsky clearly had the better predictions about this. As Yudkowsky said even in 2011, Cyc being promising has “been incrementally more and more falsified” each year (FOOM, 476), and each year since 2011 has been further incremental falsification.

My basic criterion for judgement is that, if you had believed Hanson’s view, you’d have been waaaaaaaaay more surprised by the future than if you had believed Yudkowsky’s view. This will be the approach I’m taking for all the other predictions as well.

“AI Comes Before Whole-Brain Emulations”

Intelligence-on-computers could come in at least two ways.

It could come through AI and machine-learning algorithms manually coded by humans, perhaps inspired by the human brain but ultimately only loosely connected to it. Or it could come from some kind of high-resolution scan of a human brain, which is then virtualized and run on a computer: a whole brain emulation (WBE or “em”).

Hanson literally wrote the book on ems (albeit after this debate) and thinks that ems are marginally more likely to occur before hand-coded AI (FOOM, 26).

Yudkowsky also had (as of the Hanson-Yudkowsky debate, not now) very broad intervals for the arrival of machine intelligence, which he summarizes as “I don’t know which decade and you don’t know either” (FOOM, 682). Nevertheless, he thinks AI is likely to occur before ems.

AI seems well on its way, and ems seem as distant as they did in 2008, so I’m comfortable saying that Yudkowsky’s position looks far more accurate right now.

Nevertheless, both Yudkowsky and Hanson explicitly call attention to the very broad distribution of their own timelines, so it is a small update towards Yudkowsky over Hanson.

“AI Won’t Be Able to Exchange Cognitive Content Easily”

A central part of the dispute between Yudkowsky and Hanson is how localized future growth rates will be.

They both think that an economy with machine intelligences in it—either em or AI—will grow very quickly compared to our current economy.

But Hanson sees a world where “these AIs, and their human owners, and the economy that surrounds them, undergo a collective FOOM of self-improvement. No local agent is capable of doing all this work, only the collective system” (FOOM, 276, Yudkowsky summarizing Hanson). Yudkowsky, on the other hand, sees a world where an individual AI undergoes a rapid spike in self-improvement relative to the world; where a brain in a box in a basement can grow quickly to come to out-think all of humanity.

One thing that could influence whether growth is more local or global is whether AIs can trade cognitive content. If trading such cognitive content with your neighbors is more advantageous (or if trading in general is advantageous), then growth will probably be more global; if trading is less advantageous, growth will probably be more local.

Yudkowsky thinks trading or simply exchanging cognitive content between AIs is quite unlikely. Part of this is because of the state of AI in 2008, where no one AI architecture has grown to dominate the others:

And I have to say that, looking over the diversity of architectures proposed at any AGI conference I’ve attended, it is very hard to imagine directly trading cognitive content between any two of them. It would be an immense amount of work just to set up a language in which they could communicate what they take to be facts about the world—never mind preprocessed cognitive content.

And a little earlier:

Trading cognitive content between diverse AIs is more difficult and less likely than it might sound. Consider the field of AI as it works today. Is there any standard database of cognitive content that you buy off the shelf and plug into your amazing new system, whether it be a chess player or a new data-mining algorithm? If it’s a chess-playing program, there are databases of stored games—but that’s not the same as having databases of preprocessed cognitive content. So far as I can tell, the diversity of cognitive architectures acts as a tremendous barrier to trading around cognitive content. (FOOM, 278)

But not all of this is simply projecting the present into the future. He further thinks that even if different AIs were to have the same architecture, trading cognitive content between them would be quite difficult:

If you have many AIs around that are all built on the same architecture by the same programmers, they might, with a fair amount of work, be able to pass around learned cognitive content. Even this is less trivial than it sounds. If two AIs both see an apple for the first time, and they both independently form concepts about that apple, and they both independently build some new cognitive content around those concepts, then their thoughts are effectively written in a different language. (FOOM, 278)

By default, he also expects more sophisticated, advanced AIs to have representations that are more opaque to each other. This effect he thinks will be so significant that pre-FOOM AIs might be incapable of doing it: “AI would have to get very sophisticated before it got over the “hump” of increased sophistication making sharing harder instead of easier. I’m not sure this is pre-takeoff sophistication we’re talking about, here” (FOOM, 280).

Again—in today’s world, sharing of cognitive content between diverse AIs doesn’t happen, even though there are lots of machine learning algorithms out there doing various jobs. You could say things would happen differently in the future, but it’d be up to you to make that case. (FOOM, 280)

Hanson, on the other hand, thinks that the current diverse state of AI architectures is simply an artifact of the early state of AI development. As AI research finds solutions that work, we should expect that architectures become more standardized. And as architectures become more standardized, sharing between AIs will become easier:

Almost every new technology comes at first in a dizzying variety of styles and then converges to what later seems the “obvious” configuration. It is actually quite an eye-opener to go back and see old might-have-beens, from steam-powered cars to pneumatic tube mail to memex to Engelbart’s computer tools. Techs that are only imagined, not implemented, take on the widest range of variations. When actual implementations appear, people slowly figure out what works better, while network and other scale effects lock in popular approaches… But of course “visionaries” take a wide range of incompatible approaches. Commercial software tries much harder to match standards and share sources. (FOOM, 339-340)

This makes him think that sharing between AIs is likely to occur relatively easily, because AI progress will make architectures more similar, which makes it easier to share cognitive content between AIs.

Hanson is the clear winner here. We don’t have AIs that are exchanging cognitive content themselves, because we don’t have AIs that are sufficiently agent-like to do this. But humans now exchange AI cognitive content all the time.

Per Hanson’s prediction, AI architectures have standardized around one thing—neural networks, and even around a single neural network architecture (Transformers) to a very great degree. The diversity Yudkowsky observed in architectures has shrunk enormously, comparatively speaking.

Moreover, granting neural networks, trading cognitive content has turned out to be not particularly hard. It does not require superintelligence to share representations between different neural networks; a language model can be adapted to handle visual data without enormous difficulty. Encodings from BERT or an ImageNet model can be applied to a variety of downstream tasks, and this is by now a standard element in toolkits and workflows. When you share architectures and training data, as for two differently fine-tuned diffusion models, you can get semantically meaningful merges between networks simply by taking the actual averages of their weights. Thoughts are not remotely “written in a different language.”
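The weight-averaging point can be sketched in a few lines. This is a toy under loud assumptions: the "models" are plain dictionaries of parameters for a linear predictor, and `merge_weights` and `predict` are hypothetical helpers, not any library's API. Real merges operate on full network checkpoints, and for deep networks they tend to work mainly when both models were fine-tuned from a shared starting point.

```python
# Toy illustration: two "fine-tuned" models that share an architecture,
# represented as dicts mapping parameter names to lists of floats.
model_a = {"weight": [1.0, 2.0], "bias": [0.0]}
model_b = {"weight": [3.0, 0.0], "bias": [1.0]}


def merge_weights(a, b, alpha=0.5):
    """Return the element-wise interpolation alpha*a + (1-alpha)*b."""
    if a.keys() != b.keys():
        # Different parameter names means different architectures:
        # there is nothing sensible to average.
        raise ValueError("models must share an architecture")
    merged = {}
    for name in a:
        if len(a[name]) != len(b[name]):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        merged[name] = [
            alpha * x + (1 - alpha) * y for x, y in zip(a[name], b[name])
        ]
    return merged


def predict(model, features):
    """Linear prediction: dot(weight, features) + bias."""
    return sum(w * f for w, f in zip(model["weight"], features)) + model["bias"][0]


merged = merge_weights(model_a, model_b)
print(merged)
print(predict(merged, [1.0, 1.0]))
```

For a linear model, averaging the weights is exactly the same as averaging the two models' outputs. That identity does not hold for deep networks in general, which is what makes it notable that such merges are semantically meaningful there at all. Note also that the merge is only defined when the parameter names and shapes line up, which is the standardization Hanson predicted.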

So generally, cognitive content looks to be relatively easy to swap between different systems. It remains easy to swap as systems get smarter, and workflows that involve such swapping are becoming increasingly common. Hanson’s view looks more accurate.

“Improvements in One AI Project Generally Won’t Improve Another Much”

This issue mirrors the one above.

Just as whether cognitive content can be easily shared between AIs is relevant for local vs. global takeoff, so is whether cognitive algorithms can be easily shared between AIs. That is, whether the improvements you make to one AI can be relatively easily transferred to another.

Yudkowsky states:

The same sort of barriers that apply to trading direct cognitive content would also apply to trading changes in cognitive source code.… It’s a whole lot easier to modify the source code in the interior of your own mind than to take that modification and sell it to a friend who happens to be written on different source code.… This is another localizing force. It means that the improvements you make to yourself, and the compound interest earned on those improvements, are likely to stay local. If the scenario with an AI takeoff is anything at all like the modern world in which all the attempted AGI projects have completely incommensurable architectures, then any self-improvements will definitely stay put, not spread.

Yudkowsky does relax his confidence about sharing cognitive algorithms by the time of the 2011 debate, noting that chess algorithms have benefitted from sharing techniques, but still maintains his overall position (FOOM, 663).

Similarly to the above, Hanson thinks that as progress occurs, improvements will begin to be shared.

Yudkowsky is again pretty clearly wrong here.

An actual improvement to, say, how Transformers work would help with speech recognition, language modelling, image recognition, image segmentation, and so on and so forth. Improvements to AI-relevant hardware are a trillion-dollar business. Work compounds so easily on other work that many alignment-concerned people want to conduct all AI research in secret.

Hanson’s position looks entirely correct.

“Algorithms are Much More Important Than Compute for AI Progress”

Different views about the nature of AI imply different things about how quickly AIs could FOOM.

If most of the space between the sub-human AIs of 2008 and potentially superhuman AIs of the future is algorithmic, then growth could be very fast and localized as AI discovers these algorithms. The “brain in a box in a basement” frequently mentioned in the Jane Street debate could discover algorithms that let it move from merely human to godlike intelligence overnight.

On the other hand, if a lot of the space between AIs of 2008 and superhuman AIs of the future is in size of compute needed—or if greater compute is at least a prerequisite for having superhuman AI—then growth is likely to be slower because AIs need to obtain new hardware or even build new hardware. A computer in a basement somewhere would need to purchase time in the cloud, hack GPUs, or purchase hardware to massively increase its intelligence, which could take more time and is at least more visible.

Yudkowsky uniformly insists that qualitative algorithmic differences are more important than compute, and moreover that great quantities of compute are not a prerequisite.

For instance, he says that “quantity [of minds] < (size, speed) [of minds] < quality [of minds]” (FOOM, 601). He expects “returns on algorithms to dominate” during an intelligence explosion (FOOM, 627). He consistently extends this belief into the past, noting that although human brains are four times bigger than chimpanzee brains, “this tells us very little because most of the differences between humans and chimps are almost certainly algorithmic” (FOOM, 613).

When he mentions that compute could contribute to AI progress, he always makes clear that algorithms will be more important:

Let us consider first the prospect of an advanced AI already running on so much computing power that it is hard to speed up. I find this scenario somewhat hard to analyze because I expect AI to be mostly about algorithms rather than lots of hardware, but I can’t rule out scenarios where the AI is developed by some large agency which was running its AI project on huge amounts of hardware from the beginning… Thus I cannot say that the overall scenario is implausible. (FOOM, 628, emphasis mine)

As another illustration of his belief that limited compute is in no way an obstacle to FOOM, he gives a “rough estimate” that you could probably run a mind about as smart as a human’s mind on a 2008 desktop, “or maybe even a desktop computer from 1996” (FOOM, 257).

(Image: a top-of-the-line desktop from 1996.)

But a desktop from 1996 isn’t even the lower limit. If a superintelligence were doing the design for a mind, he continues, “you could probably have [mind of] roughly human formidability on something substantially smaller” (FOOM, 257).

This view about the non-necessity of compute is thoroughly and deliberately integrated into Yudkowsky’s view, without particular prodding from Hanson—he has several asides in FOOM where he explains how Moravec or Kurzweil’s reasoning about needing human-equivalent compute for AI is entirely wrong (FOOM, 19, 256).

Hanson does not cover the topic of compute as much.

To the degree he does, he is extremely dubious that there is any small handful of algorithmic insights in intelligence-space that will grant intelligence; he also emphasizes hardware much more.

For instance, he approvingly states that “the usual lore among older artificial intelligence researchers is that new proposed architectural concepts are almost always some sort of rearranging of older architectural concepts.” He continues:

AI successes come when hardware costs fall enough to implement old methods more vigorously. Most recent big AI successes are due to better ability to integrate a diversity of small contributions. See how Watson won, or Peter Norvig on massive data beating elegant theories. New architecture deserves only small credit for recent success.… Future superintelligences will exist, but their vast and broad mental capacities will come mainly from vast mental content and computational resources. (FOOM, 497)

Yudkowsky seems quite wrong here, and Hanson right, about one of the central trends—and maybe the central trend—of the last dozen years of AI. Implementing old methods more vigorously is more or less exactly what got modern deep learning started; algorithms in absence of huge compute have achieved approximately nothing.

The Deep Learning revolution is generally dated from 2012’s AlexNet. The most important thing about AlexNet isn’t any particular algorithm; the most important thing is that the authors wrote their code with CUDA to run on GPUs, which let them make the neural network far bigger than it could otherwise have been while training in a mere week. Pretty much all subsequent progress in DL has hinged on the continuing explosion of compute resources since then. Someone who believed Yudkowsky would have been extremely surprised by 2012-2020, when compute spent on ML runs doubled every 6 months and when that doubling was nearly always key for the improved performance.
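To get a feel for what a 6-month doubling time implies over 2012-2020, here is a quick back-of-the-envelope calculation. The 6-month figure is the one stated above; the 2-year doubling time used for comparison is my own assumption, chosen as a rough stand-in for ordinary hardware improvement, and `growth_factor` is just a hypothetical helper name.

```python
def growth_factor(years, doubling_time_years):
    """Total multiplicative growth after `years` at a fixed doubling time."""
    return 2 ** (years / doubling_time_years)


years = 2020 - 2012  # the 8-year window discussed above

# Doubling every 6 months, as stated in the text: 16 doublings.
ml_growth = growth_factor(years, 0.5)

# ASSUMPTION for comparison only: a 2-year doubling time, i.e. 4 doublings.
baseline_growth = growth_factor(years, 2.0)

print(f"6-month doubling over {years} years: {ml_growth:,.0f}x")
print(f"2-year doubling over {years} years:  {baseline_growth:,.0f}x")
```

Sixteen doublings is a factor of 65,536, against a factor of 16 for the assumed baseline. On these numbers, the growth in compute spent on training runs is a far larger change than anything a fixed desktop-class machine could have absorbed.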

Algorithms do matter. I think finding the right algorithms and data, rather than getting enough compute, is probably the biggest current obstacle for extremely compute-rich organizations like OpenAI or Google. But it is nevertheless indisputable that algorithms have not had the primary importance Yudkowsky attributed to them, in the absence of vastly increased compute. Put it this way: there still exist comparatively compute-frugal AI startups like Keen Technologies, but even these still need to buy things like a DGX station, which would have been the most powerful supercomputer in the world by a wide margin if it had existed in 2008. So a comparatively compute-frugal program now is still compute-rich beyond anything Yudkowsky points to over the course of the debate.

(If you’re further interested in the topic you should of course read Gwern on the scaling hypothesis.)

Yudkowsky himself sometimes appears to have changed his mind at least somewhat: if he still thought that algorithms were the key to AGI, he wouldn’t have advocated for banning huge GPU clusters with international law, because that’s the kind of thing which would predictably focus more attention on improved algorithms, no?

On the other hand—he seems (?) to still think that if only AI researchers were smart enough, progress would not involve huge compute? From his discussion with Ngo:

[A] lot of the current interesting results have been from people spending huge compute (as wasn’t the case to nearly the same degree in 2008) and if things happen on short timelines it seems reasonable to guess that the future will look that much like the present. This is very much due to cognitive limitations of the researchers rather than a basic fact about computer science, but cognitive limitations are also facts and often stable ones.

“The last decade of progress has depended on compute because everyone is too stupid to program human-level AI on a 2008 computer” could be the most Yudkowskian possible response to the evidence of the past ten years.

But—regardless of Yudkowsky’s current position—it still remains that you’d have been extremely surprised by the last decade’s use of compute if you had believed him, and much less surprised if you had believed Hanson.

Predictions—Harder to Call

The above cases seem to me relatively clear.

The cases below, I think, are pretty sensitive to what kind of predictions you take Hanson and Yudkowsky to be making, and how favorably or unfavorably you read them. There are greater interpretive degrees of freedom.

Nevertheless I include this section, mostly because I’ve seen various claims that the evidence supports one person or another.

“Human Content is Unimportant Compared to the Right Architecture”

A topic that comes up over and over again in the debate, particularly in its later parts, is how important the accumulated “content” of all prior human civilization might be.

That is, consider all the explicit knowledge encoded in all the books humans have written. Consider also all the implicit knowledge encoded in human praxis and tradition: how to swing an axe to cut down a tree, how to run a large team of AI science researchers, how to navigate different desired levels of kitchen cleanliness among roommates, how to use an arc-welder, how to calm a crying baby, and so on forever. Consider also all the content encoded not even in anyone’s brains, but in the economic and social relationships without which society does not function.

How important is this kind of “content”?

It could be that this content, built up over the course of human civilization, is actually something AI would likely need. After all, humans take the first two decades or so of their life trying to absorb a big chunk of it. So it might be difficult for an AI to rederive all human scientific knowledge without this content.

Alternately, the vast edifice of prior human civilization and knowledge might fall before a more elegant AI architecture. The AI might find that it could easily recreate most of this knowledge without much difficulty, then quickly vault past it.

Hanson generally thinks that this content is extremely important.

...since a million years ago when humans probably had language, we are now a vastly more powerful species, because we used this ability to collect cultural content and built up a vast society that contains so much more. I think that if you took humans and made some better architectural innovations to them and put a pile of them off in the forest somewhere, we’re still going to outcompete them if they’re isolated from us because we just have this vaster base that we have built up since then. (FOOM, 449, emphasis mine)

And again, Hanson:

I see our main heritage from the past as all the innovations embodied in the design of biological cells/bodies, of human minds, and of the processes/habits of our hunting, farming, and industrial economies. These innovations are mostly steadily accumulating modular “content” within our architectures, produced via competitive processes and implicitly containing both beliefs and values.

Yudkowsky on the other hand, thinks that with the right architecture you can just skip over a lot of human content:

It seems to me at least that if we look at the present cognitive landscape, we’re getting really strong information that… humans can develop all sorts of content that lets them totally outcompete other animal species who have been doing things for millions of years longer than we have by virtue of architecture, and anyone who doesn’t have the architecture isn’t really in the running for it. (FOOM, 448)

Notably, Yudkowsky has also claimed, some years after the debate, that the evidence supports him in this domain.

In 2017 AlphaGo Zero was released, which was able to learn Go at a superhuman level without learning from any human games at all. Yudkowsky then explained how this was evidence for his position:

I emphasize how all the mighty human edifice of Go knowledge, the joseki and tactics developed over centuries of play, the experts teaching children from an early age, was entirely discarded by AlphaGo Zero with a subsequent performance improvement. These mighty edifices of human knowledge, as I understand the Hansonian thesis, are supposed to be the bulwark against rapid gains in AI capability across multiple domains at once. I was like “Human intelligence is crap and our accumulated skills are crap” and this appears to have been bourne out.

Yes, Go is a closed system allowing for self-play. It still took humans centuries to learn how to play it. Perhaps the new Hansonian bulwark against rapid capability gain can be that the environment has lots of empirical bits that are supposed to be very hard to learn, even in the limit of AI thoughts fast enough to blow past centuries of human-style learning in 3 days; and that humans have learned these vital bits over centuries of cultural accumulation of knowledge, even though we know that humans take centuries to do 3 days of AI learning when humans have all the empirical bits they need; and that AIs cannot absorb this knowledge very quickly using “architecture”, even though humans learn it from each other using architecture. If so, then let’s write down this new world-wrecking assumption (that is, the world ends if the assumption is false) and be on the lookout for further evidence that this assumption might perhaps be wrong.

Tl;dr: As others are already remarking, the situation with AlphaGo Zero looks nothing like the Hansonian hypothesis and a heck of a lot more like the Yudkowskian one.

So Yudkowsky says.

If we round off Hanson’s position to “content from humans is likely to matter a lot” and Yudkowsky’s to “human content is crap,” then I think that AlphaGo Zero is some level of evidence in support of Yudkowsky’s view. (Although Hanson responded by saying it was a very small piece of evidence, because his view always permitted narrow tools to make quick progress without content, and AlphaGo Zero is certainly a narrow tool.)

On the other hand, is it the only piece of evidence reality gives us on this matter? Is it the most important?

One additional piece of data is that some subsequent developments of more complex game-playing AI have not been able to discard human data. Neither DeepMind’s StarCraft II agent nor OpenAI’s Dota 2 agent, both of which came after the Go-playing AIs, was able to train without being jumpstarted by human data. StarCraft II and Dota 2 are far more like the world than Go: they involve partial information, randomness, and much more complex ontologies. So this might be an iota of evidence for something like a Hansonian view.

But far more importantly, and even further in the same direction: non-narrow tools like GPT-4 are generally trained by dumping a significant fraction of all written human content into them. Training them well currently relies in part on mildly druidical knowledge about the right percentages of the different parts of human content to dump into them. Should you have 5% code or 15% code? Multilingual or not? More arXiv or more Stack Overflow? There is reasonable speculation that we will run out of sufficient high-quality human content to feed these systems. The recent PaLM 2 paper has 18 authors for the data section, more than it has for the architecture section! (Although both have fewer than the infrastructure section gets, of course; how to employ compute still remains big.) So content is hugely important for LLMs.

Given that GPT-4 and similar programs look to be by far the most generally intelligent AI entities yet made that operate in the real world rather than a game world, it’s hard for me to see this as anything other than some evidence that content in Hanson’s sense might matter a lot. If LLMs matter more for future general intelligence than AlphaGoZero (which is a genuinely uncertain “if” for me), then Hanson probably gets some fractional number of Bayes points over Yudkowsky. If not, maybe the reverse?

I don’t think the predictions are remotely clear enough for either person to claim reality as on their side.
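To make “some fractional number of Bayes points” concrete, here is a minimal sketch of how I would score it if both parties had written down probabilities. It uses the logarithmic scoring rule: you get log2 of the probability you put on what actually happened. The function names and the two probabilities below are hypothetical; neither Hanson nor Yudkowsky gave numbers, which is rather the point.

```python
import math


def bayes_points(p_assigned: float) -> float:
    """Log2 score: bits of credit for the probability placed on the actual outcome.

    Always <= 0; closer to 0 is better.
    """
    if not 0.0 < p_assigned <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return math.log2(p_assigned)


def relative_points(p_a: float, p_b: float) -> float:
    """How many bits forecaster A gains over forecaster B on one resolved claim."""
    return bayes_points(p_a) - bayes_points(p_b)


# Hypothetical numbers, purely for illustration (not from the FOOM debate):
# suppose "human content matters a lot for general AI" resolves true,
# and the two forecasters had put these probabilities on it.
hanson_p = 0.6
yudkowsky_p = 0.4

edge = relative_points(hanson_p, yudkowsky_p)
print(f"Hanson gains {edge:.2f} bits over Yudkowsky on this claim")
```

With these made-up inputs the gap is log2(0.6 / 0.4), about 0.58 bits: a fraction of a Bayes point. Without stated probabilities, any such number is a guess about what each person would have said.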

“Simple AI architectures will generalize very well” (Claim probably not made)

Different AI architectures can be more simple or more complex.

AlphaGo, which combines Monte-Carlo Tree Search, a policy network and a value network, is probably more architecturally complex than GPT-3, which is mostly a single giant transformer. Something like DreamerV3 is probably more complex than either, although you very quickly get into discussion of “what counts as complexity?” But there is in any event a spectrum of architectural complexity out there—a system of one giant neural network trained end-to-end is relatively less complex, and a system of multiple neural networks trained with different objective functions is relatively more complex.

Yudkowsky has claimed (since the FOOM debate) that he predicted (in the FOOM debate) something akin to “simple architectures will generalize very well over broad domains.” Thus, during his discussion with Ngo last year:

But you can also see powerful practical hints that these things [intelligence and agency] are much more correlated than, eg, Robin Hanson was imagining during the FOOM debate, because Robin did not think something like GPT-3 should exist; Robin thought you should need to train lots of specific domains that didn’t generalize. I argued then with Robin that it was something of a hint that humans had visual cortex and cerebellar cortex but not Car Design Cortex, in order to design cars. Then in real life, it proved that reality was far to the Eliezer side of Eliezer on the Eliezer-Robin axis, and things like GPT-3 were built with less architectural complexity and generalized more than I was arguing to Robin that complex architectures should generalize over domains.

In general, I think right now it does look like you can get a pretty architecturally simple network doing a lot of cool cross-domain things. So if Yudkowsky had predicted it and Hanson had denied it, it would be some level of evidence for Yudkowsky’s view over Hanson’s.

The problem is that Yudkowsky mostly… just doesn’t seem to predict this unambiguously? I have ctrl-f’d for “car,” “automobile,” and “cortex” through the PDF, and have just not found that particular claim.

He does make some similar claims. For instance, Yudkowsky does claim that human level AI will be universally cross-domain.

In other words, trying to get humanlike performance in just one domain is divorcing a final product of that economy from all the work that stands behind it. It’s like having a global economy that can only manufacture toasters, but not dishwashers or light bulbs. You can have something like Deep Blue that beats humans at chess in an inhuman, specialized way; but I don’t think it would be easy to get humanish performance at, say, biology R&D, without a whole mind and architecture standing behind it that would also be able to accomplish other things. Tasks that draw on our cross-domain-ness, or our long-range real-world strategizing, or our ability to formulate new hypotheses, or our ability to use very high-level abstractions—I don’t think that you would be able to replace a human in just that one job, without also having something that would be able to learn many different jobs.

Unfortunately, this is a claim that an architecture will have breadth, but not a claim about the simplicity of the architecture. It is also—granting that we don’t have AIs that can do long-range planning—one for which we haven’t received good information.

Here’s a claim Yudkowsky and Hanson disagree about that could be interpreted as “simple architectures will generalize far”—Yudkowsky says that only a few insights separate AI from being human-level.

On one hand, you’d think that saying a “few insights” separate AI from human-level-ness sort-of implies that the AI would have a simple architecture. But on the other hand, you could truthfully say only a few insights let you steer rockets around, fundamentally… but rockets nevertheless have pretty complex architectures. I’m not sure that the notion of “few insights” really corresponds to “simple architecture.” In the dialog, it more seems to correspond to… FOOM-ability, to the idea that you can find an insight while thinking in a basement that lets your thinking improve 2x, which is indifferent to the simplicity of the architecture the insight lets you find.

Let me return to what Yudkowsky and Hanson actually say, to show why.

Yudkowsky claims that a small handful of insights will likely propel an ML model from infrahumanity to superhumanity. He characterises the number as “about ten” (FOOM, 445) but also says it might be just one or two important ones (FOOM, 450). He affirms that “intelligence is about architecture” and that “architecture is mostly about deep insights” (FOOM, 406, emphasis his) and thus that the people who make an AI FOOM will have done so because of new deep insights (FOOM, 436).

Hanson, by contrast, thinks “powerful architectural insights are quite rare” (FOOM, 496). He believes that “most tools require lots more than a few key insights to be effective—they also require thousands of small insights that usually accumulate from a large community of tool builders and users” (FOOM, 10). He does think that there are some large insights in general, but insights “are probably distributed something like a power law, with many small-scope insights and a few large-scope” (FOOM, 144).

We shouldn’t underrate the power of insight, but we shouldn’t overrate it either; some systems can just be a mass of details, and to master such systems you must master those details. And if you pin your hopes for AI progress on powerful future insights, you have to ask how often such insights occur, and how many we would need. The track record so far doesn’t look especially encouraging. (FOOM, 351)

So Hanson in general thinks AI will look like most technology—see the progress of planes, cars, guns, and so on—in that progress comes from hundreds of tiny refinements and improvements. There’s no moment in the history of planes where they suddenly become useful—there are 100s of small and big improvements all gradually moving planes from “mostly useless, with rare exceptions” to “incredibly useful.”
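Hanson’s power-law remark can be made slightly more concrete. A minimal sketch, assuming insight “sizes” follow a Pareto distribution with tail exponent alpha > 1: the exponent values are my assumption (Hanson gives none), and `top_share` is a hypothetical helper. For such a distribution, the largest fraction p of draws holds a share p^(1 - 1/alpha) of the total, which is a population-level result and not a statement about any finite sample.

```python
import math


def top_share(top_fraction: float, alpha: float) -> float:
    """Share of total value held by the largest `top_fraction` of draws
    from a Pareto distribution with tail exponent `alpha`.

    Uses the closed form p ** (1 - 1/alpha), valid only for alpha > 1
    (for alpha <= 1 the mean is infinite and the share is undefined).
    """
    if not 0.0 < top_fraction <= 1.0:
        raise ValueError("top_fraction must be in (0, 1]")
    if alpha <= 1.0:
        raise ValueError("alpha must be > 1 for a finite mean")
    return top_fraction ** (1.0 - 1.0 / alpha)


# Assumed exponents, for illustration only.
for alpha in (1.1, 1.5, 3.0):
    share = top_share(0.01, alpha)
    print(f"alpha={alpha}: top 1% of insights carry {share:.0%} of the total")
```

Under these assumptions the top 1% of insights carry roughly 66%, 22%, and 5% of the total for alpha = 1.1, 1.5, and 3.0. So “a power law” alone does not settle the disagreement: a very heavy tail looks more like Yudkowsky’s handful of deep insights, while a lighter one looks more like Hanson’s mass of small ones.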

Yudkowsky, on the other hand, thinks that AI will look more like a handful of “eureka!” moments, followed up by some coding and subsequent world-transformation. This is witnessed, of course, by MIRI’s (then the Singularity Institute’s) plan to build a seed AGI entirely on their own.

If we take this as the disagreement (will AI progress come from a handful of big insights, or many small ones?), I think the world right now looks a great deal more like Hanson’s view than Yudkowsky’s. In his interview with Lex Fridman, Sam Altman characterizes GPT-4 as improving on GPT-3 in a hundred little things rather than a few big things, and that’s… by far… my impression of current ML progress. So when I interpret their disagreement in terms of the kind of work you need to do before attaining AGI, I tend to agree that Hanson is right.

On the other hand, we could return to saying that “few insights” implies “simple architecture.” I don’t think this is… exactly… implied by the text? I’ll admit that the vibes are for sure more on Yudkowsky’s side. So if we interpret the text that way, then I’d tend to agree that Yudkowsky is right.

Either way, though, I don’t think Yudkowsky and Hanson were really clear about what was going on and about what kind of anticipations they were making.

Misc

I was going to have a whole section of things that didn’t quite make the cut vis-a-vis predictions, but were super suggestive, but that could be seen as trying to influence the results on my part. So I’m just going to bail instead, mostly.

Conclusion

Who was more right?

When I look at the above claims, Hanson’s record looks a little better than Yudkowsky’s, albeit with a small sample size. If you weight the Cyc prediction a ton, maybe you could get them to parity. I think it would be weird not to see the compute prediction as a little more important than the Cyc prediction, though.

Note that Hanson currently thinks the chances of AI doom are < 1%, while Yudkowsky thinks that they are > 99%. (Hanson thinks the chances of doom are… maybe somewhat lower than Yudkowsky does, but they seem to have different ontologies of what qualifies as “doom,” as the comments point out.)

What Actual Lessons Can We Learn, Other Than Some Stuff About Deferral to Authority That Everyone Will Ignore Because We Like to Pretend We Do Not Defer to Authority, Even Though We All Fucking Do?

I was mildly surprised by how well some economic abstractions hold up.

The big part of the meta-debate in FOOM—which they return to over and over again—is whether you should try to use mostly only mental tools whose results have proven useful in the past.

Hanson’s view is that if you use rules which you think retrodict data well but which haven’t been vetted by actual predictions, you are almost certain to make mistakes because humans psychologically cannot distinguish actual retrodictions from post-hoc fitting. To avoid this post-hoc fitting, you should only use tools which have proven useful for actual predictions. Thus, he prefers to use economic abstractions which have been thus vetted over novel abstractions invented for the purpose.

I think this holds up pretty well. Yudkowsky makes predictions about future use of compute in AI, based on his attempted retrodictions about human evolution, human skull size, and so on. These predictions mostly failed. On the other hand, Hanson makes some predictions about AI converging to more similar systems, about advances in these systems mutually improving competing systems, and so on, based only on economic theory. These predictions succeeded.

Overall, I think “don’t lean heavily on abstractions you haven’t yet gotten actual good predictions from” comes out pretty well from the debate, and I continue to heavily endorse research evaluation proposals related to it.