Reply to Eliezer on Biological Anchors

The “biological anchors” method for forecasting transformative AI is the biggest non-trust-based input into my thinking about likely timelines for transformative AI. While I’m sympathetic to parts of Eliezer Yudkowsky’s recent post on it, I overall disagree with the post, and I think readers of that post can easily come away with a misimpression of the “biological anchors” report (which I’ll abbreviate as “Bio Anchors”), and of Open Philanthropy’s take on it.

This post has three sections:

  • Most of Eliezer’s critique seems directed at assumptions the report explicitly does not make about how transformative AI will be developed, and more broadly, about the connection between the report’s compute estimates and all-things-considered AI timelines. One way of putting this is that most of Eliezer’s critique doesn’t apply to the “bounding-based” interpretation of the report discussed in this post (which is my best explanation for skeptics of why I find the framework valuable; I will also give quotes below from the original report showing that its intended interpretation is along the same lines as mine).

  • Much of Eliezer’s critique is some form of “Look at the reference class you’re in,” invoking “Platt’s Law” and comparing the report to past attempts at biological anchoring. Based on my understanding of the forecasts he’s comparing it to and the salient alternatives, I don’t think this does much to undermine the report.

  • I also make a few minor points.

A few notes before I continue:

  • I think the comments on the post are generally excellent and interesting, and I recommend them. (I will mostly not be repeating things from the comments here.)

  • I generally view Bio Anchors as a tool for informing AI timelines rather than as a comprehensive generator of all-things-considered AI timelines, and will be discussing it as such. Bio Anchors also presents itself this way; see its section “Translating into views on TAI timelines.”

  • Something like half of this post is blockquotes. I’ve often been surprised by the degree to which people (including people I respect a lot, such as Eliezer in this case) seem to mischaracterize specific pieces they critique, and I try to avoid doing this myself by quoting extensively from a piece when critiquing it. (This still leaves the possibility that I’m quoting out of context; readers may want to spot-check that.)

  • This post doesn’t address what some have referred to as the “meta-level core thing”, though I might write some thoughts related to that in a future post.

Bounding vs. pinpointing

Here are a number of quotes from Eliezer in which I think he gives the impression that Bio Anchors assumes transformative AI will be developed via modern machine learning methods:

OpenPhil: Because AGI isn’t like biology, and in particular, will be trained using gradient descent instead of evolutionary search, which is cheaper. We do note inside our report that this is a key assumption, and that, if it fails, the estimate might be correspondingly wrong - …

OpenPhil: Well, search by evolutionary biology is more costly than training by gradient descent, so in hindsight, it was an overestimate. Are you claiming this was predictable in foresight instead of hindsight?

Eliezer: I’m claiming that, at the time, I snorted and tossed Somebody’s figure out the window while thinking it was ridiculously huge and absurd, yes.

OpenPhil: Because you’d already foreseen in 2006 that gradient descent would be the method of choice for training future AIs, rather than genetic algorithms?

Eliezer: Ha! No. Because it was an insanely costly hypothetical approach whose main point of appeal, to the sort of person who believed in it, was that it didn’t require having any idea whatsoever of what you were doing or how to design a mind.

OpenPhil: Suppose one were to reply: “Somebody” didn’t know better-than-evolutionary methods for designing a mind, just as we currently don’t know better methods than gradient descent for designing a mind; and hence Somebody’s estimate was the best estimate at the time, just as ours is the best estimate now? …

OpenPhil: It seems to us that Moravec’s estimate, and the guess of your nineteen-year-old past self, are both predictably vast underestimates. Estimating the computation consumed by one brain, and calling that your AGI target date, is obviously predictably a vast underestimate because it neglects the computation required for training a brainlike system. It may be a bit uncharitable, but we suggest that Moravec and your nineteen-year-old self may both have been motivatedly credulous, to not notice a gap so very obvious.

Eliezer: I could imagine it seeming that way if you’d grown up never learning about any AI techniques except deep learning, which had, in your wordless mental world, always been the way things were, and would always be that way forever.

I mean, it could be that deep learning will still be the bleeding-edge method of Artificial Intelligence right up until the end of the world. But if so, it’ll be because Vinge was right and the world ended before 2030, not because the deep learning paradigm was as good as any AI paradigm can ever get. That is simply not a kind of thing that I expect Reality to say “Gotcha” to me about, any more than I expect to be told that the human brain, whose neurons and synapses are 500,000 times further away from the thermodynamic efficiency wall than ATP synthase, is the most efficient possible consumer of computations …

OpenPhil: How could anybody possibly miss anything so obvious? There’s so many basic technical ideas and even philosophical ideas about how you do AI which make it supremely obvious that the best and only way to turn computation into intelligence is to have deep nets, lots of parameters, and enormous separate training phases on TPU pods …

OpenPhil: How quaint and archaic! But that was 13 years ago, before time actually got started and history actually started happening in real life. Now we’ve got the paradigm which will actually be used to create AGI, in all probability; so estimation methods centered on that paradigm should be valid.

However, the argument given in Bio Anchors does not hinge on an assumption that modern deep learning is what will be used, nor does it set aside the possibility of paradigm changes.

From the section “What if TAI is developed through a different path?”:

I believe that this analysis can provide a useful median estimate even if TAI is produced through a very different path: essentially, by the time it is affordable to develop TAI through a particular highlighted route, it is plausible that somebody develops it through that route or any cheaper route. I consider the example of a distributed economic transition facilitated by a broad range of different technologies below, but the same reasoning applies to the possibility that a unified transformative program may be developed using a qualitatively different “AI paradigm” that can’t be usefully considered a descendant of modern machine learning …

Because this model estimates when one particular path toward transformative AI (let’s call it the “big model path”) out of many will be attainable, that means if this analysis is correct (i.e., if I am correct to assume the big model path is possible at all due to the theoretical feasibility of local search, and if we correctly estimated the probability that it would be attainable in year Y for all Y), then the probability estimates generated should be underestimates

However, once sources of distortion (many of which tend to push our estimates upward) are properly taken into account, I think it is fairly unclear whether these estimates should actually be considered underestimates [one such source given is similar to my comments here following “When it comes to translating my ‘sense of mild surprise’ into a probability”]

For each biological anchor hypothesis, I am acting on the assumption that there is a relatively broad space of “unknown unknown” paths to solving a transformative task within that range of technical difficulty, not just the particular concrete path I have written down for illustration in association with each hypothesis (which is often fairly conjunctive) …

some of our technical advisors are still relatively confident these probability estimates are low-end estimates. This is partly because they would assign a higher probability to some of the low-end biological anchor hypotheses than I do, partly because they are overall more confident in the argument given above that these numbers ought to be considered underestimates …

For now, I feel that the most reasonable way to interpret the probability estimates generated by the biological anchors framework is as a rough central estimate for when TAI will be developed rather than as particularly conservative or particularly aggressive. In making this judgment, I am admittedly mentally running together a large cloud of heterogeneous considerations which in a maximally-principled and transparent analysis should be handled separately.

That is, Ajeya (the author) sees the “median” estimate as structurally likely to be overly conservative (a soft upper bound) for reasons including those Eliezer gives, but is also adjusting in the opposite direction to account for factors including the generic burden of proof. (More discussion of “soft bounds” provided by Bio Anchors in this section and this section of the report.)
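To spell out the bounding logic from these quotes in my own rough, informal notation (this is my rendering, not notation from the report): if the “big model path” would in fact yield transformative AI once it is feasible and affordable, and if someone would pursue it (or a cheaper route) once it is affordable, then the framework’s probability-of-affordability curve is a lower bound on the probability of TAI by a given year. That is the sense in which the framework’s “median” functions as a soft upper bound on the median TAI date.

$$
P(\text{TAI by year } Y) \;\ge\; P(\text{big-model path feasible}) \cdot P(\text{path affordable by } Y \mid \text{feasible})
$$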

I made similar arguments in a recent piece, “Biological anchors” is about bounding, not pinpointing, AI timelines. This is my best explanation for skeptics of why I find the framework valuable.

As far as I can tell, the only part of Eliezer’s piece that addresses an argument along the lines of the “soft bounding” idea is:

OpenPhil: Doesn’t our calculation at least provide a soft upper bound on how much computation is required to produce human-level intelligence? If a calculation is able to produce an upper bound on a variable, how can it be uninformative about that variable?

Eliezer: You assume that the architecture you’re describing can, in fact, work at all to produce human intelligence. This itself strikes me as not only tentative but probably false. I mostly suspect that if you take the exact GPT architecture, scale it up to what you calculate as human-sized, and start training it using current gradient descent techniques… what mostly happens is that it saturates and asymptotes its loss function at not very far beyond the GPT-3 level—say, it behaves like GPT-4 would, but not much better.

This is what should have been told to Moravec: “Sorry, even if your biology is correct, the assumption that future people can put in X amount of compute and get out Y result is not something you really know.” And that point did in fact just completely trash his ability to predict and time the future.

The same must be said to you. Your model contains supposedly known parameters, “how much computation an AGI must eat per second, and how many parameters must be in the trainable model for that, and how many examples are needed to train those parameters”. Relative to whatever method is actually first used to produce AGI, I expect your estimates to be wildly inapplicable, as wrong as Moravec was about thinking in terms of just using one supercomputer powerful enough to be a brain. Your parameter estimates may not be about properties that the first successful AGI design even has. Why, what if it contains a significant component that isn’t a neural network? I realize this may be scarcely conceivable to somebody from the present generation, but the world was not always as it is now, and it will change if it does not end.

I don’t literally think that the “exact GPT architecture” would work to produce transformative AI, but I think something not too far off would be a strong contender—such that having enough compute to afford this extremely brute-force method, combined with decades more time to produce new innovations and environments, does provide something of a “soft upper bound” on transformative AI timelines.

Another way of putting this is that a slightly modified version of what Eliezer calls “tentative [and] probably false” seems to me to be “tentative and probably true.” There’s room for disagreement about this, but this is not where most of Eliezer’s piece focused.

While I can’t be confident, I also suspect that the person (“Somebody”) in the part of Eliezer’s piece set around 2006 may have intended to argue for something more like a “(soft) upper bound” than a median estimate.

Finally, I want to point out this quote from Bio Anchors, which reinforces that it is intended as a tool for informing AI timelines rather than as a comprehensive generator of all-things-considered AI timelines:

This model is not directly estimating the probability of transformative AI, but rather the probability that the amount of computation that would be required to train a transformative model using contemporary ML methods would be attainable for some AI project, assuming that algorithmic progress, spending, and compute prices progress along a “business-as-usual” trajectory …

How does the probability distribution output by this model relate to TAI timelines? In the very short-term (e.g. 2025), I’d expect this model to overestimate the probability of TAI because it feels especially likely that other elements such as datasets or robustness testing or regulatory compliance will be a bottleneck even if the raw compute is technically affordable, given that a few years is not a lot of time to build up key infrastructure. In the long-term (e.g. 2075), I’d expect it to underestimate the probability of TAI, because it feels especially likely that we would have found an entirely different path to TAI by then.
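To make the structure described in this quote concrete, here is a minimal sketch (my own illustration, not code or parameters from the report; every number below is a placeholder) of the comparison the model performs: required training compute falling over time with algorithmic progress, versus affordable compute rising with growing spending and falling compute prices.

```python
# Illustrative sketch of the "attainability" comparison described in the quote above.
# All numbers are placeholders, not estimates from the Bio Anchors report.

def required_compute(year, base_flop=1e34, halving_time=3.0):
    """FLOP needed to train a transformative model, falling as algorithms improve."""
    return base_flop * 0.5 ** ((year - 2020) / halving_time)

def affordable_compute(year, budget_2020=1e8, budget_growth=1.1,
                       flop_per_dollar_2020=1e17, price_halving_time=2.5):
    """FLOP the largest training project could buy, rising with spending and cheaper hardware."""
    budget = budget_2020 * budget_growth ** (year - 2020)
    flop_per_dollar = flop_per_dollar_2020 * 2 ** ((year - 2020) / price_halving_time)
    return budget * flop_per_dollar

# First year (under these placeholder assumptions) in which the compute for the
# "big model path" becomes attainable for some project.
first_attainable_year = next(
    year for year in range(2020, 2101)
    if affordable_compute(year) >= required_compute(year)
)
print(first_attainable_year)
```

As I understand it, the report itself layers probability distributions over inputs like these (and over how much compute is required in the first place) rather than using single point estimates; the sketch above just shows the shape of the calculation.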

It seems that Eliezer assigns a higher probability than Bio Anchors does to an “entirely different path” arriving relatively soon, but he does not seem to argue for this (and see below for why I don’t think it would be a great bet). Instead, he largely argues that Bio Anchors ignores the possibility, which is not the case.

Platt’s Law and past forecasts

Eliezer writes:

Eliezer: So does the report by any chance say—with however many caveats and however elaborate the probabilistic methods and alternative analyses—that AGI is probably due in about 30 years from now?

OpenPhil: Yes, in fact, our 2020 report’s median estimate is 2050; though, again, with very wide credible intervals around both sides. Is that number significant?

Eliezer: It’s a law generalized by Charles Platt, that any AI forecast will put strong AI thirty years out from when the forecast is made. Vernor Vinge referenced it in the body of his famous 1993 NASA speech, whose abstract begins, “Within thirty years, we will have the technological means to create superhuman intelligence. Shortly after, the human era will be ended.” …

OpenPhil: That part about Charles Platt’s generalization is interesting, but just because we unwittingly chose literally exactly the median that Platt predicted people would always choose in consistent error, that doesn’t justify dismissing our work, right? …

Eliezer: Oh, nice. I was wondering what sort of tunable underdetermined parameters enabled your model to nail the psychologically overdetermined final figure of ’30 years’ so exactly.

I have a couple of issues here.

First, I think Eliezer exaggerates the precision of Platt’s Law and its match to the Bio Anchors projection:

  • Some aggregated data for assessing Platt’s Law is in this comment by Matthew Barnett as well as here.

  • While Matthew says “Overall I find the law to be pretty much empirically validated, at least by the standards I’d expect from a half in jest Law of Prediction,” I don’t agree: I don’t think an actual trendline fit to the chart would be particularly close to the Platt’s Law line. I think it would, instead, predict that Bio Anchors should point to timelines longer than 30 years out. (I sketch the kind of trendline check I have in mind just after this list.)

  • Note that my own median projection for transformative AI is 40 years, not 30, and I know several people who have much shorter medians (15 years and under) based on their own interpretations of the analysis in the report. So I don’t think it’s the case that Bio Anchors “automatically” lands one on a particular view, nor that it obviously pushes against timelines as short as Eliezer’s. It is a tool for informing AI timelines, and after taking it and other data points into account, Ajeya and I both are estimating longer timelines than Eliezer.
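For concreteness, the trendline check I have in mind looks like the following sketch: fit a simple line of predicted-years-until-AGI against the year each forecast was made, and compare it to the flat 30-year line that Platt’s Law implies. The data points below are made-up placeholders for illustration, not Matthew’s actual dataset.

```python
# Sketch of the Platt's Law trendline check. The (forecast_year, predicted_years_out)
# pairs are hypothetical placeholders, not the actual aggregated forecast data.
import numpy as np

forecasts = [(1960, 20), (1972, 25), (1985, 15), (1993, 30),
             (2005, 40), (2012, 35), (2016, 45), (2020, 30)]
years = np.array([year for year, _ in forecasts], dtype=float)
horizons = np.array([horizon for _, horizon in forecasts], dtype=float)

# Least-squares fit: predicted horizon as a linear function of forecast year.
slope, intercept = np.polyfit(years, horizons, 1)
print(f"Fitted horizon for a forecast made in 2020: {slope * 2020 + intercept:.0f} years")
print("Platt's Law would instead predict a flat 30 years, regardless of forecast year")
```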

I think a softer “It’s suspicious that Bio Anchors is in the same ‘reasonable-sounding’ general range (‘a few decades’) that AI forecasts have been in for a long time” comment would’ve been more reasonable than what Eliezer wrote, so from here I’ll address that. First, I want to comment on Moravec specifically.

Eliezer characterizes Open Philanthropy as though we think Hans Moravec’s projection was foreseeably silly and overaggressive (see the quote above), while thinking that we now have the right approach. This isn’t the case.

  • On one hand, I do think that if Ajeya or I had been talking with Moravec in 1990, our median timeline estimate would have been somewhat further out than his. This isn’t because I think we would’ve been making estimates similar to today’s (there wasn’t enough information at the time for that to make much sense), or because I think we would’ve rejected the framework as irrelevant without today’s information. It’s simply because we each (myself more than her) have an inclination to apply a fair amount of adjustment in a conservative direction, for generic “burden of proof” reasons, rather than go with the timelines that seem most reasonable based on the report in a vacuum.

  • But more importantly, even if we set the above point aside, I simply don’t think it’s a mark against Bio Anchors to be in the same reference class as Moravec, and I think his prediction was (according to my views, and more so according to Eliezer’s apparent views) impressively good when judged by a reasonable standard and compared to reasonable alternatives.

To expand on what I mean by a reasonable standard and reasonable alternatives:

  • Bio Anchors is, first and foremost, meant as a tool for updating one’s timelines from the place they would naively be after considering broader conventional wisdom and perhaps semi-informative priors. By “broader conventional wisdom” I mean not surveys of experts or the conventional wisdom of futurist circles (both of which are often dismissed outside those circles), but what I perceive as most people’s “This is nowhere close to happening, ignore it” intuition.

  • According to my current views (median expectation of transformative AI around 2060), Moravec’s 1988 prediction of 2010-2020 looks much better than these alternatives, and even looks impressive. Specifically, it looks impressive by the standards of: “multi-decade forecasting of technologies for which no roadmap exists, with capabilities far exceeding those of anything that exists today.” (The more strongly one expects forecasts in this class to be difficult, the more one should be impressed here, in my view.)

  • Eliezer pretty clearly expects shorter timelines than I do, so according to his views, I think Moravec’s prediction looks more impressive still (by the standards and alternatives I’m using here). It is implied in the dialogue that Eliezer’s median would be somewhere between 2025 and 2040; if you assume this will turn out to be right, that would make a 1988 prediction of “2010-2020” look extremely good, in my view. (Good enough that, to the extent there’s doubt about whether the underlying reasoning is valid or noise, this should be a noticeable update toward the former.)

  • I suspect Eliezer has a different picture of the salient context and alternatives here. I suspect that he’s mostly operating in a context where it’s near-universal to expect transformative AI at least as early as I do; that he has non-biological-anchor-inspired views that point to much shorter timelines; and that a lot of his piece is a reaction to “Humbali” types (whom he notes are distinct from Open Philanthropy) asking him to update away from his detailed short-timelines views.

  • I’m sympathetic to that, in the sense that I think Bio Anchors is not very useful for the latter purpose. In particular, perhaps it’s helpful for me to say here that if you think timelines are short for reasons unrelated to biological anchors, I don’t think Bio Anchors provides an affirmative argument that you should change your mind. (I do think it is a useful report for deconstructing—or at least clarifying—several specific, biologically inspired short-timelines arguments that have been floating around, none of which I would guess Eliezer has any interest in.) Most of the case I’d make against shorter timelines would come down to a lack of strong affirmative arguments plus a nontrivial burden of proof.

Returning to the softened version of Platt’s Law: according to my current views on timelines (and more so according to Eliezer’s), “a few decades” has been a good range for a prediction to be in for the last few decades (again, keeping in mind what context and alternatives I am using). I think this considerably softens the force of an objection like: “You’re forecasting a few decades, as many others have over the last few decades; this in itself undermines your case.”

None of the above points constitute arguments for the correctness of Bio Anchors. My point is that “Your prediction is like these other predictions” (the thrust of much of Eliezer’s piece) doesn’t seem to undermine the argument, partly because the other predictions look broadly good according to both my and Eliezer’s current views.

A few other reactions to specific parts

Eliezer: … The software for a human brain is not going to be 100% efficient compared to the theoretical maximum, nor 10% efficient, nor 1% efficient, even before taking into account the whole thing with parallelism vs. serialism, precision vs. imprecision, or similarly clear low-level differences …

Eliezer: The makers of AGI aren’t going to be doing 10,000,000,000,000 rounds of gradient descent, on entire brain-sized 300,000,000,000,000-parameter models, algorithmically faster than today. They’re going to get to AGI via some route that you don’t know how to take, at least if it happens in 2040. If it happens in 2025, it may be via a route that some modern researchers do know how to take, but in this case, of course, your model was also wrong.

On one hand, I think it’s a distinct possibility that we’re going to see dramatically new approaches to AI development by the time transformative AI is developed.

On the other hand, I think quotes like these overstate the likelihood of that happening in the short-to-medium term.

  • Deep learning has been the dominant source of AI breakthroughs for nearly the last decade, and the broader “neural networks” paradigm—while it has come in and out of fashion—has broadly been one of the most-attended-to “contenders” throughout the history of AI research.

  • AI research prior to 2012 may have had more frequent “paradigm shifts,” but this is probably related to the fact that it was seeing less progress.

  • With these two points in mind, it seems off to me to confidently expect a new paradigm to be dominant by 2040 (even conditional on AGI being developed by then), as the second quote above implies. As for the first quote, I think the implication there is less clear, but I read it as expecting AGI to involve software well over 100x as efficient as the human brain, and I wouldn’t bet on that either (in real life, if AGI is developed in the coming decades, not based on what’s possible in principle).

Eliezer: The problem is that the resource gets consumed differently, so base-rate arguments from resource consumption end up utterly unhelpful in real life. The human brain consumes around 20 watts of power. Can we thereby conclude that an AGI should consume around 20 watts of power, and that, when technology advances to the point of being able to supply around 20 watts of power to computers, we’ll get AGI?

If the world were such that:

  • We had some reasonable framework for “power usage” that didn’t include gratuitously wasted power, and measured the “power used meaningfully to do computations” in some important sense;

  • AI performance seemed to systematically improve as this sort of power usage increased;

  • Power usage was just now coming within a few orders of magnitude of the human brain;

  • We were just now starting to see AIs have success with tasks like vision and speech recognition (tasks that seem likely to have been evolutionarily important, and that we haven’t found ways to precisely describe GOFAI-style);

  • It also looked like AI was starting to have insect-like capabilities somewhere around the time it was consuming insect-level amounts of power;

  • And we didn’t have some clear candidate for a better metric with similar properties (as I think we do in the case of computations, since the main thing I’d expect increased power usage to be useful for is increased computation);

...Then I would be interested in a Bio Anchors-style analysis of projected power usage. As noted above, I would be interested in this as a tool for analysis rather than as “the way to get my probability distribution.” That’s also how I’m interested in Bio Anchors (and how it presents itself).

I also think we have more a priori reason to believe that human scientists can “use computations” somewhere near as efficiently as the brain does (a software question) than to believe that human scientists can “use power” somewhere near as efficiently as the brain does (a hardware question).

(As a side note, there is some analysis of how nature vs. humans use power in this section of Bio Anchors.)

Somebody: All of that seems irrelevant to my novel and different argument. I am not foolishly estimating the resources consumed by a single brain; I’m estimating the resources consumed by evolutionary biology to invent brains!

Eliezer: And the humans wracking their own brains and inventing new AI program architectures and deploying those AI program architectures to themselves learn, will consume computations so utterly differently from evolution that there is no point comparing those consumptions of resources. That is the flaw that you share exactly with Moravec, and that is why I say the same of both of you, “This is a kind of thinking that fails to bind upon reality, it doesn’t work in real life.” I don’t care how much painstaking work you put into your estimate of 10^43 computations performed by biology. It’s just not a relevant fact.

It’s hard for me to understand how it is not a relevant fact: I think we have good reason to believe that humans can use computations at least as intelligently as evolution did.

I think it’s perfectly reasonable to push back on 10^43 as a median estimate, but not to deny that it is a relevant number at all.
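To sketch why the number can matter even if it makes a poor median (this is my own rough rendering of the argument, stated generically rather than with the specific figures from Eliezer’s dialogue or the report): estimates of this kind are built by multiplying a duration of evolutionary history by an aggregate rate of neural computation across the evolving population, and the claim above is that whatever that total comes out to, the compute human researchers need should not exceed it, since humans can direct a search over minds at least as intelligently as undirected natural selection did.

$$
C_{\text{evolution}} \;\approx\; (\text{seconds of evolutionary history}) \times (\text{aggregate neural FLOP/s of the evolving population})
$$
$$
C_{\text{needed by human researchers}} \;\lesssim\; C_{\text{evolution}}
$$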

OpenPhil: We have commissioned a Very Serious report on a biologically inspired estimate of how much computation will be required to achieve Artificial General Intelligence, for purposes of forecasting an AGI timeline. (Summary of report.) (Full draft of report.) Our leadership takes this report Very Seriously.

I thought this was a pretty misleading presentation of how Open Philanthropy has communicated about this work. It’s true that Open Philanthropy’s public communication tends toward a cautious, serious tone (and I think there are good reasons for this); but beyond that, I don’t think we do much to convey the sort of attitude implied above. The report was announced on LessWrong as a draft for comment, and it is still in the form of several Google docs. We never did any sort of push to have it treated as a fancy report.