In this comment, I’ll try to respond at the object level, arguing for why I expect slower takeoff than “brain in a box in a basement”. I’d also be down to try a dialogue/discussion at some point.
1.4.1 Possible counter: “If a different, much more powerful, AI paradigm existed, then someone would have already found it.”
In terms of reference class forecasting, I concede that it’s rather rare for technologies with extreme profit potential to have sudden breakthroughs unlocking massive new capabilities (see here), that “could have happened” many years earlier but didn’t. But there are at least a few examples, like the 2025 baseball “torpedo bat”, wheels on suitcases, the original Bitcoin, and (arguably) nuclear chain reactions.[7]
I think the way you describe this argument isn’t quite right. (More precisely, I think the argument as you give it may be a (weaker) counterargument that people sometimes make, but there is a nearby argument which is much stronger.)
Here’s how I would put this:
Prior to having a complete version of this much more powerful AI paradigm, you’ll first have a weaker version of it (e.g., you haven’t yet figured out the most efficient way to do the brain algorithm, etc.). Further, the weaker version of this paradigm might initially be used in combination with LLMs (or other techniques) such that it (somewhat continuously) integrates into the old trends. Of course, large paradigm shifts might cause things to proceed substantially faster or bend the trend, but not necessarily.
Further, we should still broadly expect that this new paradigm will itself take a reasonable amount of time to transition through the human range and through different levels of usefulness, even if it’s very different from LLM-like approaches (or other AI tech). And we should expect this to happen first at the massive computational scale where it initially becomes viable given some level of algorithmic progress (though this depends on the relative difficulty of scaling things up versus improving the algorithms). As in, more than a year prior to the point where you can train a superintelligence on a gaming GPU, I expect someone will train a system which can automate big chunks of AI R&D using a much bigger cluster.
On this prior point, it’s worth noting that many of Paul’s original points in Takeoff Speeds are totally applicable to non-LLM paradigms, as is much of the discussion in Yudkowsky and Christiano discuss “Takeoff Speeds”. (And I don’t think you compellingly respond to these arguments.)
I think your response is that you argue against these perspectives under ‘Very little R&D separating “seemingly irrelevant” from ASI’. But, I just don’t find these specific arguments very compelling. (Maybe also you’d say that you’re just trying to lay out your views rather than compellingly arguing for them. Or maybe you’d say that you can’t argue for your views due to infohazard/forkhazard concerns. In which case, fair enough.) Going through each of these:
I think that, once this next paradigm is doing anything at all that seems impressive and proto-AGI-ish,[12] there’s just very little extra work required to get to ASI (≈ figuring things out much better and faster than humans in essentially all domains). How much is “very little”? I dunno, maybe 0–30 person-years of R&D? Contrast that with AI-2027’s estimate that crossing that gap will take millions of person-years of R&D.
Why am I expecting this? I think the main reason is what I wrote about the “simple(ish) core of intelligence” in §1.3 above.
I don’t buy that having a “simple(ish) core of intelligence” means that it won’t take a long time to get the resulting algorithms. I’d say that much of modern LLMs does have a simple core (you could transmit it in a short ~30-page guide), but it nonetheless took many years of R&D to reach where we are now. Also, I’d note that the brain seems way more complex than LLMs to me!
For a non-imitation-learning paradigm, getting to “relevant at all” is only slightly easier than getting to superintelligence
My main response would be that basically all paradigms allow for mixing imitation with reinforcement learning. And, it might be possible to mix the new paradigm with LLMs, which would smooth out / slow down takeoff.
You note that imitation learning is possible for brains, but don’t explain why we won’t be able to mix the brain-like paradigm with more imitation than human brains use, which would smooth out takeoff. As in, yes, human brains don’t use as much imitation as LLMs, but they would probably perform better if you modified the algorithm somewhat and did 10^26 FLOP worth of imitation on the best data. This would smooth out the takeoff.
Why do I think getting to “relevant at all” takes most of the work? This comes down to a key disanalogy between LLMs and brain-like AGI, one which I’ll discuss much more in the next post.
I’ll consider responding to this in a comment responding to the next post.
Edit: it looks like this is just the argument that LLM capabilities come from imitation due to transforming observations into behavior in a way humans don’t. I basically just think that you could also leverage imitation more effectively to get performance earlier (and thus at a lower level) with an early version of a more brain-like architecture, and I expect people would do this in practice to see earlier returns (even if the brain doesn’t do this).
Instead of imitation learning, a better analogy is to AlphaZero, in that the model starts from scratch and has to laboriously work its way up to human-level understanding.
Notably, in the domains of chess and Go it actually took many years to make it through the human range. And it was possible to leverage imitation learning and human heuristics to perform quite well at Go (and chess) in practice, up to systems which weren’t that much worse than humans.
it takes a lot of work to get AlphaZero to the level of a skilled human, but then takes very little extra work to make it strongly superhuman.
AlphaZero exhibits returns which are maybe like 2-4 SD (within the human distribution of Go players supposing ~100k to 1 million Go players) per 10x-ing of compute.[1] So, I’d say it probably would take around 30x to 300x additional compute to go from skilled human (perhaps 2 SD above median) to strongly superhuman (perhaps 3 SD above the best human or 7.5 SD above median) if you properly adapted to each compute level. In some ways 30x to 300x is very small, but also 30x to 300x is not that small...
In practice, I expect returns more like 1.2 SD / 10x of compute at the point when AIs are matching top humans. (I explain this in a future post.)
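To make the arithmetic here explicit, here’s a minimal sketch of the implied compute multipliers, using the SD-above-median figures from the paragraph above (2 SD for “skilled human”, 7.5 SD for “strongly superhuman”); the exact numbers are of course rough.

```python
def compute_multiplier(gap_sd: float, sd_per_10x: float) -> float:
    """Compute factor needed to cross a capability gap of `gap_sd` standard
    deviations, given returns of `sd_per_10x` SDs per 10x of compute."""
    ooms = gap_sd / sd_per_10x        # orders of magnitude of compute
    return 10 ** ooms

# Gap from "skilled human" (~2 SD above median) to "strongly superhuman"
# (~7.5 SD above median), as above.
gap = 7.5 - 2.0

for rate in (4.0, 2.0, 1.2):          # SDs gained per 10x of compute
    print(f"{rate} SD per 10x -> ~{compute_multiplier(gap, rate):,.0f}x compute")

# Output: ~24x at 4 SD per 10x, ~562x at 2 SD per 10x, ~38,000x at 1.2 SD per 10x.
```

So the 2–4 SD band corresponds to roughly 20x–600x of compute, in the same ballpark as the 30x–300x quoted above, while 1.2 SD per 10x stretches the gap to tens of thousands of times more compute.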
1.7.2 “Plenty of room at the top”
I agree with this.
1.7.3 What’s the rate-limiter?
[...]
My rebuttal is: for a smooth-takeoff view, there has to be some correspondingly-slow-to-remove bottleneck that limits the rate of progress. In other words, you can say “If Ingredient X is an easy huge source of AGI competence, then it won’t be the rate-limiter, instead something else will be”. But you can’t say that about every ingredient! There has to be a “something else” which is an actual rate-limiter, that doesn’t prevent the paradigm from doing impressive things clearly on track towards AGI, but that does prevent it from being ASI, even after hundreds of person-years of experimentation.[13] And I’m just not seeing what that could be.
Another point is: once people basically understand how the human brain figures things out in broad outline, there will be a “neuroscience overhang” of 100,000 papers about how the brain works in excruciating detail, and (I claim) it will rapidly become straightforward to understand and integrate all the little tricks that the brain uses into AI, if people get stuck on anything.
I’d say that the rate limiter is that it will take a while to transition from something like “1000x less compute efficient than the human brain (as in, it takes 1000x more compute than a human lifetime to match top human experts, though the AIs will simultaneously be better at a bunch of specific tasks)” to “as compute efficient as the human brain”. Like, the actual algorithmic progress for this will take a while, and I don’t buy your claim that the way this will work is that you’ll go from nothing to having an outline of how the brain works, at which point everything will immediately come together due to the neuroscience literature. Like, I think something like this is possible, but unlikely (especially prior to having AIs that can automate AI R&D).
And, while you have much less efficient algorithms, you’re reasonably likely to get bottlenecked on either how fast you can scale up compute (though this is still pretty fast, especially if all those big datacenters for training LLMs are still just lying around!) or how fast humanity can produce more compute (which can be much slower).
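For a sense of scale, here’s a purely illustrative back-of-the-envelope; the lifetime-compute and cluster-throughput constants are my own assumed ballparks (flagged in the comments), not numbers from the post or this comment.

```python
# Illustrative only: the constants below are assumed ballparks.
HUMAN_LIFETIME_FLOP = 1e24        # assumed: rough compute of a human lifetime of learning
INEFFICIENCY = 1_000              # "1000x less compute efficient than the human brain"
CLUSTER_FLOP_PER_SEC = 1e20       # assumed: sustained throughput of a large training cluster
SECONDS_PER_YEAR = 3.15e7

training_flop = HUMAN_LIFETIME_FLOP * INEFFICIENCY                      # ~1e27 FLOP
cluster_years = training_flop / (CLUSTER_FLOP_PER_SEC * SECONDS_PER_YEAR)

print(f"~{training_flop:.0e} FLOP needed, ~{cluster_years:.2f} years on such a cluster")
# Under these assumptions: ~1e+27 FLOP, ~0.32 cluster-years -- feasible on a
# frontier-scale datacenter, wildly infeasible on a gaming GPU, which is the
# point about the paradigm first becoming viable at massive computational scale.
```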
Part of my disagreement is that I don’t put the majority of the probability on “brain-like AGI” (even if we condition on something very different from LLMs) but this doesn’t explain all of the disagreement.
It looks like a version of AlphaGo Zero goes from 2400 Elo (around the 1000th best human) to 4000 Elo (somewhat better than the best human) between hours 15 and 40 of the training run (see Figure 3 in this PDF). So, naively this is a bit less than 3x compute for maybe 1.9 SDs (supposing that the “field” of Go players has around 100k to 1 million players), implying that 10x compute would get you closer to 4 SDs. However, in practice, progress around the human range was slower than 4 SDs/OOM would predict. Also, comparing times to reach particular performance levels within a training run can sometimes make progress look misleadingly fast due to LR decay and suboptimal model size. The final version of AlphaGo Zero used a bigger model and ran RL for much longer, and it seemingly took more compute to reach ~2400 Elo and ~4000 Elo, which is some evidence for optimal model size making a substantial difference (see Figure 6 in the PDF). Also, my guess based on circumstantial evidence is that the original version of AlphaGo (which was initialized with imitation) moved through the human range substantially slower than 4 SDs/OOM. Perhaps someone can confirm this. (This footnote is copied from a forthcoming post of mine.)
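For concreteness, here’s the arithmetic behind the “a bit less than 3x compute for maybe 1.9 SDs” estimate; the normal-quantile conversion, the 100k field size, and treating 4000 Elo as roughly the best human (rather than “somewhat better”) are simplifying assumptions on top of the footnote’s numbers.

```python
import math
from statistics import NormalDist

# Assumed: a field of 100k Go players (the footnote's range is 100k to 1 million),
# with skill normally distributed, so ranks map to SDs above the median.
FIELD_SIZE = 100_000

def sds_above_median(rank: int) -> float:
    """SDs above the median implied by being the `rank`-th best player."""
    return NormalDist().inv_cdf(1 - rank / FIELD_SIZE)

gap_sd = sds_above_median(1) - sds_above_median(1_000)   # ~4000 Elo vs ~2400 Elo
ooms = math.log10(40 / 15)                               # hours 15 -> 40 of training

print(f"gap ≈ {gap_sd:.1f} SD over {ooms:.2f} OOM -> ≈ {gap_sd / ooms:.2f} SD per 10x")
# ≈ 1.9 SD over 0.43 OOM, i.e. roughly 4.5 SD per 10x of compute; a 1-million-player
# field gives a somewhat smaller gap, hence "closer to 4 SDs" per 10x above.
```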
Prior to having a complete version of this much more powerful AI paradigm, you’ll first have a weaker version of this paradigm (e.g. you haven’t figured out the most efficient way to do the brain algorithmic etc).
A supporting argument: Since evolution found the human brain algorithm, and evolution only does local search, the human brain algorithm must be built out of many innovations that are individually useful. So we shouldn’t expect the human brain algorithm to be an all-or-nothing affair. (Unless it’s so simple that evolution could find it in ~one step, but that seems implausible.)
Edit: Though in principle, there could still be a heavy-tailed distribution of how useful each innovation is, with one innovation producing most of the total value. (Even though the steps leading up to that were individually slightly useful.) So this is not a knock-down argument.
Since evolution found the human brain algorithm, and evolution only does local search, the human brain algorithm must be built out of many innovations that are individually useful. So we shouldn’t expect the human brain algorithm to be an all-or-nothing affair.
If humans are looking at parts of the human brain, and copying it, then it’s quite possible that the last component we look at is the critical piece that nothing else works without. A modern steam engine was developed step by step from simpler and cruder machines. But if you take apart a modern steam engine, and copy each piece, it’s likely that it won’t work at all until you add the final piece, depending on the order you recreate pieces in.
It’s also possible that rat brains have all the fundamental insights. To get from rats to humans, evolution needed to produce lots of genetic code that grew extra blood vessels to supply the oxygen and that prevented brain cancer. (Also, evolution needed to spend time on alignment.) A human researcher can just change one number, and maybe buy some more GPUs.
My claim was “I think that, once this next paradigm is doing anything at all that seems impressive and proto-AGI-ish,[12] there’s just very little extra work required to get to ASI (≈ figuring things out much better and faster than humans in essentially all domains).”
I don’t think anything about human brains and their evolution cuts against this claim.
If your argument is “brain-like AGI will work worse before it works better”, then sure, but my claim is that you only get “impressive and proto-AGI-ish” when you’re almost done, and “before” can be “before by 0–30 person-years of R&D” like I said. There are lots of parts of the human brain that are doing essential-for-AGI stuff, but if they’re not in place, then you also fail to pass the earlier threshold of “impressive and proto-AGI-ish”, e.g. by doing things that LLMs (and other existing techniques) cannot already do.
Or maybe your argument is “brain-like AGI will involve lots of useful components, and we can graft those components onto LLMs”? If so, I’m skeptical. I think the cortex is the secret sauce, and the other components are either irrelevant for LLMs, or things that LLM capabilities researchers already know about. For example, the brain has negative feedback loops, and the brain has TD learning, and the brain has supervised learning and self-supervised learning, etc., but LLM capabilities researchers already know about all those things, and are already using them to the extent that they are useful.
To be clear: I’m not sure that my “supporting argument” above addressed an objection to Ryan that you had. It’s plausible that your objections were elsewhere.
But I’ll respond with my view.
If your argument is “brain-like AGI will work worse before it works better”, then sure, but my claim is that you only get “impressive and proto-AGI-ish” when you’re almost done, and “before” can be “before by 0–30 person-years of R&D” like I said.
Ok, so this describes a story where there’s a lot of work to get proto-AGI and then not very much work to get superintelligence from there. But I don’t understand what the argument is for thinking this is the case vs. thinking that there’s a lot of work to get proto-AGI and then also a lot of work to get superintelligence from there.
Going through your arguments in section 1.7:
“I think the main reason is what I wrote about the “simple(ish) core of intelligence” in §1.3 above.”
But I think what you wrote about the simple(ish) core of intelligence in 1.3 is compatible with there being like (making up a number) 20 different innovations involved in how the brain operates, each of which gets you a somewhat smarter AI, each of which could be individually difficult to figure out. So maybe you get a few, you have proto-AGI, and then it takes a lot of work to get the rest.
Certainly the genome is large enough to fit 20 things.
I’m not sure if the “6-ish characteristic layers with correspondingly different neuron types and connection patterns, and so on” is complex enough to encompass 20 different innovations. Certainly seems like it should be complex enough to encompass 6.
(My argument above was that we shouldn’t expect the brain to run an algorithm that only is useful once you have 20 hypothetical components in place, and does nothing beforehand. Because it was found via local search, so each of the 20 things should be useful on their own.)
“Plenty of room at the top” — I agree.
“What’s the rate limiter?” — The rate limiter would be coming up with the thinking and experimenting needed to find the hypothesized 20 different innovations mentioned above. (What would you get if you only had some of the innovations? Maybe AGI that’s incredibly expensive. Or AGIs about as capable as unskilled humans.)
“For a non-imitation-learning paradigm, getting to “relevant at all” is only slightly easier than getting to superintelligence”
I agree that there are reasons to expect imitation learning to plateau around human-level that don’t apply to fully non-imitation learning.
That said...
For some of the same reasons that “imitation learning” plateaus around human level, you might also expect “the thing that humans do when they learn from other humans” (whether you want to call that “imitation learning” or “predictive learning” or something else) to slow down skill-acquisition around human level.
There could also be another reason why non-imitation-learning approaches could spend a long while in the human range. Namely: perhaps the human range is just pretty large, and so it takes a lot of gas to traverse. I think this is somewhat supported by the empirical evidence; see this AI Impacts page (discussed in this SSC post).
Thanks! Here’s a partial response, as I mull it over.
Also, I’d note that the brain seems way more complex than LLMs to me!
See “Brain complexity is easy to overstate” section here.
basically all paradigms allow for mixing imitation with reinforcement learning
As in §2.3.2, if an LLM sees output X in context Y during pretraining, it will automatically start outputting X in context Y. Whereas if smart human Alice hears Bob say X in context Y, Alice will not necessarily start saying X in context Y. Instead she might say “Huh? Wtf are you talking about, Bob?”
Let’s imagine installing an imitation learning module in Alice’s brain that makes her reflexively say X in context Y upon hearing Bob say it. I think I’d expect that module to hinder her learning and understanding, not accelerate it, right?
(If Alice is able to say to herself “in this situation, Bob would say X”, then she has a shoulder-Bob, and that’s definitely a benefit, not a cost. But that’s predictive learning, not imitative learning. No question that predictive learning is helpful. That’s not what I’m talking about.)
…So there’s my intuitive argument that the next paradigm would be hindered rather than helped by mixing in some imitative learning. (Or I guess more precisely, as long as imitative learning is part of the mix, I expect the result to be no better than LLMs, and probably worse. And as long as we’re in “no better than LLM” territory, I’m off the hook, because I’m only making a claim that there will be little R&D between “doing impressive things that LLMs can’t do” and ASI, not between zero and ASI.)
Noteably, in the domains of chess and go it actually took many years to make it through the human range. And, it was possible to leverage imitation learning and human heuristics to perform quite well at Go (and chess) in practice, up to systems which weren’t that much worse than humans.
In my mind, the (imperfect!) analogy here would be (LLMs, new paradigm) ↔ (previous Go engines, AlphaGo and successors).
In particular, LLMs today are in many (not all!) respects “in the human range” and “perform quite well” and “aren’t that much worse than humans”.
algorithmic progress
I started writing a reply to this part … but first I’m actually kinda curious what “algorithmic progress” has looked like for LLMs, concretely—I mean, the part where people can now get the same results from less compute. Like what are the specific things that people are doing differently today than in 2019? Is there a list somewhere? A paper I could read? (Or is it all proprietary?) (Epoch talks about how much improvement has happened, but not what the improvement consists of.) Thanks in advance.
See “Brain complexity is easy to overstate” section here.
Sure, but I still think it’s probably way more complex than LLMs even if we’re just looking at the parts key for AGI performance (in particular, the parts which learn from scratch). And my guess would be that performance would be greatly degraded if you only took on as much complexity as the core LLM learning algorithm has.
Let’s imagine installing an imitation learning module in Alice’s brain that makes her reflexively say X in context Y upon hearing Bob say it. I think I’d expect that module to hinder her learning and understanding, not accelerate it, right?
This isn’t really what I’m imagining, nor do I think this is how LLMs work in many cases. In particular, LLMs can transfer from training on random GitHub repos to being better in all kinds of different contexts. I think humans can do something similar, but have much worse memory.
I think in the case of humans and LLMs, this is substantially subconscious/non-explicit, so I don’t think this is well described as having a shoulder-Bob.
Also, I would say that humans do learn from imitation! (You can call it prediction, but it doesn’t matter what you call it as long as it implies that data from humans makes things scale more continuously through the human range.) I just think that you can do better at this than humans do, based on the LLM case, mostly because humans aren’t exposed to as much data.
Also, I think the question is “can you somehow make use of imitation data”, not “can the brain’s learning algorithm immediately make use of imitation”?
In my mind, the (imperfect!) analogy here would be (LLMs, new paradigm) ↔ (previous Go engines, AlphaGo and successors).
Notably this analogy implies LLMs will be able to automate substantial fractions of human work prior to a new paradigm which (over the course of a year or two and using vast computational resources) beats the best humans. This is very different from the “brain in a basement” model IMO. I get that you think the analogy is imperfect (and I agree), but it seems worth noting that the analogy you’re drawing suggests something very different from what you expect to happen.
Is there a list somewhere? A paper I could read? (Or is it all proprietary?)
It’s substantially proprietary, but you could consider looking at the DeepSeek-V3 paper. We don’t actually have a great understanding of the quantity and nature of algorithmic improvement after GPT-3. It would be useful for someone to do a more up-to-date review based on the best available evidence.
I’m not sure that complexity is protecting us. On the one hand, there is only about 1 MB of bases coding for the brain (and less for the connectome; source: https://xkcd.com/1605/), but that doesn’t mean we can read it, and it may take a long time to reverse engineer.
On the other hand, our existing systems of LLMs are already much more complex than that. Likely more than a GB of source code for modern LLM-running compute center servers. And here the relationship between the code and the result is better known and can be iterated on much faster. We may not need to reverse engineer the brain. Experimentation may be sufficient.