Steven Byrnes comments on Foom & Doom 1: “Brain in a box in a basement”

Steven Byrnes 7 Jul 2025 15:19 UTC
4 points
2
Right, there’s a possible position which is: “I’ll accept for the sake of argument your claim there will be an egregiously misaligned ASI requiring very little compute (maybe ≲1 chip per human equivalent including continuous online learning), emerging into a world not terribly different from today’s. But even if so, that’s OK! While the ASI will be a much faster learner than humans, it will not magically know things that it has no way to have figured out (§1.8.1), and that includes developing nanotechnology. So it will be reliant on humans and human infrastructure during a gradual process.”
Or something like that?
Anyway, if so, yeah I disagree, even if I grant (for the sake of argument) that exotic nanotech does not exist.
I’m not an ASI and haven’t thought very hard about it, so my strategies might be suboptimal, but for example it seems to me that an ASI could quite rapidly (days or weeks not months) earn or steal tons of money, and hack into basically every computer system in the world (even APT groups are generally unable to avoid getting hacked by other APT groups!), and then the AI (which now exists in a zillion copies around the world) can get people around the world to do whatever it wants via hiring them, bribing them, persuading them, threatening them, tricking them, etc.
And what does it get the people to do? Mainly “don’t allow other ASIs to be built” and “do build and release novel pandemics”. The latter should be pretty quick—making pandemics is worryingly easy IIUC (see Kevin Esvelt). If infrastructure and the electric grid starts going down, fine, the AI can rebuild, as long as it has at least one solar-cell-connected chip and a teleoperated robot that can build more robots and scavenge more chips and solar panels (see here), and realistically it will have many of those spread all around.
(See also Carl Shulman on AI takeover.)
There are other possibilities too, but hopefully that’s suggestive of “AI doom doesn’t require zero-shot designs of nanotech” (except insofar as viruses are arguably nanotech).
Oh, I guess we also disagree RE “currently we don’t have the resources outside of AI companies to actually support a superintelligent AI outside the lab, due to interconnect issues”. I expect future ASI to be much more compute-efficient. Actually, even frontier LLMs are extraordinarily expensive to train, but if we’re talking about inference rather than training, the requirements are not so stringent I think, and people keep working on it.
- Noosphere89 7 Jul 2025 19:30 UTC
  3 points
  0
  Parent
  Right, there’s a possible position which is: “I’ll accept for the sake of argument your claim there will be an egregiously misaligned ASI requiring very little compute (maybe ≲1 chip per human equivalent including continuous online learning), emerging into a world not terribly different from today’s. But even if so, that’s OK! While the ASI will be a much faster learner than humans, it will not magically know things that it has no way to have figured out (§1.8.1), and that includes developing nanotechnology. So it will be reliant on humans and human infrastructure during a gradual process.”
  Basically this, and in particular I’m willing to grant the premise that for the sake of argument there is technology that eliminates the need for most logistics, but that all such technology will take at least a year or more of real-world experimentation that means that the AI can’t immediately take over.
  On this:
  I’m not an ASI and haven’t thought very hard about it, so my strategies might be suboptimal, but for example it seems to me that an ASI could quite rapidly (days or weeks not months) earn or steal tons of money, and hack into basically every computer system in the world (even APT groups are generally unable to avoid getting hacked by other APT groups!), and then the AI (which now exists in a zillion copies around the world) can get people around the world to do whatever it wants via hiring them, bribing them, persuading them, threatening them, tricking them, etc.
  And what does it get the people to do? Mainly “don’t allow other ASIs to be built” and “do build and release novel pandemics”. The latter should be pretty quick—making pandemics is worryingly easy IIUC (see Kevin Esvelt). If infrastructure and the electric grid starts going down, fine, the AI can rebuild, as long as it has at least one solar-cell-connected chip and a teleoperated robot that can build more robots and scavenge more chips and solar panels (see here), and realistically it will have many of those spread all around.
  I think the entire crux is that all of those robots/solar cell chips you referenced currently depend on human industry/modern civilization to actually work, and they’d quickly degrade and become non-functional on the order of weeks or months if modern civilization didn’t exist, and this is arguably somewhat inevitable due to economics (until you can have tech that obviates the need for long supply chains).
  And in particular, in most takeover scenarios where AIs don’t automate the economy first, I don’t expect AIs to be able to keep producing robots for a very long time, and I’d bump it up to 300-3,000 years at minimum because there is less easily accessible resources combined with AIs being much less capable due to having very little compute relative to modern civilization.
  In particular, I think that disrupting modern civilization to a degree such that humans are disempowered (assuming no tech that obviates the need for logistics) pretty much as a consequence breaks the industries/logistics needed to fuel further AI growth, because there’s no more trade, which utterly fucks up modern economies.
  And your references argue that human civilization wouldn’t go extinct very soon because of civilizational collapse, and that AIs can hack existing human industry to help them, and I do think this is correct (modulo the issue that defense is easier than offense for the cybersecurity realm specifically, and importantly, a key reason for this is that once you catch the AI doing it, there are major consequences for AIs and humans, which actually matter for AI safety):
  https://x.com/MaxNadeau_/status/1912568930079781015
  The important thing is that this paper rocks and I’d love to see bigger/better versions of it, but I’ll use it as a jumping-off point for a take: The headline result is that their best defenses work 93% of the time. Is that anything? If you’ve been reading Simon Willison (whose blog I think is great!), you might say “that’s useless, because in application security 99% is a failing grade” (https://simonwillison.net/2025/Apr/11/camel/#the-best-part-is-it-doesn-t-use-more-ai). And no wonder, because the techniques in this paper are just variations of “solve AI security problems with more AI”, which Simon has emphasized are a dead end (https://simonwillison.net/2022/Sep/17/prompt-injection-more-ai/). So why are the Redwood folks rosier about 93%? The crux here is whether your threat model assumes attackers can try over and over again. Simon is assuming they can, in which case 93% would be useless, but Redwood folks are assuming that there are major costs to the attacker trying and failing (https://lesswrong.com/posts/i2nmBfCXnadeGmhzW/catching-ais-red-handed) This assumption is not at all unique to misalignment threats. It’s the same assumption that Debenedetti, Carlini, and Tramer make here (https://arxiv.org/abs/2306.02895), which they call evasion “without breaking eggs”. I think the right way to model a variety of security problems, e.g. insider threats from employees. One’s assumptions about the costs of attacker failure have huge implications for which security measures look helpful, and I think this is an important factor to mentally track in conversations about these topics.
  (It talks about a control technique but the discussion easily transfers outside of the example).
  I actually agree cyber-attacks to subvert human industry are a threat and are worth keeping in mind, but none of your references support the idea that AIs can keep going without modern civilization’s logistics, and I think people vastly underestimate how necessary modern civilization logistics are to support industry, and how fragile they are to even somewhat minor disruptions, let alone the disruptions that would follow after takeover (assuming it doesn’t already have sufficient resources to be self-sustaining).
  There are other possibilities too, but hopefully that’s suggestive of “AI doom doesn’t require zero-shot designs of nanotech” (except insofar as viruses are arguably nanotech).
  I agree with this, but fairly critically I do think it actually matters quite a lot for AI strategy purposes if we don’t assume AIs can quickly rebuild stuff/can obviate logistics through future tech quickly, and it matters pretty greatly to a lot of people’s stories of doom, even if AIs can doom us through just trying to hijack modern civilization and then wait for humans to automate themselves away and then once humans have been fully cut out of the loop and AIs can self-sustain an economy without us, bioweapons are used to attack humans, it matters that we have time.
  This makes AI control protocols for example a lot more effective, because we can assume that independent AIs outside of the central servers of stuff like Deepmind won’t be able to affect things much.
  Oh, I guess we also disagree RE “currently we don’t have the resources outside of AI companies to actually support a superintelligent AI outside the lab, due to interconnect issues”. I expect future ASI to be much more compute-efficient. Actually, even frontier LLMs are extraordinarily expensive to train, but if we’re talking about inference rather than training, the requirements are not so stringent I think, and people keep working on it.
  I actually do expect future AIs to be more compute efficient, but I think that at the point where superintelligent AIs can support themselves purely based off of stuff like personal computers, all control of the situation is lost and either the AIs are aligned and grant us a benevolent personal utopia, or they’re misaligned and we are extinct/mostly dead.
  So the limits of computational/data efficiency being very large don’t matter much for the immediate situation on AI risk.
  The point of no return happens earlier than this, and the reason is that even in a future where imitation learning/LLMs do not go all the way to AGI in practice and must have something more brain-like like continuous learning and long-term memories, that imitation learning continues to be useful and will be used by AIs, and there’s a very important difference between imitation learning alone not scaling all the way to AGI and imitation learning not being useful at all, and I think LLMs provide good evidence that imitation is surprisingly useful even if it doesn’t scale to AGI.
  I think a general worldview clash is that I tend to think technological change is mostly driven by early prototypes that at first are pretty inefficient, and there require many changes to get the system to become more efficient, and while there are thresholds of usefulness for the AI case, change operates more continuously than people think.
  Finally, we have good reason to believe that the human range is actually pretty large, such that AIs do take a noticeable amount of time from being human level to being outright superintelligent:
  There could also be another reason for why non-imitation-learning approaches could spend a long while in the human range. Namely: Perhaps the human range is just pretty large, and so it takes a lot of gas to traverse. I think this is somewhat supported by the empirical evidence, see this AI impacts page (discussed in this SSC).
  - Steven Byrnes 7 Jul 2025 20:43 UTC
    3 points
    0
    Parent
    I think the entire crux is that all of those robots/solar cell chips you referenced currently depend on human industry/modern civilization to actually work, and they’d quickly degrade and become non-functional on the order of weeks or months if modern civilization didn’t exist, and this is arguably somewhat inevitable due to economics (until you can have tech that obviates the need for long supply chains).
    OK, imagine (for simplicity) that all humans on Earth drop dead simultaneously, but there’s a John-von-Neumann-level AI on a chip connected to a solar panel with two teleoperated robots. Every time they scavenge another chip and solar cell, there becomes another human-level AI copy. Every time a robot builds another teleoperated robot from scavenged parts, there’s that too. What exactly is going to break in “weeks or months”? Solar cells can work for 30 years, no problem. GPUs are also reported to last for decades. (Note that, as long as GPUs are a non-renewable resource, the AI would presumably take extremely good care of them, keeping them dust-free, cooling them well below the nominal temperature spec, etc.) The AI can find decent GPUs in every house on the street, and I think hundreds of millions more by breaking into big data centers. Similar for solar panels. If one robot breaks, another robot can repair it. Janky teleoperated robots without fingers made by students for $20K can vacuum, make coffee, cook a meal, etc. Competent human engineers can make pretty impressive mechanical hands using widely-available parts. I grant that it would take a long while before the growing AI clone army could run a semiconductor supply chain by itself, but it has all the time in the world. I expect it to succeed, and thus to sustain itself into the indefinite future, and I’m confused why you don’t. (Or maybe you do and I’m misunderstanding.)
    BTW I also think that a minimal semiconductor supply chain would be very very much simpler than the actual semiconductor supply chain that exists in our human world, which has been relentlessly optimized for cost, not simplicity. For example, EBL (e-beam lithography) has better resolution than EUV and is a zillion times easier to build, but the human economy would never support building out km²-scale warehouses full of millions of EBL machines to compensate for their crappy throughput. But for an AI bootstrapping its way back up, why not?
    (I’m continuing to assume no weird nanotech for the sake of argument, but I will point out that, since brains exist, it follows that it is possible to grow self-assembling brain-like computing devices (in vats, tended by robots), using only widely-available raw materials like plants and oxygen.)
    I’m confused about other parts of your comment as well. Joseph Stalin was able to use his (non-superhuman) intelligence and charisma to wind up in dictatorial control of Russia. What’s your argument that an AI could not similarly wind up with dictatorial control over humans? Don’t the same arguments apply? “If we catch the AI trying to gain power in bad ways, we’ll shut it down.” “If we catch Stalin trying to gain power in bad ways, we’ll throw him in jail.” But the latter didn’t happen. What’s the disanalogy, from your perspective?
    - Noosphere89 10 Jul 2025 17:43 UTC
      4 points
      0
      Parent
      OK, imagine (for simplicity) that all humans on Earth drop dead simultaneously, but there’s a John-von-Neumann-level AI on a chip connected to a solar panel with two teleoperated robots. Every time they scavenge another chip and solar cell, there becomes another human-level AI copy. Every time a robot builds another teleoperated robot from scavenged parts, there’s that too. What exactly is going to break in “weeks or months”? Solar cells can work for 30 years, no problem. GPUs are also reported to last for decades. (Note that, as long as GPUs are a non-renewable resource, the AI would presumably take extremely good care of them, keeping them dust-free, cooling them well below the nominal temperature spec, etc.) The AI can find decent GPUs in every house on the street, and I think hundreds of millions more by breaking into big data centers. Similar for solar panels. If one robot breaks, another robot can repair it. Janky teleoperated robots without fingers made by students for $20K can vacuum, make coffee, cook a meal, etc. Competent human engineers can make pretty impressive mechanical hands using widely-available parts. I grant that it would take a long while before the growing AI clone army could run a semiconductor supply chain by itself, but it has all the time in the world. I expect it to succeed, and thus to sustain itself into the indefinite future, and I’m confused why you don’t. (Or maybe you do and I’m misunderstanding.)
      BTW I also think that a minimal semiconductor supply chain would be very very much simpler than the actual semiconductor supply chain that exists in our human world, which has been relentlessly optimized for cost, not simplicity. For example, EBL (e-beam lithography) has better resolution than EUV and is a zillion times easier to build, but the human economy would never support building out km²-scale warehouses full of millions of EBL machines to compensate for their crappy throughput. But for an AI bootstrapping its way back up, why not?
      The key trouble is all the power generators that sustain the AI would break within weeks or months, and the issue is even if they could build GPUs, they’d have no power to run them within at most 2 weeks:
      https://www.reddit.com/r/ZombieSurvivalTactics/comments/s6augo/comment/ht4iqej/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button
      https://www.reddit.com/r/explainlikeimfive/comments/klupbw/comment/ghb0fer/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button
      Realistically, we are looking at power grid collapses within days.
      And without power, none of the other building projects could work, because they’d stop receiving energy, and importantly this means the AI is on a tight timer, and some of this is partially due to expectations that the first transformative useful AI will use more compute than you project, even conditional on a different paradigm being introduced like brain-like AGIs, but another part of my view is that this is just one of many examples where humans need to constantly maintain stuff in order for the stuff to work, and if we don’t assume tech that can just solve logistics is available within say 1 year, it will take time for AIs to actually survive without humans, and this time is almost certainly closer to months or years than weeks or days.
      The hard part of AI takeover isn’t killing all humans, it’s in automating enough of the economy (including developing tech like nanotech) such that the humans stop mattering, and while AIs can do this, it takes actual time, and that time is really valuable in fast moving scenarios.
      I’m confused about other parts of your comment as well. Joseph Stalin was able to use his (non-superhuman) intelligence and charisma to wind up in dictatorial control of Russia. What’s your argument that an AI could not similarly wind up with dictatorial control over humans? Don’t the same arguments apply? “If we catch the AI trying to gain power in bad ways, we’ll shut it down.” “If we catch Stalin trying to gain power in bad ways, we’ll throw him in jail.” But the latter didn’t happen. What’s the disanalogy, from your perspective?
      I didn’t say AIs can’t take over, and I very critically did not say that AI takeover can’t happen in the long run.
      I only said AI takeover isn’t trivial if we don’t assume logistics are solvable.
      But to deal with the Stalin example, the answer for how he took over was basically that he was willing to wait a long time, and in particular he used both persuasion and the fact that he already had a significant amount of power by having the General Secretary, and his takeover was basically by allying with loyalists and in particular strategically breaking alliances that he had made, and violence was used later on to show that no one was safe from him.
      Which is actually how I expect successful AI takeover to happen in practice, if it does happen.
      Very importantly, Stalin didn’t need to create an entire civilization out of nothing, or nearly nothing, and other people like Trotsky handled the logistics, though the takeover situation was far more preferable to the communist party as they both had popular support and didn’t have as long supply lines as the opposition forces like the Whites did, and they had a preexisting base of industry that was much easier to seize than modern industries.
      This applies to most coups/transitions of power in that most of the successful coups aren’t battles between factions, but rather one group managing to make itself the new Schelling point over other groups.
      @Richard_Ngo explains more below:
      https://www.lesswrong.com/posts/d4armqGcbPywR3Ptc/power-lies-trembling-a-three-book-review#The_revolutionary_s_handbook
      Most of my commentary in the last comment is either arguing that things can be made more continuous and slow than your story depicts, or arguing that your references don’t support what you claimed, and I did say that the cyberattack story is plausible, just that it didn’t support the idea that AIs could entirely replace civilization without automating away us first, which takes time.
      This doesn’t show AI doom can’t happen, but it does matter for the probability estimates of many LWers on here, because it’s a hidden background assumption disagreement that underlies a lot of other disagreements.
      - Steven Byrnes 18 Jul 2025 2:09 UTC
        4 points
        0
        Parent
        I wrote:
        OK, imagine (for simplicity) that all humans on Earth drop dead simultaneously, but there’s a John-von-Neumann-level AI on a chip connected to a solar panel with two teleoperated robots. Every time they scavenge another chip and solar cell, there becomes another human-level AI copy. Every time a robot builds another teleoperated robot from scavenged parts, there’s that too. What exactly is going to break in “weeks or months”?
        Then your response included:
        The key trouble is all the power generators that sustain the AI would break within weeks or months, and the issue is even if they could build GPUs, they’d have no power to run them within at most 2 weeks…
        I included solar panels in my story precisely so that there would be no need for an electric grid. Right?
        I grant that powering a chip off a solar panel is not completely trivial. For example, where I live, residential solar cells are wired in such a way that they shut down when the grid goes down (ironically). But, while it’s not completely trivial to power a chip off a solar cell, it’s also not that hard. I believe that a skilled and resourceful human electrical engineer would be able to jury-rig a solution to that problem without much difficulty, using widely-available parts, like the electronics already attached to the solar panel, plus car batteries, wires, etc. Therefore our hypothetical “John-von-Neumann-level AI with a teleoperated robot” should be able to solve that problem too. Right?
        (Or were you responding to something else? I’m not saying “all humans on Earth drop dead simultaneously” is necessarily realistic, I’m just trying to narrow down where we disagree.)
        Noosphere89 20 Jul 2025 15:59 UTC
        10 points
        0
        Parent
        I did not realize you were assuming that the AI was powered solely by solar power that isn’t connected to the grid.
        
        Given your assumption, I agree that AGI can rebuild supply chains from scratch, albeit paiinfully and slowly, so I agree that AGI is an existential threat assuming it isn’t aligned.
        
        I was addressing a different scenario because I didn’t read the part of your comment where you said the AI is independent of the grid.