On balance, I support banning/obstructing datacenter buildout. That said, I’m not actually sure whether that impacts the omnicide risk positively or negatively.
I don’t think the LLM paradigm is AGI-complete. I don’t have utter confidence in that, but I think it’s more likely than not. And in worlds in which it is non-AGI-complete, the labs’ obsession with LLMs is very helpful. They’re wasting macroeconomic amounts of money on them, pouring most of our generation’s AI-researcher talent into them, distracting funding from other AGI approaches. If LLMs then fail to usher in the Singularity, if this does turn out to be a bubble that pops (e.g., in 2030-2032, when the ability to aggressively scale compute is supposed to run out), this should cause another AI winter. AGI would become a decidedly unsexy thing to work on once more, in industry and probably in academia both.
What would obstructing the LLM paradigm (via various compute limits) do? Well, it may cause the AGI megacorps to start looking into other directions now, while they still have massive amounts of unspent manpower and capital. Which may lead to someone “succeeding” sooner.[1]
Perhaps one shouldn’t interrupt their enemy while they’re making a mistake.
Or perhaps that’s not how the story goes. Perhaps: LLMs are a strong signal that AI is a massively powerful technology, a signal legible to many more people than theoretical arguments. They’re attracting macroeconomic amounts of funding to AI, funneling a large fraction of our generation’s talent towards working on AI, fueling inter-company and geopolitical AI-race dynamics. And while they cause other AGI approaches to be relatively neglected, they cause so much more absolute attention to be pointed at the AI industry that the amounts of funding/talent going into non-LLM AGI routes are still much greater than in the no-LLMs counterfactual. On top of that, even if LLMs aren’t AGI-complete in themselves, they may still be useful enough to speed up SWE/research along other AGI paradigms as well.
And perhaps effectively banning LLMs would cause a chilling effect much greater than if they were merely a research bet that didn’t work out. It would be a signal that AGI research is something the world does not want to see in any form, turning funding and talent away from the whole endeavor.
I’m not really sure which model is correct. Which AI Winter would be colder and broader: one caused by a proactive LLM ban, or one caused by LLMs’ dramatic betrayal of investor hopes? Do LLMs currently cause a net increase or decrease in attention to non-LLM AGI research? Can non-AGI LLMs significantly speed up non-LLM AGI research if allowed to continue scaling? And, of course: is the premise that LLMs are non-AGI-complete even correct?
I’m on balance in favor of obstructing LLMs and upper-bounding the available compute. “We must not get in the way of the current paradigm because they are totally making a mistake so all this investment is actually delaying AGI!” has the flavor of some galaxy-brained 4D chess nonsense. But I’m not fully sure.
(Also, worlds in which politicians/the public are concerned about AI enough to ban datacenter buildout are of course more promising worlds than the ones in which they’re not. This is true regardless of whether the direct effects of banning datacenter construction are positive or negative. In such worlds, they’d probably be willing to engage in additional anti-AGI interventions. So perhaps datacenter-impeding is worth supporting just to put us in those worlds, shift the Overton window that way? Not sure whether it’s causal or evidential, though.)
There are also arguments that this would be bad even in the LLMs-are-AGI-complete worlds, because – the argument would go – while LLMs may be nontrivial to align, they’re easier to align than some paradigms that may replace them. I don’t put much stock into this argument: it usually relies on assuming the LLM masks/personas are the entities of interest that we need to align. I don’t put zero stock in it, though, I guess.
You seem to be assuming that the LLM paradigm either is or isn’t AGI-complete. I think add-ons will bring it to AGI-complete even if it’s not by itself.
My position has always been that it’s missing specific cognitive systems the brain has, and when those are added (which may be arbitrarily easy since some analogues for the missing systems already exist at low quality), it will be AGI-complete.
I don’t think there’s strong evidence or arguments against LLMs getting there. I’ve read yours and all the others I’ve found, carefully. I wish I could believe them.
I feel like you’re assuming human cognition has some magic it just doesn’t have. We make tons of mistakes and are bad at generalizing, just like LLMs. We just think a little more carefully (that is, better metacognitive skills) and learn a little better than current LLM systems. That gives us more tries and lets us learn from our few successes in new domains.
Note how much more capable CC and Codex are than the base models (and OpenClaw even if it is still tripping on its own claws and doesn’t really have a use-case yet). We’re just starting to add on to LLMs. Dismissing the potential would not be wise.
I think it’s important that somebody thinks about what happens if LLMs don’t get there and something else does (a la Steve Byrnes) but I’d noticeably become (even) more pessimistic if most AF thinkers like you started discounting LLMs and working elsewhere. The AF perspective seems really valuable for aligning LLM AGI even if the original AF approaches won’t really work in this domain.
And perhaps effectively banning LLMs would cause a chilling effect much greater than if they were merely a research bet that didn’t work out. It would be a signal that AGI research is something the world does not want to see in any form, turning funding and talent away from the whole endeavor.
This is something I have some hope in (cf. https://www.lesswrong.com/posts/uBs6RJYtQbxZCRxSk/adam_scholl-s-shortform?commentId=fz5N4xn7Lema7RQJ9 ). A model that I got from @Davidmanheim (though I may have misunderstood) is that part of how “this is supposed to go” is not that there’s some big climactic treaty that does the final ban; but rather that there’s a treaty that bans something, with the stated intent of preventing the creation of AGI. Then they continue banning more stuff that’s in line with that intent, as it becomes clear that they should have banned it / meant to ban it. Or something like that.
I actually disagree with this, and would say that if you believe AI alignment is hard and there isn’t a way to make superhuman AI safe without immense capabilities restraint, then data-center bans are net positive for the following reason:
Even under the assumption that new paradigms are required, training and experiment compute is still helpful because of scale-dependent algorithmic efficiency: algorithmic progress requires training compute to increase, and this accounts for a significant portion of the algorithmic efficiency gains that we get in practice, as Epoch notes below:
For example, @MITFutureTech found that shifting from LSTMs (green) to Modern Transformers (purple) has an efficiency gain that depends on the compute scale:
- At 1e15 FLOP, the gain is 6.3×
- At 3e16 FLOP, the gain is 26×
Naively extrapolating to 1e23 FLOP, the gain is 20,000×!
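To make the "naive extrapolation" concrete, here is a minimal sketch that fits a power law (gain ∝ compute^k) through the two quoted points and extends it to 1e23 FLOP. The power-law form and the helper name `power_law_gain` are my assumptions for illustration; the quoted thread doesn't say how its extrapolation was done, and the two inputs are the rounded figures above, so this need not reproduce the 20,000× number exactly.

```python
import math

def power_law_gain(c1, g1, c2, g2, target):
    """Extrapolate an efficiency gain to `target` compute.

    Assumes gain = a * compute**k (a straight line in log-log space),
    fitted through the two points (c1, g1) and (c2, g2).
    Returns (extrapolated_gain, k).
    """
    k = math.log(g2 / g1) / math.log(c2 / c1)  # log-log slope
    return g2 * (target / c2) ** k, k

# The two data points quoted in the thread (rounded values).
gain, exponent = power_law_gain(1e15, 6.3, 3e16, 26.0, 1e23)

print(f"fitted exponent: {exponent:.3f}")
print(f"extrapolated gain at 1e23 FLOP: {gain:,.0f}x")
```

With these rounded inputs the result comes out on the order of 10^4, i.e. the same order of magnitude as the quoted 20,000×. The qualitative point is unchanged either way: under this kind of fit, the efficiency advantage keeps growing with compute scale.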
Also, the AI Futures Model argues for a 4x slowdown (this has to be appropriately timed, though even later pauses slow down takeoff).
You’d need the alternative workable approach to not be basically runnable on GPUs, which is maybe plausible, but seems optimistic?
(E.g. anything that can run on a computer would most likely profit quite a bit from the cheap GPU compute even if it’s overall more complex and the current optimizations aren’t as targeted)
Yeah, strategic planning under massive uncertainty is mostly guesswork.
My preferred policy is a halt. (And not a short one, because I figure ending the halt means we have an excellent chance of dying.) Anthropic’s preferred policy appears to be “try to build a better-than-replacement superintelligence before someone builds an awful one.” (Assuming I understand their actions and writing correctly.) Other people are all in on trying to find some way to improve how much models like happy, thriving humans. Who’s right? None of us know all the details about how this will play out.
Banning data centers would be more promising if it actually affected enough countries to make a difference. Ideally, I would like to see a worldwide frontier training ban with teeth, enforced by at least the US and China. I think this might buy us decades with humans in control of what happens to us, if we’re lucky.
But my model is very much “How much time can we buy?”
Hm, my disagreement with this mental model is that I view current models as already helpful on research, and the further iterations on those models which AI companies will acquire over the next couple years are going to substantially improve on that. Even if LLMs aren’t AGI-complete, in that they can’t be “boosted” to AGI, it is likely that given the ability to point a thousand automated researchers at foundational problems they’ll… just find that alternate architecture if it exists. This is part of what fuels my shorter timelines: to me, they haven’t had to reach far at all yet. When you have that many GPUs to run copies of Claude/ChatGPT you can throw some at wide scattershot in the hope of an advantage in the race or, more optimistically, an advantage in alignment.
As well, I have the uncertainty of whether LLMs need to be AGI-complete to still fulfill many investors’ hopes and dreams. Like if OpenAI/Anthropic stalls out on investment in datacenters due to lowered confidence, it chokes and perhaps sells off a bunch, but then hires N-thousand software engineers eager for a job to chomp up massive parts of the industry using Claude 5.9-super-duper and becomes a giant à la Google/Apple/Microsoft regardless. That is, while it’d be a “winter” in terms of far lower mania, it wouldn’t really stop them from pursuing their dreams too harshly.
(Though perhaps I’m underestimating how hard they’d falter, like I know Dario said Anthropic was being cautious to avoid collapsing if they overestimate growth, and OpenAI was being less so? I don’t know what constraints they have that might lead to aggressive clawback or other treatment)
Wouldn’t most other alternative routes to AGI also need GPUs? I would expect having fewer GPU datacenters available to also make it harder to pursue neuromorphic AI, since scaling it up means putting a large amount of compute into it.
I think focusing on the risk in the “AGI achieved” branch, which is unlikely given the LLM paradigm, obscures the fact that there’s x-risk in the “aggressively RL an LLM paradigm in a narrow branch that has sufficiently powerful actuators” branch. The labs are locked in an RL race to the bottom now, and it’s not clear to me that a narrow ASI with sufficiently good coding ability is handleable.
My preferred policy is “we’ll nuke you if you can’t prove you’ve destroyed every CPU above 50m transistors in your territory. This is now your national priority or we launch in two months.” But holy shit is that not on the table: the people who could institute such a policy know how drastic destroying that much good compute is, aren’t convinced robot swarms are doom rather than tools, and have urgent international conflicts they are constantly preparing for. A nuke threat like that would be difficult to even make believable, to put it lightly.
They’re already looking in other directions. Grad student descent takes time and transformers really are quite hard to beat. Most improvements end up turning out to be incremental on top of transformers.
Governments are not going to ban or obstruct data center buildout, given the immense power of Big Tech, the vast amounts of finance involved (which are going to increase), and geopolitical competition.
If it needs to be done, it cannot be done by legal methods, by lobbying or campaigning, but only by Force.
“One man’s Modus Tollens is another man’s Modus Ponens”.
..makes me wonder if 7 people were like “I don’t necessarily disagree, but let me strongly discourage this person from discussing violence in public”
as if violence was an actual 4D chess strategy that could actually work but only if we kept silent about nuking data centers after the Yudko’s unfortunate blunder, as if the murder of Brian Thompson improved healthcare regulation, Suchir Balaji taking his own life stopped the LLM scaling race, Russia became the economic tiger of the 21st century after the quick reunification campaign that went so well, Hamas is now soo de-extremified after the genocide, killing Osama bin Laden stopped all terrorism, the Tigray peace has held forever and ever with no hiccups this year, .. and if black lives matter is now more true than before, it’s not as if the people of the movement killed George Floyd and Breonna Taylor themselves to prove a point..
anyway, live long and midi-chlorians be with you to rule them all
Yeah, I also think that AGI might be only a skill away.
That’s mostly right. I’m iterating on a writeup of the point here: https://docs.google.com/document/d/1EYiUz8fiTe1w0_vt9OHSY_SfIkbjhevwNlsKhn1quZc/edit?usp=sharing . Happy for feedback until it’s done. (And hopefully I remember to come back and replace this link with the LW post when it’s done.)