I agree with you that a system that learns efficiently can foom (improve rapidly with little warning). This is why I’ve been concerned with short timelines for LLM-based systems if they have online, self-directed learning added in the form of RAG and/or fine-tuning (e.g. LLM AGI will have memory).
My hope for those systems, and for the more brainlike AGI you’re addressing here, is that they learn badly before they learn well. I hope that seeing a system learn (and thereby self-improve) before one’s eyes brings the gravity of the situation into focus. Most of humanity thinks hard about things only when they’re immediate and obviously important. So the speed of takeoff is critical.
I expect LLM-based AGI to arrive before strictly brainlike AGI, but I actually agree with you that LLMs themselves are hitting a wall and won’t progress (at least quickly) to really transformative AGI. I am now an LLM plateauist. Yet I still think that LLMs cleverly scaffolded into cognitive architectures can achieve AGI. I think this is probably possible even with current LLMs, memory systems, and clever prompting (as in Google’s co-scientist) once all of those are integrated.
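To make the kind of scaffolding I have in mind concrete, here’s a minimal, hypothetical sketch of an LLM agent whose “learning” lives in an external memory it retrieves from and writes back to, rather than in weight updates. The `call_llm` stub, the bag-of-words retrieval, and every name below are illustrative assumptions, not any lab’s actual system.

```python
# Hypothetical sketch: an LLM scaffolded with an external memory it can read from
# (retrieval) and append to (self-directed "learning" without weight updates).
# call_llm is a stand-in for a real model API; retrieval is a toy bag-of-words match.

from collections import Counter
import math


def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM call (e.g., a chat-completion API)."""
    return f"[model response to: {prompt[:60]}...]"


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real system would use a vector model."""
    return Counter(text.lower().split())


def similarity(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class MemoryStore:
    """External memory the agent retrieves from and appends to."""

    def __init__(self):
        self.entries: list[tuple[Counter, str]] = []

    def add(self, note: str) -> None:
        self.entries.append((embed(note), note))

    def retrieve(self, query: str, k: int = 3) -> list[str]:
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: similarity(q, e[0]), reverse=True)
        return [note for _, note in ranked[:k]]


def agent_step(task: str, memory: MemoryStore) -> str:
    """One loop iteration: retrieve relevant notes, act, then store what was learned."""
    context = "\n".join(memory.retrieve(task))
    answer = call_llm(f"Relevant notes:\n{context}\n\nTask: {task}")
    # Self-directed learning: the agent decides what to remember for next time.
    lesson = call_llm(f"Summarize what was learned while doing: {task}\n{answer}")
    memory.add(lesson)
    return answer


if __name__ == "__main__":
    mem = MemoryStore()
    print(agent_step("Plan an experiment on cortical learning models", mem))
    print(agent_step("Refine the plan using prior lessons", mem))
```

The point of the sketch is just that the self-improvement happens in the memory store and the prompting loop; the base model can stay frozen, which is why current LLMs might already be an adequate substrate.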
But what level of AGI those systems start at, and how quickly they progress beyond human intelligence, matter a lot. That’s why your prediction of rapid progress for brainlike AGI is alarming and makes me think we might be better off trying to achieve AGI with scaffolded LLMs. I think early LLM-based agents will be general in that they can learn about any new problem or skill, but they might start below the human level of general intelligence and progress slowly beyond it. That might be too optimistic, but I’m pinning much of my hope for successful alignment here, because I do not think the ease of aligning LLMs means that fully general agents based on them will be easy to align.
Such an architecture might advance slowly because it shares some weaknesses of LLMs and, through them, some limitations of human thought and human learning. I very much hope that the brainlike AGI you’re envisioning will also share those weaknesses, giving us at least a hope of controlling it long enough to align it before it’s well beyond our capabilities.
You don’t think that slow progression of brainlike AGI is likely. That’s fascinating, because we share a pretty similar view of brain function. I would think that reproducing cortical learning would require a good deal of work and experimentation, and I wouldn’t expect working out the “algorithm” to happen all at once or to be vastly more efficient than LLMs (since they are optimized for the computers used to simulate them, whereas cortical learning is optimized for the spatially localized processing available to biology). Sharing your reasoning would be an infohazard, so I won’t ask. I will ask you to consider privately whether it isn’t more likely that such systems work badly for a good while before they work well, giving their developers a little time to seriously think about their dangers and how to align them.
Anyway, your concerns about fooming brainlike AGI overlap with many of my concerns about self-teaching LLM agents, and the two share many of the same alignment challenges. LLM agents aren’t currently RL agents, even though the base networks are partially trained with RL; future versions might be closer to model-based RL agents, although I hope that’s too obviously dangerous for the labs to adopt as their first approach. The only real advantage to aligning LLM agents over model-based RL agents seems to be their currently-largely-faithful chains of thought, but that’s easy to lose if developers decide that’s too large an alignment tax to pay.
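To be clear about the contrast I’m drawing, here’s a toy sketch of what I mean by a model-based RL agent: it plans by rolling a learned world model forward and choosing actions that a learned reward model scores highest. The function names, the scalar “state,” and the dynamics below are all illustrative assumptions, not any real system.

```python
# Toy illustration of a model-based RL agent: plan by rolling a learned world
# model forward and picking the action with the highest predicted return.
# Both "learned" functions are hand-written stand-ins for learned models.

import random


def learned_world_model(state: int, action: int) -> int:
    """Stand-in for a learned transition model: predicts the next state."""
    return state + action  # toy dynamics


def learned_reward(state: int) -> float:
    """Stand-in for a learned reward model; in a real agent this is where
    misspecification and reward-hacking worries enter."""
    return -abs(state - 10)  # toy objective: be near state 10


def plan(state: int, actions=(-1, 0, 1), depth: int = 3) -> int:
    """Pick the action whose lookahead rollout has the highest predicted return
    (sum of predicted rewards over the planning horizon)."""

    def rollout_return(s: int, d: int) -> float:
        if d == 0:
            return 0.0
        return max(
            learned_reward(learned_world_model(s, a)) + rollout_return(learned_world_model(s, a), d - 1)
            for a in actions
        )

    return max(
        actions,
        key=lambda a: learned_reward(learned_world_model(state, a))
        + rollout_return(learned_world_model(state, a), depth - 1),
    )


if __name__ == "__main__":
    state = random.randint(0, 5)
    for _ in range(10):
        state = learned_world_model(state, plan(state))
    print("final state:", state)  # the planner converges on the modeled goal state
```

The alignment-relevant difference is that this agent’s goals live in its learned reward model and planning loop, not in anything as human-legible as a chain of thought.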
The speed of takeoff would also seem to modulate alignment difficulty pretty dramatically, so I hope you’re wrong that there’s a breakthrough waiting to be made in understanding the cortical learning algorithm. I spent a lot of time thinking about cortical learning, since I worked for a long time in one of the labs making those detailed models of cortical function. But I spent more time thinking about system-level interactions and dynamics, because it seemed clear to me that the available data and integration techniques (hand-built network simulations that were allowed to vary to an unspecified degree between toy benchmarks) weren’t adequate to constrain detailed models of cortical learning.
Anyway, it seems possible you’re right. I hope there aren’t breakthroughs in either empirical techniques or theory of cortical function soon.
I would think that reproducing cortical learning would require a good deal of work and experimentation, and I wouldn’t expect working out the “algorithm” to happen all at once
I agree; working out the “algorithm” is already happening, and has been for decades. My claim instead is that by the time you can get the algorithm to do something importantly useful and impressive—something that LLMs and deep learning can’t already do much cheaper and better—then you’re almost at ASI. Note that we have not passed this threshold yet (no offense). See §1.7.1.
or to be vastly more efficient than LLMs (since they are optimized for the computers used to simulate them, whereas cortical learning is optimized for the spatially localized processing available to biology)
I think people will try to get the algorithms to work efficiently on computers in the toy-model phase, long before the algorithms are doing anything importantly useful and impressive. Indeed, people are already doing that today (e.g.). So in that gap between “doing something importantly useful and impressive” and ASI, people won’t be starting from scratch on the question of “how do we make this run efficiently on existing chips”; instead, they’ll be building on all the progress they made during the toy-model phase.