I agree, and I’m glad to hear you express this. Continual learning might be the missing piece for takeoff because it can fill arbitrary gaps.
And it might not even need to work a lot better than current systems before it’s pretty dangerous. I really hope we see a slower takeoff with limited continual learning that’s good enough to be scary but not good enough for fast takeoff.
I need to write up something like the below as a post, but in the meantime, here’s a dashed-off version.
ReasoningBank (Google) and SEAL (MIT) are just two examples of the large amount of work going into memory systems. They all have sharp limitations right now, but also definitely add capability. It really doesn’t look like any breakthroughs are necessary, just improvements.
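For a sense of what these systems have in common, here is a minimal sketch of the general loop, not the actual ReasoningBank or SEAL implementation: distill a reusable lesson from each episode and retrieve relevant lessons into the prompt for the next task. `call_llm` is a hypothetical stand-in for a model API call.

```python
# Minimal sketch of the general shape of LLM memory systems (in the spirit of,
# but not reproducing, ReasoningBank or SEAL): distill a reusable lesson from
# each episode and retrieve relevant lessons for the next task.
from dataclasses import dataclass, field


def call_llm(prompt: str) -> str:
    # Placeholder: a real agent would call an LLM API here.
    return f"<model output for: {prompt[:40]}...>"


@dataclass
class MemoryStore:
    lessons: list[str] = field(default_factory=list)

    def add(self, lesson: str) -> None:
        self.lessons.append(lesson)

    def retrieve(self, task: str, k: int = 3) -> list[str]:
        # Crude keyword-overlap retrieval; real systems use embeddings.
        words = set(task.lower().split())
        scored = sorted(self.lessons,
                        key=lambda lesson: -len(words & set(lesson.lower().split())))
        return scored[:k]


def run_task(task: str, memory: MemoryStore) -> str:
    # Condition the agent on lessons distilled from earlier episodes.
    lessons = memory.retrieve(task)
    answer = call_llm(f"Relevant lessons: {lessons}\nTask: {task}")
    # Then distill what worked (or failed) into a new reusable lesson.
    memory.add(call_llm(f"Task: {task}\nAnswer: {answer}\nState one reusable lesson."))
    return answer
```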
We should be asking whether next-gen LLMs are takeover-capable with scaffolding, including memory and tools. That answer is scarier. There seems to be an assumption that scaffolding doesn’t work very well, since research has moved away from it. But Google’s co-scientist project, reputedly matching top-tier research teams at creating new theories from large empirical literatures, indicates that scaffolding is quite useful in at least some ways: in particular, for addressing LLMs’ notorious lack of taste or judgment by cueing the model to self-critique and evolve its theories in response to evidence, something like what humans do for complex or important judgments.
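To make that concrete, here is a minimal sketch of that kind of critique-and-revise scaffold. It is not Google’s actual co-scientist pipeline, just the general shape of the loop; `call_llm` is a hypothetical stand-in for a model API call.

```python
# Hedged sketch of a critique-and-revise scaffold: propose a theory, have the
# model critique it against the evidence, then revise, for a few rounds.
# Not any published system's implementation; call_llm is a placeholder.
def call_llm(prompt: str) -> str:
    # Placeholder: a real scaffold would call an LLM API here.
    return f"<model output for: {prompt[:40]}...>"


def evolve_theory(evidence: list[str], rounds: int = 3) -> str:
    evidence_text = "\n".join(evidence)
    theory = call_llm(f"Evidence:\n{evidence_text}\nPropose a theory that explains it.")
    for _ in range(rounds):
        # Cue self-critique, the judgment step a raw model tends to skip unprompted.
        critique = call_llm(
            f"Evidence:\n{evidence_text}\nTheory: {theory}\n"
            "List the weakest points of this theory."
        )
        # Revise the theory in response to its own critique and the evidence.
        theory = call_llm(
            f"Evidence:\n{evidence_text}\nTheory: {theory}\nCritique: {critique}\n"
            "Revise the theory to address the critique."
        )
    return theory
```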
In Capabilities and alignment of LLM cognitive architectures I tried to lay out the case for why LLMs might be nearly AGI, needing only memory, better executive function, and a little more progress on the core LLMs. Human executive function is an important form of “cognitive dark matter”, subtle stuff we have and LLMs lack. Humans learn our executive function slowly and painfully; LLMs with continual learning could do the same. This argument for short timelines being too plausible is my best shot to date, but I feel I’m still failing to convey why I think this is far more possible than most other safety researchers think.
Should this happen soon, I hope that limitations in memory, reasoning, and continual learning will give us some sharp warning shots. It seems fairly likely we’ll have useful but limited versions of those for a little while before we get versions good enough for rapid learning and takeoff.
But “a little while” could be a few months.
I don’t like this possibility and very much hope it is somehow totally wrong. But I haven’t seen any convincing counterarguments despite spending a lot of time looking for them and steelmanning. “Maybe not”, “people don’t do stuff”, and “progress is usually slower than it could theoretically be” are the best I’ve found so far. “LLMs are fundamentally limited” arguments don’t address the potential unblocking or synergistic effects that memory and continual learning could have on total intelligence/competence, as they do for humans.
So I think we’re probably okay for another generation or so of LLMs plus the expected scaffolding and memory; but we probably shouldn’t be too sure.
Sorry again for this being dashed off; it deserves a more careful, up-to-date writeup than this or the linked arguments.
HBM size increases (per scale-up world) and the IMO results are the cruxes for my argument, though (regarding the next generation of LLMs being potentially takeoff-capable). It’s not about scaffolding in general.
The 8-chip servers are not just much smaller than GB200 NVL72 (let alone Ironwood), but smaller than the compute-optimal size for dense models at even 2024 levels of training compute (about 1T active params), so MoE models have to push the number of active params substantially below what’s compute optimal (on pain of overly slow/expensive inference and RLVR training). But with 20-50 TB of HBM, this constraint is lifted almost completely, and MoE models can soak up the spare HBM above the compute-optimal active params (in the form of total params) without incurring much overhead, as long as the total params take up less than about half of a scale-up world (or two).
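A rough back-of-the-envelope version of that HBM arithmetic, with approximate, illustrative hardware figures (roughly 141 GB of HBM per H200-class chip and roughly 13 TB per GB200 NVL72 rack; the exact numbers are not load-bearing):

```python
# Back-of-the-envelope HBM arithmetic; the hardware figures are approximate
# and only illustrative of the argument above.

def weights_gb(params_trillions: float, bytes_per_param: float = 1.0) -> float:
    """GB of HBM needed just to hold the weights (FP8 = 1 byte per param)."""
    return params_trillions * 1e12 * bytes_per_param / 2**30

eight_chip_hbm = 8 * 141        # ~1,128 GB in an 8-chip H200-class server
nvl72_hbm = 72 * 186            # ~13,400 GB in one GB200 NVL72 scale-up world
next_gen_hbm = 30 * 1024        # illustrative point in the 20-50 TB range

print(weights_gb(1.0))                      # ~931 GB for ~1T active params at FP8
print(eight_chip_hbm - weights_gb(1.0))     # ~200 GB left for KV cache and activations,
                                            # hence slow/expensive inference and RLVR
# With 20-50 TB per scale-up world the constraint lifts: ~1T active params fits
# easily, and a MoE model can soak up spare HBM as total (mostly inactive) params
# while staying under roughly half of one or two worlds.
print(next_gen_hbm / 2 - weights_gb(1.0))   # ~14,400 GB of headroom for extra total params
```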
The IMO results strongly suggest that the current manual methods of adaptation are good enough to tackle any given problem domain (one that is sufficiently specialized, but including domains where only informal, fuzzy feedback is available) at the performance level of the most capable humans. So plausibly all that remains is automating something that already works, rather than developing something new.
And in 2026 there is a confluence of these factors, along with continual learning being in the spotlight, so the probability of a significant advance seems unusually high, beyond what hardware scaling at 2022-2026 levels (3.5x per year, plus adoption of lower precisions) would suggest on its own, even though that scaling would still be promising (compared to the 2028+ slowdown).