Absolutely. We might not need “real” continual learning to achieve the same ends and reach better-than-human AGI. This is one more reason to be prepared for short timelines. But I think you’re overestimating how hard it is to implement some amount of “real” weight-based continual learning. That just adds to the danger.
Fortunately, both have limitations, so we might get a slower progression from parahuman AI to full AGI to ASI. This would marginally improve our odds of good outcomes.
I recently reviewed different types of continual learning/memory in “LLM AGI will have memory, and memory changes alignment.” I called the thing you’re talking about “context engineering,” following the current terminology in the agent-enthusiast community; I don’t know whether they’ve moved on since. I also discuss a little how fine-tuning can be used for skill acquisition, rather than the skill evocation it’s usually used for. Catastrophic forgetting is a problem if you try to acquire a whole bunch of new skills simultaneously, but it’s (anecdotally) not a problem at all with modern methods if you’re just trying to teach skills relevant to your workflow in a local instance.
I find your model of rapid retraining on sequences interesting; it makes more explicit some vague ideas floating around about the distinction between generations and instances becoming blurred.
Re-training on everything is one big part of how the human brain seems to avoid catastrophic interference. We replay/think about important memories more, which is effectively continually retraining on a wide variety of stuff (including all of the skills and knowledge we use). This avoids catastrophic forgetting for important things, while creating it for unimportant things.
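The replay mechanism described above corresponds to what the ML literature calls rehearsal: mixing previously learned examples back into each fine-tuning batch, sampled in proportion to their importance. Here is a minimal sketch of that idea; all names and the toy data are hypothetical, and each batch would stand in for one gradient step in real fine-tuning.

```python
import random

def rehearsal_batches(new_data, memory_buffer, batch_size=8, replay_frac=0.5,
                      importance=None):
    """Yield fine-tuning batches mixing new examples with replayed old ones.

    Important old examples are replayed often (protecting them from
    catastrophic forgetting); unimportant ones are rarely sampled and fade.
    """
    if importance is None:
        importance = [1.0] * len(memory_buffer)
    n_replay = int(batch_size * replay_frac)
    n_new = batch_size - n_replay
    for i in range(0, len(new_data), n_new):
        batch = list(new_data[i:i + n_new])
        if memory_buffer:
            # Importance-weighted sampling: frequently replayed = retained.
            batch += random.choices(memory_buffer, weights=importance,
                                    k=n_replay)
        random.shuffle(batch)
        yield batch

# Toy usage: 'skills' are just labeled strings here.
old_skills = ["arithmetic", "grammar", "rarely-used-trivia"]
new_skills = ["company-style-guide"] * 10
for batch in rehearsal_batches(new_skills, old_skills,
                               importance=[5.0, 1.0, 0.1]):
    pass  # in real training, each batch would drive one update step
```

The importance weights play the role of “we replay/think about important memories more”: the low-weight item is almost never rehearsed, so it is the one that gets forgotten.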
But there’s a big problem with what you suggest: proprietary data. Businesses can’t allow training on all of their internal usage, and individuals often don’t want it either. So that route will be of limited use (although definitely of some use, and possibly very nontrivial even with limited data!).
So the hybrid model I’d imagine is: locally fine-tune a model on the skills you need it to have, filtering sensitive information out of the training set (of course this won’t work that well, but perhaps it’s enough to shift legal responsibility onto the developer, or to leave it ambiguous enough for businesses to buy in). Then the next generation of the model won’t need to be fine-tuned on those skills, and you can move on to fine-tuning on more specific skills, spending your “budget” of catastrophic forgetting elsewhere.
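The filtering step above can be sketched crudely. This is an illustration only, assuming simple regex redaction (the patterns here are hypothetical and nowhere near exhaustive, which is exactly why the text hedges that “this won’t work that well”):

```python
import re

# Crude patterns for content a business might want stripped before local
# fine-tuning. Illustrative only; real pipelines would use far more.
SENSITIVE = [
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),             # email addresses
    re.compile(r"\b(?:\d[ -]?){13,16}\b"),                  # card-like numbers
    re.compile(r"(?i)\b(api[_-]?key|secret|password)\b.*"), # credential lines
]

def filter_example(text):
    """Return text with sensitive spans redacted, or None to drop it."""
    hits = sum(bool(p.search(text)) for p in SENSITIVE)
    if hits >= 2:  # too much sensitive content: exclude the example entirely
        return None
    for p in SENSITIVE:
        text = p.sub("[REDACTED]", text)
    return text

dataset = [
    "Refactor the billing module to use the new retry policy.",
    "password: hunter2, card 4111 1111 1111 1111",
]
clean = [t for t in (filter_example(t) for t in dataset) if t is not None]
```

The first example survives untouched and the second is dropped, which is the intended shape of the trade-off: keep the workflow-skill data, discard the examples that are mostly sensitive.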
Anyway, I think your point about training on deployment sequences using RL to improve memory use is quite important. The key feature is that it’s easy. Developers have lots of that data (minus the large amount that users don’t want in the training set). Applying a reward model to that data seems pretty easy, as you describe.
So, one more route to high capabilities. Better get on that whole alignment thing!
Thanks for replying. I definitely do expect real continual learning to be developed too, to be clear. I don’t know on what timeline, but if there’s any benefit to be gained by it, it will remain an R&D target and eventually be cracked, possibly by automated R&D. My main argument is that theoretical breakthroughs aren’t required to get most of the supposed benefits of continual learning.
I think context engineering is a fair description of what I’m talking about, yes! Except that this is explicitly the subset of it where we get the AI to intelligently handle its own context. I hadn’t heard that about fine-tuning narrow skills; interesting if true. Do you have a source going into that?
Regarding proprietary data, I am talking about a system where the information (memories + documentation) is kept out of the weights, and out of the training, which seems like it’s much better for proprietary data. The data never has to be shared, and to get the same performance again, you just load the memories into context without needing to fine-tune. Did I misunderstand you here?
Regarding the training, I’m not actually suggesting training on data produced at runtime, at least not in any way different from what happens today. I’m saying that you can take the post-training already being done (provide the model with some task, reward success) and expand it so that models learn to pass information between runs: provide the model with some task, let it write notes afterward, run another task with those notes, then reward based on success on both tasks combined.
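That two-episode setup can be made concrete. Everything below is hypothetical scaffolding (the `run`/`write_notes` calls stand in for real model rollouts); the point is only where the reward signal attaches, namely to the combined outcome of both runs:

```python
def two_episode_rollout(model, task_a, task_b, score):
    """One training sample for learning to pass notes between runs.

    The model is rewarded on the COMBINED success of both tasks, so an RL
    update favors notes that actually help the second run.
    """
    out_a = model.run(task_a, notes=None)     # first episode, no memory
    notes = model.write_notes(task_a, out_a)  # model decides what to keep
    out_b = model.run(task_b, notes=notes)    # second episode, with notes
    reward = score(task_a, out_a) + score(task_b, out_b)
    return notes, reward                      # reward drives the RL update

# Stub model and scorer, just to show the control flow.
class StubModel:
    def run(self, task, notes):
        return (task + " | " + notes) if notes else task
    def write_notes(self, task, output):
        return "lesson from " + task

model = StubModel()
notes, reward = two_episode_rollout(model, "task-1", "task-2",
                                    lambda task, out: 1.0)
```

Because the note-writing step sits inside the rewarded trajectory, ordinary post-training machinery suffices; no new runtime data collection is needed, matching the claim above.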
Interesting thoughts on replay with human memories; I think I agree. It effectively means humans select what to remember using our full(?) intelligence, which is interesting to think about in light of having LLMs select what to remember by writing notes (and in light of why designing state space models that learn implicitly, rather than explicitly, what to keep has been so hard).