Many ideas in the vicinity of continual learning don’t, by design, involve full fine-tuning where every weight of the model changes, and those that do could probably still be made almost as capable with LoRA. Updating perhaps 100x fewer parameters than the full model (or fewer still) matters in practice: per-user low-rank adapters keep batched processing of many users’ requests working, much as per-request KV-caches do today. So I expect the first methods dubbed “continual learning” to do exactly this.
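As a toy sketch of why this keeps batching viable (names and shapes are my own, not from any real serving framework): a linear layer can apply the one shared base weight to the whole batch, while each request carries only its own small low-rank factors, whose parameter count is tiny next to the base matrix.

```python
import numpy as np

def batched_lora_linear(W, x, A, B, scale=1.0):
    """Shared base weight plus a per-request low-rank (LoRA-style) update.

    W: (d_out, d_in)      base weight, shared by every request in the batch
    x: (batch, d_in)      one activation vector per request
    A: (batch, r, d_in)   per-request LoRA "down" factors
    B: (batch, d_out, r)  per-request LoRA "up" factors
    """
    base = x @ W.T                                # one batched matmul over shared weights
    delta = np.einsum('bor,bri,bi->bo', B, A, x)  # tiny per-request correction
    return base + scale * delta

rng = np.random.default_rng(0)
d_in, d_out, r, batch = 64, 32, 4, 3
W = rng.normal(size=(d_out, d_in))
x = rng.normal(size=(batch, d_in))
A = rng.normal(size=(batch, r, d_in))
B = rng.normal(size=(batch, d_out, r))

y = batched_lora_linear(W, x, A, B)

# Request 0 sees the shared base weights plus only its own update:
y0 = x[0] @ W.T + B[0] @ (A[0] @ x[0])
assert np.allclose(y[0], y0)

# Per-request state is r*(d_in + d_out) numbers vs d_in*d_out for the base:
print(A[0].size + B[0].size, "vs", W.size)
```

The point of the sketch is that the expensive tensor (`W`) never forks per user, so requests with different adapters can still share a batch; only the small `A`/`B` factors are per-request, which is the property full-weight updates would destroy.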
Maybe at some point there will be an “agent swarm” use case where all the requests in a batch are working on the same problem for the same user, so the full model can keep being updated in sync for that single problem. But this seems sufficiently niche that it’s not the first thing that gets deployed, and it’s only relevant at all if the continual learning method updates the full weights in the first place.
Another issue with continual learning is that it likely doesn’t match the serving efficiency of today’s cloud-based LLMs.