Yes, it’s not difficult to run custom continual learning on a modestly sized LLM.
Interestingly, though, people do try to avoid the overhead of gradient-based training when doing so. For example, a recent approach from Sakana uses hypernetworks to generate LoRA adapters instantly: https://pub.sakana.ai/doc-to-lora/.
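
To make the hypernetwork idea concrete, here is a toy sketch of what "generate a LoRA adapter in one forward pass" can look like. This is not Sakana's actual architecture: the single linear hypernetwork, the document embedding input, and all names (`make_hypernetwork`, `generate_lora`, `adapted_forward`) are assumptions made up for illustration, and the hypernetwork weights are random rather than trained.

```python
import random

def matmul(a, b):
    # Plain-Python matrix product of two lists of rows.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def make_hypernetwork(embed_dim, d_out, d_in, rank, seed=0):
    # Hypothetical hypernetwork: two linear maps from a document embedding
    # to the flattened LoRA factors A (rank x d_in) and B (d_out x rank).
    rng = random.Random(seed)
    def rand(rows, cols):
        return [[rng.gauss(0, 0.02) for _ in range(cols)] for _ in range(rows)]
    return {
        "to_A": rand(embed_dim, rank * d_in),
        "to_B": rand(embed_dim, d_out * rank),
        "shape": (d_out, d_in, rank),
    }

def generate_lora(hyper, doc_embedding):
    # One forward pass, no gradient steps: embedding -> (A, B).
    d_out, d_in, rank = hyper["shape"]
    flat_a = matmul([doc_embedding], hyper["to_A"])[0]
    flat_b = matmul([doc_embedding], hyper["to_B"])[0]
    A = [flat_a[i * d_in:(i + 1) * d_in] for i in range(rank)]
    B = [flat_b[i * rank:(i + 1) * rank] for i in range(d_out)]
    return A, B

def adapted_forward(W, A, B, x, scale=1.0):
    # Standard LoRA forward: y = W x + scale * B (A x), with W kept frozen.
    x_col = [[v] for v in x]
    base = matmul(W, x_col)
    low_rank = matmul(B, matmul(A, x_col))
    return [b[0] + scale * l[0] for b, l in zip(base, low_rank)]
```

In a real system the hypernetwork itself would be trained ahead of time, so that the cost of gradient training is paid once rather than per document; at use time each new document only costs the forward pass in `generate_lora`.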