If you only update weights in LoRAs, they can be treated like KV-caches: different for each request, taking up only a modest amount of memory, preserving all the benefits of batching, and keeping users isolated from each other.
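A minimal sketch of what that could look like, assuming a PyTorch-style serving path where the base weight is shared across the batch and each request carries only its own low-rank factors (the class and argument names here are illustrative, not from any particular serving stack):

```python
import torch

class PerRequestLoRALinear(torch.nn.Module):
    def __init__(self, d_in: int, d_out: int):
        super().__init__()
        # Shared, frozen base weight: one copy serves the whole batch.
        self.weight = torch.nn.Parameter(
            torch.randn(d_out, d_in) * 0.02, requires_grad=False
        )

    def forward(self, x, lora_a, lora_b):
        # x:      (batch, d_in)     -- one row per request
        # lora_a: (batch, r, d_in)  -- per-request low-rank factors
        # lora_b: (batch, d_out, r)
        base = x @ self.weight.T                        # shared compute, batched
        delta = torch.einsum("bri,bi->br", lora_a, x)   # project input to rank r
        delta = torch.einsum("bor,br->bo", lora_b, delta)
        return base + delta

# Per-request state is ~ r * (d_in + d_out) floats rather than d_in * d_out,
# which is what makes "a LoRA per user" feel closer to a KV-cache than to a
# full copy of the model.
batch, d_in, d_out, r = 4, 512, 512, 8
layer = PerRequestLoRALinear(d_in, d_out)
x = torch.randn(batch, d_in)
a = torch.randn(batch, r, d_in) * 0.01
b = torch.randn(batch, d_out, r) * 0.01
y = layer(x, a, b)  # (4, 512): same base weights, per-request adapters
```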
If something LoRA-shaped usefully cracks continual learning, a lot of things in general are going to get very crazy very quickly.