If continual learning is cracked before jailbreak resistance, and the deployment model of “the same weights are used for inference for all customers” holds, the world of corporate espionage is going to get wild.
Right now, you need to be careful not to combine, in a single context window, sensitive information, untrusted external information, AND a method of sending arbitrary data to the outside world, since the LLM might be tricked by the external content into exfiltrating the sensitive data. Any two of those, however, are fine.
If (sample-efficient) continual learning is cracked, and models are still shared across multiple customers, you will need to be sure never to share sensitive information with a model that will learn that information and then be available for your competitors to run inference on, OR to fully trust any model that has learned from possibly-adversarial-to-you data.
And if continual learning is cracked without major architectural changes, giving up on using the same model for all customers means giving up on many of the benefits of batching.
If you only update weights in LoRAs, they can work like KV-caches: different for each request and taking up only a reasonable amount of memory, preserving all the benefits of batching while keeping users isolated from each other.
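To make the analogy concrete, here is a minimal sketch (hypothetical, not any specific serving framework) of why per-request LoRAs preserve batching: the dense weight matrix is shared by every request in the batch, and only the small low-rank factors differ per request, just as each request already carries its own KV-cache.

```python
import numpy as np

d_model, rank, batch = 8, 2, 3
rng = np.random.default_rng(0)

W = rng.normal(size=(d_model, d_model))      # base weights, shared by all requests
A = rng.normal(size=(batch, d_model, rank))  # per-request LoRA factors
B = rng.normal(size=(batch, rank, d_model))  # (each request's "learned" state)

x = rng.normal(size=(batch, d_model))        # one token per request, batched

# The big shared matmul is batched as usual; each request's adapter adds only
# O(d_model * rank) extra memory and compute, like a KV-cache entry.
y = x @ W + np.einsum("bd,bdr,bre->be", x, A, B)

# Equivalent to giving each request its own fully merged weights,
# without ever materializing a separate d_model x d_model matrix per user.
for i in range(batch):
    assert np.allclose(y[i], x[i] @ (W + A[i] @ B[i]))
```

The design point is that memory per request scales with the adapter rank, not with the full weight matrix, so isolation between users does not cost you the shared-weights batching that makes inference cheap.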
If something LoRA-shaped usefully cracks continual learning, a lot of things in general are going to get very crazy very quickly.