I don’t think I understand. I think of the data for continuous learning as coming from deployment—sessions and evaluations of what’s worth learning/remembering. Are you referring to data appropriate for learning-to-learn in initial training?
I agree that that’s scarse. And it would be nice to have. The Transformers and Hope architectures need to learn-to-learn, and humans do too, to some extent. But to some extent, we have built-in emotional registers for what’s important to learn, what’s surprising and important. Loosely similar mechanisms might work for continuous learning.
Yes, I am referring to the lack of learning-to-learn data during initial training.
Your point that humans have built-in mechanisms for continual learning is similar to what I’m saying about inductive biases: if we don’t have the data to train continual learning into models, we need to build it into the architecture.
However, I think the ‘data’ from which humans learn during development (on-policy interactions with the environment with constant feedback and something like rewards) is much more aligned to continual learning than books and pdfs.
I don’t think I understand. I think of the data for continuous learning as coming from deployment—sessions and evaluations of what’s worth learning/remembering. Are you referring to data appropriate for learning-to-learn in initial training?
I agree that that’s scarse. And it would be nice to have. The Transformers and Hope architectures need to learn-to-learn, and humans do too, to some extent. But to some extent, we have built-in emotional registers for what’s important to learn, what’s surprising and important. Loosely similar mechanisms might work for continuous learning.
Yes, I am referring to the lack of learning-to-learn data during initial training.
Your point that humans have built-in mechanisms for continual learning is similar to what I’m saying about inductive biases: if we don’t have the data to train continual learning into models, we need to build it into the architecture.
However, I think the ‘data’ from which humans learn during development (on-policy interactions with the environment with constant feedback and something like rewards) is much more aligned to continual learning than books and pdfs.