I believe humans have much lower memory capacities because we perform continual learning. We experience catastrophic forgetting because we’re learning on a self-selected narrow dataset. Models titrate their learning rates and intermix all of their training examples, so as to preserve previous learning. In humans, learning is always-on and clustered by topic, so it tends to overwrite previous knowledge unless that knowledge is replayed and re-learned. This is workable because we tend to re-use important skills and knowledge, but it does drastically reduce our memory capacity.
It would be fairly straightforward to emulate this setup for LLMs, at the cost of that high memory capacity. But that’s a large cost to pay.
One component of fluid intelligence that models seem to particularly lack, probably because they’re rarely explicitly present in either the base corpus or RL training sets, is Human-like metacognitive skills.
I believe humans have much lower memory capacities because we perform continual learning. We experience catastrophic forgetting because we’re learning on a self-selected narrow dataset. Models titrate their learning rates and intermix all of their training examples, so as to preserve previous learning. In humans, learning is always-on and clustered by topic, so it tends to overwrite previous knowledge unless that knowledge is replayed and re-learned. This is workable because we tend to re-use important skills and knowledge, but it does drastically reduce our memory capacity.
It would be fairly straightforward to emulate this setup for LLMs, at the cost of that high memory capacity. But that’s a large cost to pay.
One component of fluid intelligence that models seem to particularly lack, probably because they’re rarely explicitly present in either the base corpus or RL training sets, is Human-like metacognitive skills.