> I will provide two estimates that both suggest it would be feasible to have at least ~1 million copies of a model learning in parallel at human speed. This corresponds to roughly 2,700 human-equivalent years of learning per day, since 1 million days is about 2,700 years.
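(A quick sanity check on the day-to-year conversion, assuming a 365-day year; leap years don't change the picture:

$$
\frac{1{,}000{,}000\ \text{days}}{365\ \text{days/year}} \approx 2{,}740\ \text{years}
$$

so the headline figure is closer to 2,700 than 2,500.)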
This is potentially a little misleading, I think?
A human who is learning does *active exploration*. They seek out things they don't know and try to find their blind spots. They loop back and try to connect pieces of their knowledge that were previously unconnected. They balance exploration and exploitation; the sketch below illustrates that tradeoff. 2,700 years of this is a lot of time to dive deeply into individual topics, write new papers on them, talk with other people, connect them carefully, and so on.
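To make the exploration-exploitation contrast concrete, here is a minimal, purely illustrative sketch (none of this is from the post): an epsilon-greedy learner that deliberately spends a fraction of its time probing topics it knows least about, instead of always returning to the topic with the best known payoff.

```python
import random

# Minimal epsilon-greedy sketch: an active learner that sometimes
# explores topics at random rather than always exploiting the one
# that currently looks best. All numbers here are made up.

def epsilon_greedy(true_payoffs, steps=10_000, epsilon=0.1):
    n = len(true_payoffs)
    counts = [0] * n        # how often each topic was tried
    estimates = [0.0] * n   # running payoff estimate per topic
    for _ in range(steps):
        if random.random() < epsilon:
            arm = random.randrange(n)  # explore: pick a random topic
        else:
            arm = max(range(n), key=lambda i: estimates[i])  # exploit
        reward = random.gauss(true_payoffs[arm], 1.0)  # noisy feedback
        counts[arm] += 1
        # incremental mean update of the estimate
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts

print(epsilon_greedy([0.2, 0.5, 0.9]))
```

The point of the `epsilon` parameter is exactly the thing described above: a fixed budget of deliberate exploration, which is what keeps blind spots from going unvisited forever.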
An LLM learning from feedback on what it says doesn't do any of this. It isn't pursuing long-running threads over those 2,700 equivalent years, or seeking out blind spots in its knowledge. It isn't trying to balance exploration and exploitation at all, because it's just trying to provide accurate answers to the questions given to it. The feedback even works *against* finding blind spots: people will disproportionately ask LLMs about the things they predict the models will do well at, rather than the things they'll do poorly at, which badly limits the range of skills the model picks up.
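A toy simulation of that selection effect (entirely hypothetical, with made-up numbers, not anything from the post): if query volume tracks perceived competence, the feedback concentrates on what the model is already good at, and the gap widens rather than closes.

```python
import random

# Hypothetical toy model of the anti-blind-spot selection effect:
# users ask about a skill roughly in proportion to how competent they
# perceive the model to be, so weak skills receive little feedback.

def simulate(initial_competence, queries=100_000, gain=1e-5):
    competence = dict(initial_competence)
    names = list(competence)
    for _ in range(queries):
        # query probability tracks current (perceived) competence
        weights = [competence[s] for s in names]
        asked = random.choices(names, weights=weights)[0]
        competence[asked] += gain  # each query yields a small learning signal
    return competence

print(simulate({"strong": 0.9, "medium": 0.5, "weak": 0.1}))
# "weak" starts with only ~7% of the queries and its share shrinks
# further as "strong" compounds its lead.
```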
I don't know what that looks like in the limit. You could maintain that it's frighteningly smart, or still really stupid, or (more likely) both, depending on the topic. But I'm not sure human-equivalent years will give you a useful intuition for what this looks like at all. It's… some large amount of knowledge that is attainable from this information, but it isn't human-equivalent. It's just a different kind of thing.