I think a version of this pretty much has to be true for at least a subset of skills/declarative knowledge, like factual knowledge (being a walking Wikipedia) or programming. A large model has read more of Wikipedia, and memorized more of Wikipedia (as well as arXiv, PubMed...), than any single human ever has. One trained on GitHub has also learned more languages in more depth than any human has: a human programmer will be superior on a few specific languages, doubtless, but they will still know less in aggregate. So when it conditions on an individual prompt/human, the imitation will be limited, but that vast pool of knowledge is still there. Across all the prompts one might test, one can elicit more knowledge than an individual human has.
In order for both of the points to be true, that is equivalent to claiming that it cannot tap into the full pool under all possible conditions, including invasive ones like RL training or prompt finetuning, which is to make a truly remarkable universal claim with a heavy burden of proof. Somehow all that knowledge is locked up inside the model parameters, but in so fiendishly encrypted a way that only small subsets—which always just happen to correspond to human subsets—can ever be used in a response...?
So, since at least some modest version is definitely true, the only question is how far it goes. Since the ways in which imitation learning can exceed experts are quite broad and general, it’s hard to see why you would then be able to cavil at any particular point. It just seems like an empirical question of engineering & capabilities about where the model lands in terms of its unshackled capabilities—the sort of thing you can’t really deduce in general, and just have to measure it directly, like prompting or gradient-ascending a Decision Transformer for its highest possible reward trajectory to see what it does.
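The Decision Transformer point can be made concrete with a toy sketch. Everything below is invented for illustration (a linear return-conditioned "policy" stands in for a real Decision Transformer): conditioning on a higher return-to-go elicits better behavior than imitating an average demonstrator, and numerically ascending the conditioning input measures the policy's best achievable performance directly, rather than deducing it.

```python
# Toy illustration (not a real Decision Transformer): a return-conditioned
# policy that, like a DT, chooses actions given a target "return-to-go" (rtg).
# We "elicit" its best behavior by conditioning on, then gradient-ascending, rtg.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical bandit-style setting: each of 4 actions has a known reward.
action_reward = np.array([0.1, 0.5, 0.9, 0.3])

# Pretend weights learned from imitating mixed-quality demonstrations:
# a higher target return shifts probability mass toward higher-reward actions.
W = action_reward * 4.0          # coupling between target return and action logit
b = rng.normal(0, 0.1, size=4)   # noise from imitating mediocre demonstrators

def expected_reward(rtg):
    """Expected reward of the policy when conditioned on return-to-go `rtg`."""
    logits = W * rtg + b
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return float(p @ action_reward)

# "Prompting": condition on an average vs. a high return-to-go.
mediocre = expected_reward(0.3)  # imitating an average demonstrator
best_seen = expected_reward(1.0) # asking for the best it has seen

# "Gradient ascent" on the conditioning input: numerically ascend rtg to find
# the policy's highest achievable expected reward, instead of guessing it.
rtg, lr, eps = 0.3, 0.5, 1e-4
for _ in range(200):
    grad = (expected_reward(rtg + eps) - expected_reward(rtg - eps)) / (2 * eps)
    rtg += lr * grad

print(mediocre, best_seen, expected_reward(rtg))
```

The point of the sketch is only that the "unshackled" performance is an empirical quantity you read off by optimizing the conditioning, not something the imitation objective itself tells you in advance.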
“which is to make a truly remarkable universal claim with a heavy burden of proof.”
Having thought about this way less than you, it doesn’t seem at first sight as remarkable to me as you say. Note that the claim wouldn’t be that you can’t write a set of prompts that gets you the fully universal reasoner, but that you can’t write a single prompt that gets you this universal reasoner. It doesn’t sound so crazy to me at all that knowledge is dispersed in the network in such a way that, e.g., some knowledge can only be accessed if the prompt has the feel of being generated by an American gun-rights activist, or something similar. By the way, we generate a few alternative hypotheses here.
“In order for both of the points to be true, that is equivalent to claiming that it cannot tap into the full pool under all possible conditions”
I might be misunderstanding, but it seems like this is the opposite of both my implications 1 and 2? Implication 1 is that it can tap into this in sufficiently out-of-distribution contexts; implication 2 is that with fine-tuning you can make it tap into it fairly quickly in specific contexts. EDIT: oh, maybe you simply made a typo and meant to say “to be false”.
By the way, we write up some alternative hypotheses here. All of this is based on probably less than an hour of thinking.