Are human imitators superhuman models with explicit constraints on capabilities?

epistemic status: A speculative hypothesis; I don't know if this already exists. The only real evidence I have for it is a vague analogy to another speculative (though less speculative) hypothesis. I don't think this is particularly likely to be true, having thought about it for about a minute. Edit: having thought about it for two minutes, it seems somewhat more likely.


There is a hypothesis that has been floating around in the context of explaining double descent: what so-called over-parameterized models do is store two algorithms in parallel, (1) the simplest model of the data without the (label) noise, and (2) a "lookup table" of deviations from that model, representing the (label) noise. The idea is that this is the simplest representation of the data, and big neural nets are biased towards simplicity.
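
As a toy illustration of that decomposition (a minimal sketch, not a claim about how networks actually store anything), one can literally represent a dataset as a simple trend plus a lookup table of memorized residuals:

```python
import numpy as np

rng = np.random.default_rng(0)

# Training data: a simple underlying trend plus label noise.
x_train = np.linspace(0.0, 1.0, 20)
y_train = 2.0 * x_train + rng.normal(scale=0.3, size=x_train.shape)

# (1) The simplest model of the data without the label noise:
# here, just a straight-line fit.
slope, intercept = np.polyfit(x_train, y_train, deg=1)

def simple_model(x):
    return slope * x + intercept

# (2) A "lookup table" of deviations from that model, keyed by
# training input, representing the memorized label noise.
residual_table = {float(x): float(y - simple_model(x))
                  for x, y in zip(x_train, y_train)}

def combined_model(x):
    # On a training input the memorized deviation applies and the
    # noisy label is reproduced exactly; elsewhere the table has no
    # entry and only the simple, noise-free part is used.
    return simple_model(x) + residual_table.get(float(x), 0.0)

assert np.allclose([combined_model(x) for x in x_train], y_train)
```

On training inputs the combination fits the (noisy) data perfectly; on any other input only the simple component applies.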

Maybe something vaguely similar happens if we throw sufficient compute at generative models of collections of humans, e.g. language models:

Hypothesis: The simplest way to represent the data generated by various humans on the internet, in the limit of infinite model size and data, is to have (1) a single idealized superhuman model for reasoning, writing, knowledge retrieval, and so forth, and (2) memorized constraints on this model, for various specific groups of humans and contexts, representing their deviations from rationality: specific biases, cognitive limitations, lack of knowledge of certain areas, etc.
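
To make the shape of this hypothesis concrete, here is an equally toy sketch; every name in it is a hypothetical stand-in for a learned component, not anything real:

```python
from typing import Callable

def ideal_reasoner(prompt: str) -> str:
    """(1) The single idealized superhuman model of reasoning,
    writing, knowledge retrieval, and so forth."""
    return f"<best possible continuation of {prompt!r}>"

# (2) Memorized constraints: per-group/context deviations from the
# ideal (biases, knowledge gaps, cognitive limitations, ...).
ConstraintFn = Callable[[str], str]

def add_typos(text: str) -> str:  # placeholder degradation
    return text.replace("the", "teh")

def forget_math(text: str) -> str:  # placeholder degradation
    return text.replace("integral", "[something mathy]")

constraints: dict[str, list[ConstraintFn]] = {
    "casual forum poster": [add_typos, forget_math],
    # ... one memorized entry per recognized group/context ...
}

def generate(prompt: str, recognized_context: str | None) -> str:
    text = ideal_reasoner(prompt)
    # In-distribution: the lookup table applies and degrades the
    # output down to the level of the imitated humans.
    for constrain in constraints.get(recognized_context, []):
        text = constrain(text)
    # A sufficiently out-of-distribution prompt matches no stored
    # context, so the unconstrained core shows through.
    return text
```

The structural point is that all the competence lives in one shared component, while the human-specific limitations are stored separately and only applied when a context is recognized.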

With some squinting, handwaving, and additional implicit assumptions, this suggests some of the following (vague, speculative) implications about GPT-N:

  • In the limit of large N, GPT-N will produce text that looks sufficiently like human-written text within contexts (prompts) that humans typically produce, and it will use human-level reasoning, world-modeling, and planning abilities to produce this text. However, if you give it sufficiently out-of-distribution prompts, its lookup table of specific irrationalities will not apply, and it will bring to bear planning, world-modeling, and reasoning abilities that are more competent and more free of bias than those of the most rational human.

  • In the limit of large N, if you take GPT-N and fine-tune it on an RL task that requires good reasoning, it might be possible to get a system that behaves far more intelligently than it appeared to on the imitation task, because the fine-tuning essentially turns off the constraints on its reasoning abilities (a toy version of this is sketched below).
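
Continuing the toy sketch above (again purely hypothetical, and only a restatement of that bullet in code), RL fine-tuning would then amount to the cheapest reward-increasing change: deleting the memorized constraints rather than learning new abilities.

```python
# Hypothetical: suppose reward(output) is any RL objective that
# scores reasoning quality. Since the competent core already exists,
# the cheapest change that increases reward is to stop applying the
# constraint functions, not to relearn reasoning from scratch. In
# the toy model above, fine-tuning then collapses to:
def finetuned_generate(prompt: str) -> str:
    return generate(prompt, recognized_context=None)  # constraints off
```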