An observation that I think is missing here is that this world is biased towards general-purpose search too. As in, agents operating in reality frequently need to problem-solve in off-distribution circumstances: circumstances to which they could not have memorized correct responses (or even near-correct responses), because they'd never faced them before. And if failure is fatal, that creates a pressure towards generality. Not simply a "bias" towards it; a direct pressure.
A supercharged version of that pressure is when the agent is selected for the ability to thrive not only in off-distribution tasks in some environment, but in entire off-distribution environments, which I suspect is how human intelligence was incentivized.
> An observation that I think is missing here is that this world is biased towards general-purpose search too. As in, it is frequently the case that agents operating in reality face the need to problem-solve in off-distribution circumstances; circumstances to which they could not have memorized correct responses (or even near-correct responses), because they'd never faced them. And if failure is fatal, that creates a pressure towards generality. Not simply a "bias" towards it; a direct pressure.
And we're already doing something similar with ML models today: large models are increasingly trained for roughly a single epoch, so the model rarely or never sees the same training example twice and can't simply memorize per-example responses.
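A minimal sketch of what "not repeating training examples" means in practice (this is illustrative, not any specific framework's API): a single-epoch data stream visits each example at most once, so memorizing exact inputs can't help on the next batch.

```python
import random

def single_epoch_stream(dataset, seed=0):
    """Yield each training example exactly once, in shuffled order.

    Illustrative sketch: with one pass over the data, the model never
    revisits an example, so it is pushed to generalize rather than
    memorize specific inputs.
    """
    order = list(range(len(dataset)))
    random.Random(seed).shuffle(order)
    for i in order:  # one pass; no example is ever repeated
        yield dataset[i]

dataset = [f"example_{k}" for k in range(5)]
seen = list(single_epoch_stream(dataset))
# Every example appears exactly once in the stream.
assert sorted(seen) == sorted(dataset)
```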
> A supercharged version of that pressure is when the agent is selected for the ability to thrive not only in off-distribution tasks in some environment, but in entire off-distribution environments, which I suspect is how human intelligence was incentivized.
You’re right, that was missing. Very good and important point.