But humans seem to have some way of (sometimes) noticing out-of-distribution inputs, and can feel confused instead of just confidently applying their existing training to respond to them.
I think what you’re describing can be approximated by a Bayesian agent having a wide prior, and feeling “confused” when some new piece of evidence makes its posterior more diffuse. Evolutionarily it makes sense to have that feeling, because it tells the agent to do more exploration and less exploitation.
For example, if you flip a coin 1000 times and it always comes up heads, your posterior is very concentrated around “the coin always comes up heads”. But if it then comes up tails once, your posterior becomes more diffuse, you feel confused, and you change your betting behavior until you can learn more.
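For concreteness, here’s a minimal sketch of that in Python, assuming a Beta-Bernoulli model with a uniform prior (my choice, not implied by the setup): the single surprising tail makes the posterior measurably more diffuse.

```python
from scipy.stats import beta

# Uniform Beta(1, 1) prior over the coin's heads-probability
# (an assumption for illustration, not specified above).
heads, tails = 1000, 0
posterior = beta(1 + heads, 1 + tails)
print(f"after 1000 heads: std={posterior.std():.2e}, entropy={posterior.entropy():.2f}")

tails += 1  # one surprising tail arrives
posterior = beta(1 + heads, 1 + tails)
print(f"after one tail:   std={posterior.std():.2e}, entropy={posterior.entropy():.2f}")
# Both the standard deviation and the differential entropy jump:
# the posterior got more diffuse, which is the "confusion" signal.
```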
I think it is driven by a general heuristic of seeking compressibility. If a distribution seems complex, we assume we’re accidentally conflating two variables, and look for the decomposition that makes the two resulting distributions approximable by simpler functions.
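As a hedged illustration of that heuristic (the Gaussian mixture model and BIC score are my choices of tooling, not something specified here): a bimodal sample looks “complex” to a single Gaussian, but decomposing it on a hidden two-valued variable yields two simple Gaussians that compress the data better, even after penalizing the extra parameters.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Two conflated sources: a bimodal sample a single Gaussian fits poorly.
data = np.concatenate([rng.normal(-3, 1, 500),
                       rng.normal(+3, 1, 500)]).reshape(-1, 1)

for k in (1, 2):
    gm = GaussianMixture(n_components=k, random_state=0).fit(data)
    # Lower BIC = better fit after penalizing model complexity; the
    # two-component decomposition should win by a wide margin.
    print(f"{k} component(s): BIC = {gm.bic(data):.1f}")
```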