“Call a model well-specified if there is some parameter θ∗ for which pθ∗(y,x) matches the true distribution over y, and call a model mis-specified if no such θ∗ exists.”
Is it even possible to come up with a well-specified model of human behavior or preferences? I wouldn’t be surprised if creating a model of a person with perfect predictive accuracy required modeling every neuron in their brain to figure out exactly what they would do in every situation. But I don’t think realistic AIs would even have the storage capacity to represent a single model like that.
“In the regression setting where we cared about identifying θ, it was obvious that there was no meaningful “true” value of θ when the model was mis-specified.”
I’m worried about this. If we can’t tractably make a well-specified model of behavior, then how do we learn preferences if there is no meaningful “true” value of any latent variables representing preferences in a model?
But perhaps there actually is a way to assign a reasonable meaning to the “true” value of a parameter in a mis-specified model. Perhaps you could go with whichever parameter performs best and call that the “true” one. I think this is how people realistically come up with answers to what’s “true”. For example, someone who only knows Newtonian physics couldn’t come up with a well-specified model of actual physical outcomes on Earth using the Newtonian framework. However, I think it would still be meaningful for them to say something like “the gravity at Earth’s surface is 9.8 meters/second²”, because that’s the value that works best within their model.
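As a toy illustration of this “whichever performs best” idea, here’s a minimal sketch in Python. Everything specific in it (the drag constant, noise level, and grid search) is made up for illustration: the data come from a world with air resistance, the model family is simple drag-free free fall, and we pick the g that predicts best. No single g is exactly “true”, but the best-fitting one is still a meaningful summary.

```python
# Sketch: a "pseudo-true" parameter for a mis-specified model.
# The data-generating process and constants below are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

# "True" world: falling object with (made-up) linear air drag, so the
# distance fallen is not exactly 0.5 * g * t^2 for any single g.
def true_distance(t, g=9.81, k=0.3):
    # Distance fallen under linear drag: d = (g/k) * (t - (1 - exp(-k t)) / k)
    return (g / k) * (t - (1.0 - np.exp(-k * t)) / k)

t = np.linspace(0.1, 5.0, 200)
d_obs = true_distance(t) + rng.normal(scale=0.05, size=t.shape)

# Mis-specified model family: d_hat(t; g) = 0.5 * g * t^2.
# Call the g that minimizes squared prediction error the "pseudo-true" value.
candidate_g = np.linspace(5.0, 12.0, 1000)
errors = [np.mean((0.5 * g * t**2 - d_obs) ** 2) for g in candidate_g]
g_best = candidate_g[np.argmin(errors)]

print(f"Best-fitting g under the mis-specified model: {g_best:.2f} m/s^2")
```

The printed g won’t be exactly 9.81, precisely because the model is mis-specified, but it is still a well-defined, useful quantity, which is the sense in which the Newtonian’s “9.8 meters/second²” remains meaningful.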