I think I basically agree with you on that: whenever feasible, the full posterior (as opposed to the maximum-likelihood model) is what you should be using. So instead of using “Bayesian model selection” to decide between cubics and quadratics, and then fitting the best cubic or the best quadratic depending on the answer, the “right” thing to do is to look at the posterior distribution over possible functions f, and use that to get a posterior distribution over f(x) for any given x.
The problem is that this is not always practical for the application you have in mind, and I’m not sure we have good general methods for finding a good approximation. But certainly an average over the models is what we should be trying to approximate.
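To make the quadratic-versus-cubic case concrete, here is a rough sketch of what averaging over the two models could look like in the one setting where everything is available in closed form: Gaussian noise with known variance and a Gaussian prior on the polynomial coefficients. The function names, `prior_var`, `noise_var`, and the equal prior weight on each degree are all assumptions I'm making for illustration, not anything canonical. Under those assumptions the posterior over f(x) is a two-component Gaussian mixture, and the code reports its mean and variance.

```python
import numpy as np

def fit_poly_model(x, y, degree, prior_var=10.0, noise_var=0.25):
    """Conjugate Bayesian linear regression on polynomial features.

    Assumes coefficients ~ N(0, prior_var * I) and Gaussian noise with
    known variance noise_var (both values are illustrative choices).
    """
    Phi = np.vander(x, degree + 1, increasing=True)  # columns 1, x, x^2, ...
    n, d = Phi.shape
    # Posterior over coefficients is N(mean, cov).
    precision = np.eye(d) / prior_var + Phi.T @ Phi / noise_var
    cov = np.linalg.inv(precision)
    mean = cov @ Phi.T @ y / noise_var
    # Marginal likelihood: y ~ N(0, noise_var * I + prior_var * Phi Phi^T).
    K = noise_var * np.eye(n) + prior_var * Phi @ Phi.T
    _, logdet = np.linalg.slogdet(K)
    log_evidence = -0.5 * (y @ np.linalg.solve(K, y) + logdet + n * np.log(2 * np.pi))
    return mean, cov, log_evidence

def model_averaged_prediction(x, y, x_new, degrees=(2, 3)):
    """Posterior model weights plus mean and variance of f(x_new),
    averaged over the candidate degrees (equal prior weight on each)."""
    fits = [fit_poly_model(x, y, d) for d in degrees]
    log_ev = np.array([fit[2] for fit in fits])
    weights = np.exp(log_ev - log_ev.max())  # subtract max for numerical stability
    weights /= weights.sum()
    means, variances = [], []
    for (mean, cov, _), d in zip(fits, degrees):
        phi = np.vander(np.atleast_1d(x_new), d + 1, increasing=True)[0]
        means.append(phi @ mean)            # posterior mean of f(x_new) under this model
        variances.append(phi @ cov @ phi)   # posterior variance of f(x_new) under this model
    means, variances = np.array(means), np.array(variances)
    avg_mean = weights @ means
    # Mixture variance: within-model variance plus disagreement between models.
    avg_var = weights @ (variances + means**2) - avg_mean**2
    return weights, avg_mean, avg_var

# Synthetic data from a quadratic, purely for demonstration.
rng = np.random.default_rng(0)
x = np.linspace(-2.0, 2.0, 30)
y = 1.0 - 0.5 * x + 0.8 * x**2 + rng.normal(scale=0.5, size=x.size)
weights, avg_mean, avg_var = model_averaged_prediction(x, y, x_new=1.5)
print("posterior weights (quadratic, cubic):", weights)
print("model-averaged f(1.5): mean", avg_mean, "variance", avg_var)
```

Note that nothing here ever commits to "the" best degree: both models contribute in proportion to their posterior weight, and the variance picks up an extra term when the two models disagree about f(x). Outside this conjugate setting the evidence and the per-model posteriors generally have to be approximated, which is exactly where the difficulty lies.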