Vaniver comments on Open thread, Jan. 19 - Jan. 25, 2015

Vaniver 19 Jan 2015 22:13 UTC
3 points
I think you’ll get somewhere by searching for the phrase “complexity penalty.” The idea is that we assign each explanation a prior probability that depends on how many terms / free parameters it has, with more complex explanations getting lower priors. For your particular example, I think you need to argue that their prior probability should be different than it is.
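To make the complexity penalty concrete, here is one way it could be written down. The specific form of the prior, halving with each extra parameter, is an assumption chosen only for illustration; k(M) is the number of free parameters in explanation M and D is the data.

```latex
P(M \mid D) \propto P(D \mid M)\, P(M), \qquad P(M) \propto 2^{-k(M)}

\log P(M \mid D) = \log P(D \mid M) - k(M) \log 2 + \text{const}
```

Under that assumed prior, each additional parameter has to buy at least log 2 of extra log-likelihood before the more complex explanation comes out ahead.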

I think it’s easier to give a ‘frequentist’ explanation of why this makes sense, though, by looking at overfitting. The uncertainty in the parameter estimates depends roughly on the number of sample points per parameter. Thus the fewer parameters in a model, the better we expect each of those parameters to generalize. One way to think about this is that the more free parameters you have in a model, the more explanatory power you get “for free,” and so we need to penalize the model to account for that. For concrete versions of this penalty, consider the Akaike information criterion and the Bayesian information criterion.
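A rough sketch of how those two criteria behave, assuming a toy setup that is not from the thread: noisy data generated from a straight line, polynomial fits of increasing degree, and Gaussian noise so that the log-likelihood has a closed form. The helper name `gaussian_aic_bic` and all the numbers are made up for illustration.

```python
import numpy as np

def gaussian_aic_bic(y, y_hat, k):
    """AIC and BIC for a least-squares fit, assuming Gaussian noise.

    k counts every fitted quantity: the coefficients plus the noise variance.
    """
    n = len(y)
    rss = float(np.sum((y - y_hat) ** 2))
    # Maximised Gaussian log-likelihood, with the variance estimated as rss / n.
    log_lik = -0.5 * n * (np.log(2 * np.pi * rss / n) + 1)
    aic = 2 * k - 2 * log_lik
    bic = k * np.log(n) - 2 * log_lik
    return aic, bic

# Toy data (an assumption for this sketch): the true relationship is linear.
rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 30)
y = 1.0 + 2.0 * x + rng.normal(scale=0.3, size=x.size)

results = {}
for degree in range(8):
    coeffs = np.polyfit(x, y, degree)
    y_hat = np.polyval(coeffs, x)
    k = degree + 2  # degree + 1 coefficients, plus the noise variance
    rss = float(np.sum((y - y_hat) ** 2))
    aic, bic = gaussian_aic_bic(y, y_hat, k)
    results[degree] = (rss, aic, bic)
    print(f"degree={degree}  RSS={rss:.3f}  AIC={aic:.2f}  BIC={bic:.2f}")
```

The residual sum of squares can only go down as the degree goes up, which is the “for free” explanatory power. AIC and BIC add a charge per parameter (2 for AIC, ln n for BIC), so a higher-degree fit only wins if it improves the likelihood by more than that charge.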