Donald Hobson comments on Named Distributions as Artifacts

Donald Hobson 4 May 2020 22:58 UTC
3 points
Here is why you use simple models.
The blue crosses are the data. The red line is the line of best fit. The black line is the best-fit polynomial of degree 50. High-dimensional models tend to fit the data by wiggling wildly.
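A rough sketch of the comparison described above, for anyone who wants to reproduce it. The data here is made up (a noisy straight line with an assumed slope, intercept, and noise level), since the original plot's data is not given; the Chebyshev basis is only a choice for numerical stability, not necessarily what produced the original figure.

```python
import warnings

import numpy as np
from numpy.polynomial import Chebyshev

rng = np.random.default_rng(0)

# Hypothetical data: a noisy straight line (assumed, not the data from the plot).
x = np.linspace(0.0, 1.0, 60)
y_true = 2.0 * x + 1.0
y = y_true + rng.normal(scale=0.3, size=x.size)

# Least-squares fits of degree 1 and degree 50.
# Chebyshev.fit rescales x to [-1, 1], which is better behaved than raw powers;
# the degree-50 fit may still be rank-deficient, so warnings are silenced.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    line = Chebyshev.fit(x, y, 1)
    wiggly = Chebyshev.fit(x, y, 50)


def mse(model, xs, ys):
    """Mean squared error of `model` evaluated at `xs` against `ys`."""
    return float(np.mean((model(xs) - ys) ** 2))


# Error on the points the models were fitted to.
train_mse_line = mse(line, x, y)
train_mse_wiggly = mse(wiggly, x, y)

# Error against the noise-free line on a dense grid, i.e. between the data points.
grid = np.linspace(0.0, 1.0, 2001)
grid_truth = 2.0 * grid + 1.0
true_mse_line = mse(line, grid, grid_truth)
true_mse_wiggly = mse(wiggly, grid, grid_truth)

print("training MSE   line:", train_mse_line, " degree 50:", train_mse_wiggly)
print("MSE vs. truth  line:", true_mse_line, " degree 50:", true_mse_wiggly)
```

The degree-50 fit always wins on the training points, because the line is a special case of it. The second printed row is the one that shows what the extra wiggling costs between the data points.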
- johnswentworth 5 May 2020 2:15 UTC
  3 points
  That problem would be handled by cross-validation; the OP's point is that a simple model has no obvious advantage when both models validate.
  Given that both models validate, the main reason to prefer a simpler model is the sort of thing in Gears vs Behavior: the simpler model is more likely to contain physically-realistic internal structure, to generalize beyond the testing/training sets, to handle distribution shifts, etc.
  - Donald Hobson 6 May 2020 14:33 UTC
    1 point
    It depends on which cross-validation scheme you are using. I would expect complex models to rarely cross-validate.
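For concreteness, here is a minimal sketch of the kind of cross-validation check discussed in this thread: plain k-fold with random folds, comparing a line against a degree-50 polynomial. Everything specific is an assumption for illustration (the synthetic data, `k=5`, the hypothetical helper `kfold_mse`, and the Chebyshev basis). Other schemes, such as holding out a contiguous block to probe extrapolation, could give different verdicts, which is the point about it depending on the scheme.

```python
import warnings

import numpy as np
from numpy.polynomial import Chebyshev


def kfold_mse(x, y, deg, k=5, seed=0):
    """Held-out mean squared error of a degree-`deg` least-squares polynomial,
    averaged over every point, each predicted by a model that never saw it."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(x.size), k)
    domain = [x.min(), x.max()]  # same rescaling of x for every fold
    sq_errors = []
    for held_out in folds:
        train = np.setdiff1d(np.arange(x.size), held_out)
        # The high-degree fit can be rank-deficient; silence that warning.
        with warnings.catch_warnings():
            warnings.simplefilter("ignore")
            model = Chebyshev.fit(x[train], y[train], deg, domain=domain)
        sq_errors.append((model(x[held_out]) - y[held_out]) ** 2)
    return float(np.mean(np.concatenate(sq_errors)))


# Hypothetical data: a noisy straight line with noise variance 0.3**2 = 0.09.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 80)
y = 2.0 * x + 1.0 + rng.normal(scale=0.3, size=x.size)

cv_line = kfold_mse(x, y, deg=1)
cv_wiggly = kfold_mse(x, y, deg=50)

print("5-fold CV MSE  line:", cv_line, " degree 50:", cv_wiggly)
```

On data like this, the line's held-out error should sit near the noise variance, while the degree-50 model pays for having fitted the noise in each training fold.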