[Question] Question about Test-sets and Bayesian machine learning
What does “test-set performance” represent in Bayesian machine learning? In Bayesian ML:
we have some data D = {(y₁, x₁), …, (y_m, x_m)},
we assume a model M (this includes any assumptions we make about the prior densities),
and we compute the posterior predictive density p(ỹ | x̃, D, M).
I have seen people argue that we need a test set to compare two models M₁ and M₂, since we do not know what “the one true model” is. I don’t fully understand how “evaluating performance” on out-of-sample data helps us compare two models; isn’t that exactly what the posterior model probability p(M|D) is for?
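To make the question concrete, here is a minimal sketch of the p(M|D) route, assuming a toy conjugate setup: data yᵢ ~ N(μ, σ²) with known noise σ², and two "models" that differ only in the prior width τ² on μ. In this conjugate case the marginal likelihood p(D|M) is available in closed form as a multivariate normal density, so p(M|D) (with equal prior model probabilities) follows directly. All names and the choice of prior widths are illustrative, not from the original post.

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
y = rng.normal(1.0, 1.0, size=20)  # synthetic data (illustrative)

def log_marginal(y, tau2, sigma2=1.0):
    """Log marginal likelihood p(D|M) for y_i ~ N(mu, sigma2), mu ~ N(0, tau2).

    Integrating mu out gives y ~ N(0, sigma2*I + tau2*J), where J is the
    all-ones matrix, so the evidence is a single multivariate normal density.
    """
    n = len(y)
    cov = sigma2 * np.eye(n) + tau2 * np.ones((n, n))
    return multivariate_normal(mean=np.zeros(n), cov=cov).logpdf(y)

# Two candidate models = two prior widths for the mean (assumed values)
log_m1 = log_marginal(y, tau2=0.1)   # M1: tight prior around 0
log_m2 = log_marginal(y, tau2=10.0)  # M2: broad prior

# Posterior model probability p(M1|D), taking p(M1) = p(M2) = 1/2
p_m1 = 1.0 / (1.0 + np.exp(log_m2 - log_m1))
print(f"log p(D|M1) = {log_m1:.2f}, log p(D|M2) = {log_m2:.2f}, p(M1|D) = {p_m1:.3f}")
```

The test-set alternative would instead hold out part of the data and score each model's posterior predictive density on the held-out points; the tension the question raises is that p(M|D) already compares the models using all of D, but it is only meaningful relative to the candidate set {M₁, M₂}, neither of which need be the true data-generating process.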