# Model Comparison

20 coin flips yield 16 heads and 4 tails. Is the coin biased? Given data on 20,000 rolls of an imperfect die, can we deduce not just the die’s bias, but the physical asymmetries of the die? Given a set of x-y data, should we use a linear or quadratic regression? These are questions of model comparison.

This sequence tackles model comparison from a Bayesian first-principles approach.
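To make the coin question concrete, here is a minimal sketch of what a Bayesian comparison could look like for the 16-heads, 4-tails data. It rests on assumptions not stated in this overview: the "biased" model puts a uniform prior on the heads probability, and the two models start with equal prior probability. The function name `posterior_biased` is purely illustrative.

```python
from math import comb


def posterior_biased(heads, tails, prior_biased=0.5):
    """Posterior probability that the coin is biased, given a flip sequence.

    Assumptions (illustrative, not from the text):
      - "fair" model: heads probability is exactly 0.5
      - "biased" model: heads probability p has a uniform prior on [0, 1]
      - prior_biased is the prior probability of the biased model
    """
    n = heads + tails
    # P(sequence | fair): every specific sequence of n flips has probability 0.5^n
    like_fair = 0.5 ** n
    # P(sequence | biased): integral of p^heads * (1-p)^tails over p in [0, 1],
    # which equals heads! * tails! / (n+1)! = 1 / ((n+1) * C(n, heads))
    like_biased = 1.0 / ((n + 1) * comb(n, heads))
    # Bayes' rule over the two models
    numerator = like_biased * prior_biased
    return numerator / (numerator + like_fair * (1.0 - prior_biased))


print(posterior_biased(16, 4))
```

Under these assumptions the data favor the biased model by a factor of roughly 10, giving a posterior of about 0.91. A different prior on the bias, or a different prior over the two models, would change that number.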

Outline:

• Very Short Introduction is exactly what it sounds like. It introduces the main idea, and walks through a simple example: calculating the probability that a coin is biased, given some data. Wolf’s Dice is a similar but more in-depth example, which also sets up for later.

• In Wolf’s Dice II, we try to figure out not just the biases of a die, but what physical asymmetries give rise to those biases. This example comes up again later when discussing cross-validation.

• The next three posts talk about two methods to approximate Bayesian model comparison in practice: Laplace approximation and BIC. We also compare their performance. These three posts are mainly for people who want to implement Bayesian model comparison on larger-scale problems (i.e. for machine learning) and need to understand the approximation trade-offs; others will likely want to skip them.

• Finally, we compare Bayesian model comparison to cross-validation. We talk about the different questions asked by each, and when one or the other should be used. We wrap up with some comments on what it means for two models to make different predictions, and why it matters in practice.