William,
By considering models in the first place, one is already using Occam’s razor. With no preference for simplicity in the priors at all, one would start with uniform priors over all possible data sequences, not finite-parameter models of data sequences. If you formalize models as programs for a Turing machine with a separate tape for inputting the program, and your prior is a uniform distribution over possible inputs on that tape, you exactly recover the 2^-k Occam’s razor law, where k is the number of program bits the machine reads during its execution (i.e. the Kolmogorov complexity).
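To make the 2^-k law concrete: since only the k bits the machine actually reads matter, the probability that a uniformly random input tape begins with a particular k-bit program is exactly 2^-k. Here is a toy Monte Carlo sketch of that fact (the program bits are a made-up example, not any particular machine's code):

```python
import random

def prefix_probability(program_bits, trials=200_000, seed=0):
    """Estimate the probability that a uniformly random input tape
    begins with the given k-bit program prefix. Only the first k
    tape cells matter, so we only sample those."""
    rng = random.Random(seed)
    k = len(program_bits)
    hits = 0
    for _ in range(trials):
        tape = [rng.randint(0, 1) for _ in range(k)]
        if tape == program_bits:
            hits += 1
    return hits / trials

# A hypothetical 3-bit program: under a uniform prior over tapes,
# the machine reads exactly these 3 bits with probability 2^-3 = 0.125.
print(prefix_probability([1, 0, 1]))  # close to 0.125
```

So shorter programs automatically get exponentially more prior weight, which is the razor.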
Interestingly, the degree to which any other prior can outperform this distribution is provably bounded. Suppose you are considering some other prior distribution and want to see how much it outperforms the 2^-k distribution. It can do so if its prior for the true model is higher than 2^-k, so you compute the ratio of this prior to 2^-k for every model. If I remember correctly, there is a theorem which says that this set of ratios has a finite supremum for any computable prior distribution (which is not obvious, since the set of models is infinite). Of course, in some respects this is an unfair comparison, since the 2^-k distribution is itself uncomputable. Hopefully I am not misstating the theorem; I believe it is stated and proved in the book by Li and Vitányi.
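For what it's worth, my recollection of the statement (paraphrasing from memory, writing M for the universal 2^-k prior and μ for an arbitrary computable prior) is the multiplicative dominance property:

```latex
% Multiplicative dominance of the universal prior M over
% any computable prior \mu (a paraphrase, not a quotation):
\exists\, c_\mu > 0 \;\text{such that}\; M(x) \,\ge\, c_\mu\,\mu(x)
\quad \text{for all finite strings } x,
\qquad\text{hence}\quad
\sup_x \frac{\mu(x)}{M(x)} \;\le\; \frac{1}{c_\mu},
```

where the constant c_μ is, roughly, 2^-K(μ): the universal machine can simulate the prior's own code, so the most any computable prior can beat M by is a constant factor tied to that prior's own description length.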
To see why ignoring Occam’s razor is unthinkable (and to be amused), consider the following joke.
An astronaut visits a planet with intelligent aliens. These aliens believe in the reverse induction principle. That is, whatever has happened in the past is unlikely to be what will happen in the future. That the sun has risen every previous day is to them not evidence that it will rise tomorrow, but the contrary. Unsurprisingly, this causes all sorts of problems for the aliens, and they are starving and miserable. The astronaut asks them, “Why are you clinging to this belief, given that it obviously causes you so much suffering?” The aliens respond, “Well...it’s never worked well for us in the past!”