Gelman quotes from Wikipedia:

Bayesian inference uses aspects of the scientific method, which involves collecting evidence that is meant to be consistent or inconsistent with a given hypothesis. As evidence accumulates, the degree of belief in a hypothesis ought to change. With enough evidence, it should become very high or very low. . . . Bayesian inference uses a numerical estimate of the degree of belief in a hypothesis before evidence has been observed and calculates a numerical estimate of the degree of belief in the hypothesis after evidence has been observed. . . . Bayesian inference usually relies on degrees of belief, or subjective probabilities, in the induction process and does not necessarily claim to provide an objective method of induction.
He then writes:
This does not describe what I do in my applied work. I do go through models, sometimes starting with something simple and building up from there, other times starting with my first guess at a full model and then trimming it down until I can understand it in the context of data. And in any reasonably large problem I will at some point discard a model and replace it with something new (see Gelman and Shalizi 2011a,b, for more detailed discussion of this process and how it roughly fits in to the philosophies of Popper and Kuhn).
But I do not make these decisions on altering, rejecting, and expanding models based on the posterior probability that a model is true. Rather, knowing ahead of time that my assumptions are false, I abandon a model when a new model allows me to incorporate new data or to fit existing data better.
I don’t disagree with Gelman’s statistical practice, but I disagree with his justification. Statistical models are models of our uncertainty about a particular problem. Model checks are a great way to assess how well a model is actually capturing our uncertainty, and building models up in the fashion Gelman suggests is a great way to find a reasonable model for our uncertainty. Posterior model probabilities, when they can be calculated, are one way of assessing when we have found a better model, but they aren’t the only way (and aren’t necessarily the best way either).
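For concreteness, this is the standard identity behind a posterior model probability for a candidate model among a finite set of alternatives; the notation here is mine, not anything from Gelman’s post:

```latex
% Posterior probability of candidate model M_k given data y,
% among models M_1, ..., M_K (standard Bayesian model comparison):
P(M_k \mid y) = \frac{p(y \mid M_k)\, P(M_k)}{\sum_{j=1}^{K} p(y \mid M_j)\, P(M_j)},
\quad \text{where} \quad
p(y \mid M_k) = \int p(y \mid \theta_k, M_k)\, p(\theta_k \mid M_k)\, d\theta_k .
```

The marginal likelihood integral is often hard to compute and is sensitive to the within-model prior p(θ_k | M_k), which is part of why these probabilities, even when available, aren’t automatically the best criterion for choosing between models.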
If we knew our priors with complete certainty (both the prior distribution and the rule by which we update on that prior distribution, i.e. the likelihood), then Gelman’s methods would be useless. Just do the Bayesian update! But we don’t actually know our priors, so we must take care to model them as accurately as we can, and Gelman’s methods are pretty good at helping us do this.
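To make “just do the Bayesian update” concrete, here is a minimal sketch of a conjugate Beta-Binomial update in Python; the Beta(2, 2) prior, the data, and the use of SciPy are my own illustrative assumptions, not anything from the original discussion:

```python
# Toy example: if the prior (Beta) and likelihood (Binomial) were exactly
# right, inference would reduce to this one-line conjugate update.
from scipy import stats

alpha_prior, beta_prior = 2.0, 2.0      # assumed-known Beta(2, 2) prior on p
successes, trials = 7, 10               # observed data

# Conjugate posterior: Beta(alpha + successes, beta + failures)
alpha_post = alpha_prior + successes
beta_post = beta_prior + (trials - successes)
posterior = stats.beta(alpha_post, beta_post)

print(f"Posterior mean of p: {posterior.mean():.3f}")
print(f"95% credible interval: {posterior.interval(0.95)}")
```

If the prior and likelihood really were our uncertainty, this one update would be the whole analysis; Gelman’s model checks matter precisely because we can’t trust those inputs.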
Quoting Gelman himself on page 77 of the linked paper:
If you could really express your uncertainty as a prior distribution, then you could just as well observe data and directly write your subjective posterior distribution, and there would be no need for statistical analysis at all.
In the full context of the paper, Gelman is noting this as a problem with standard Bayesian analysis. He doesn’t argue, as I’m arguing, that we’re trying to model our priors or the structure of our uncertainty, i.e. that we’re trying to approximate the fully Bayesian answer.
After going back and re-reading this, I realized your comments are more prescient than I gave them credit for in the past. I’m now struggling with the Gelman-Shalizi article (link). Do you know of any LessWrong sources that discuss this? I need to really sit back and think, but it seems to me that Gelman and Shalizi are making some serious mistakes here. And they are two of the best practitioners I know of. That scares me a great deal.
I don’t know of any sources, aside from an allusion or two in my comment history, but I don’t recommend digging for them. One point I think I’ve made in the past: if statistics is a method of modeling, and thus approximating, our uncertainty, then Gelman’s posterior predictive checks have limits, though they’re still useful. If posterior predictive checking tells you some part of your model is wrong but you otherwise have good reason to believe that part is an accurate representation of your true uncertainty, it might still be a good idea to leave that part alone.
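As a rough illustration of the mechanics being discussed, here is a minimal sketch of a posterior predictive check in Python; the Poisson-Gamma model, the data, and the zero-fraction test statistic are my own assumptions for the example, not Gelman’s code:

```python
# Minimal sketch of a posterior predictive check: simulate replicated data
# from the fitted model and compare a test statistic against the observed one.
import numpy as np

rng = np.random.default_rng(0)

y_obs = np.array([0, 0, 1, 3, 2, 0, 0, 4, 1, 0])   # hypothetical count data
n_draws = 4000

# Posterior for a Poisson rate under a Gamma(1, 1) prior (conjugate update).
lam_post = rng.gamma(1.0 + y_obs.sum(), 1.0 / (1.0 + len(y_obs)), size=n_draws)

# Replicated datasets, one per posterior draw.
y_rep = rng.poisson(lam_post[:, None], size=(n_draws, len(y_obs)))

# Test statistic: fraction of zeros (a common check for excess zeros).
T_obs = np.mean(y_obs == 0)
T_rep = np.mean(y_rep == 0, axis=1)

# Posterior predictive p-value: how often replications look at least as extreme.
ppp = np.mean(T_rep >= T_obs)
print(f"Observed zero fraction: {T_obs:.2f}, posterior predictive p-value: {ppp:.3f}")
```

An extreme posterior predictive p-value flags a misfit, but, per the point above, the flag alone doesn’t settle whether the offending component should be changed or whether it still honestly encodes our uncertainty.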