See, after locating the hypothesis, we can run some simple statistical checks on the hypothesis and the data to see if our prior was wrong. For example, plot the data as a histogram, and plot the hypothesis as another histogram; if there’s a lot of data and the two histograms are wildly different, we can be almost certain that the prior was wrong. As a responsible scientist, I’d do this kind of check. The catch is, a perfect Bayesian wouldn’t. The question is, why?
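Here is a minimal sketch of that kind of check, assuming a toy model where the data mean has a Normal(0, 1) prior and observations are Normal(mu, 1). All names and numbers are illustrative, not from the original discussion: we draw datasets from the prior predictive and ask where the observed mean falls among the simulated means.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy prior: mu ~ Normal(0, 1); data given mu: Normal(mu, 1).
def prior_predictive_draws(n_datasets, n_obs):
    mu = rng.normal(0.0, 1.0, size=n_datasets)  # draw mu from the prior
    return rng.normal(mu[:, None], 1.0, size=(n_datasets, n_obs))

# "Observed" data deliberately generated far from the prior's beliefs,
# so the check should fire loudly.
observed = rng.normal(5.0, 1.0, size=200)

# Compare the observed mean against the distribution of simulated means.
sim_means = prior_predictive_draws(1000, 200).mean(axis=1)
p_value = (sim_means >= observed.mean()).mean()
print(f"prior predictive p-value for the mean: {p_value:.3f}")
```

With lots of data and a badly wrong prior, essentially none of the simulated datasets resemble the observed one, which is the "wildly different histograms" situation in numeric form.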
Model checking is completely compatible with “perfect Bayesianism.” In the practice of Bayesian statistics, how often is the prior distribution you use exactly the same as your actual prior distribution? The answer is never. Really, do you think your actual prior follows a gamma distribution exactly? The prior distribution you use in the computation is a model of your actual prior distribution. It’s a map of your current map. With this in mind, model checking is an extremely handy way to make sure that your model of your prior is reasonable.
However, a difference between the data and a simulation from your model doesn’t necessarily mean that you have an unreasonable model of your prior. You could just have really wrong priors. So you have to think about what’s going on to be sure. This does somewhat limit the role of model checking relative to what Gelman is pushing.
You shouldn’t need real-world data to determine if your model of your own prior was reasonable or not. Something else is going on here. Model checking uses the data to figure out if your prior was reasonable, which is a reasonable but non-Bayesian idea.
Well, if you’re just checking your prior, then I suppose you don’t need real data at all. Make up some numbers and see what happens. What you’re really checking (if you’re being a Bayesian about it, i.e. not like Gelman and company) is not whether your data could come from a model with that prior, but rather whether the properties of the prior you chose seem to match up with the prior you’re modeling. For example, maybe the prior you chose forces two parameters, a and b, to be independent no matter what the data say. In reality, though, you think it’s perfectly reasonable for there to be some association between those two parameters. If you don’t already know that your prior is deficient in this way, posterior predictive checking can pick it up.
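A simplified stand-in for that check, with all details assumed for illustration: compare the correlation between a and b under independent priors against the correlation suggested by (made-up) data. A prior that factorizes over a and b can never produce the association the data show, and the mismatch in this one statistic is what a fuller predictive check would surface.

```python
import numpy as np

rng = np.random.default_rng(1)

# Independent priors for a and b: correlated values are impossible a priori.
a = rng.normal(0.0, 1.0, size=5000)
b = rng.normal(0.0, 1.0, size=5000)
prior_corr = np.corrcoef(a, b)[0, 1]  # ~0 by construction

# Fake data in which the two quantities clearly move together.
pairs = rng.multivariate_normal([0, 0], [[1, 0.8], [0.8, 1]], size=200)
data_corr = np.corrcoef(pairs[:, 0], pairs[:, 1])[0, 1]

print(f"prior correlation: {prior_corr:.3f}, data correlation: {data_corr:.3f}")
```

If you didn’t already know the independence assumption was a deficiency, seeing the prior pinned near zero correlation while the data sit near 0.8 tells you your model of your prior can’t express something you actually believe is plausible.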
In reality, you’re usually checking both your prior and the other parts of your model at the same time, so you might as well use your data, but I could see using different fake data sets in order to check your prior in different ways.