By calling something the “p-value” you are elevating the null hypothesis to a special status
Of course when you call something a p-value you should in your mind add “(for a particular choice of null hypothesis)”; calling something “the” p-value doesn’t really make sense. But I don’t think it’s correct to say that this elevates the null hypothesis to a special status: in many (perhaps even most) cases, there’s a hypothesis which already has special status, and addressing that hypothesis in particular is productive in a way that cataloguing and grouping several alternate hypotheses is not.
(and usually leaving various things about it underspecified)
This is precisely the opposite of the truth. As I tried to explain in the original post, roughly the entire benefit of reporting p-values instead of full bayesian updates is that a full update forces us to underspecify many hypothesis classes, whereas by focusing on one hypothesis (in the cases where there’s a hypothesis that it makes sense to focus on), especially if that hypothesis is deliberately “simple” (something like “this intervention has no effect on this observable”), we can fully specify it. This is exactly the problem that p-values solve.
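To make the “fully specified” point concrete, here is a minimal sketch (my own illustration with made-up numbers, not from the original post): a null like “this intervention has no effect” pins down the entire sampling distribution of the data, so the p-value is computable exactly, with no need to specify any alternative.

```python
import math

def binomial_p_value(successes, n, p_null=0.5):
    """Two-sided exact binomial p-value under a fully specified null.

    Because the null (here: success probability is exactly p_null,
    i.e. "the intervention has no effect") determines the whole
    sampling distribution, this probability requires no assumptions
    about any alternative hypothesis.
    """
    pmf = [math.comb(n, k) * p_null**k * (1 - p_null)**(n - k)
           for k in range(n + 1)]
    observed = pmf[successes]
    # Sum the probability of every outcome at least as "extreme"
    # (i.e. at least as improbable under the null) as the one observed.
    return min(1.0, sum(p for p in pmf if p <= observed + 1e-12))

# E.g. 16 "successes" in 20 trials, against the no-effect null p = 0.5:
p = binomial_p_value(16, 20)  # roughly 0.012
```

Note that nothing here required enumerating, let alone fully specifying, what the data would look like if the null were false.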
Like, it seems like your post is trying to say something like “ah, no, don’t do bayesian statistics, p-values are better sometimes actually”. But no, bayesian statistics in this sense is just better and more straightforward, as far as I can tell, and you get the things you would get from a p-value by default if you did any bayesian statistics.
I don’t think this is what my post says, or that it’s a plausible reading of it. A p-value is a conditional probability, and since a principled bayesian update involves considering all of the relevant conditional probabilities, it of course contains all of the information that a p-value gives; indeed, the only way to interpret a p-value is in this light. But we can hardly perform principled bayesian updates ourselves, and we can’t effectively communicate them; moreover, if there are competing explanations for the data, we often can’t tell which explanation is best, or whether a correct explanation is even among the hypotheses we’re considering. In these cases, the part of our internal update that we can correctly report is often just a p-value.
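As a sketch of that relationship (again my own illustration, with hypothetical numbers): a full bayesian update needs the likelihood of the data under every competing hypothesis, plus priors over them, while the p-value-style conditional under the null is the one ingredient we can compute and report even when the rest is unavailable.

```python
def posterior_h0(prior_h0, likelihood_h0, likelihood_h1):
    """Posterior P(H0 | data) for two hypotheses, via Bayes' rule.

    likelihood_h0 is the same kind of conditional probability a
    p-value summarizes; but the full update also demands
    likelihood_h1 and a prior, which may be unavailable or
    underspecified in practice.
    """
    prior_h1 = 1.0 - prior_h0
    evidence = prior_h0 * likelihood_h0 + prior_h1 * likelihood_h1
    return prior_h0 * likelihood_h0 / evidence

# Hypothetical numbers: data unlikely under H0, moderately likely under H1.
post = posterior_h0(prior_h0=0.5, likelihood_h0=0.01, likelihood_h1=0.3)
```

The point is that `likelihood_h0` stands on its own, while the posterior only exists once `likelihood_h1` and the prior are committed to.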
The concrete examples I have in mind are the discoveries of the CMBR and of the muon: Penzias and Wilson were not aware of the prediction of relic radiation, and nobody had even hypothesized the muon. Some Big Bang theorists at the time had in fact begun thinking about the possibility of microwave radiation left over from the Big Bang, and Penzias and Wilson were eventually made aware of this work and realized what they had discovered. But when they conducted their experiment, what was relevant was not the full bayesian update but one conditional: their observations simply were not consistent with a steady-state universe. They did not need to know about the alternative hypotheses to know that this was important. In the other example, Yukawa had predicted the existence of mesons before the muon was discovered (in particular, he had predicted what we now call the pi meson), and since the mass of the muon matched the predicted mass of the pi meson, many people at the time guessed that Anderson and Neddermeyer had observed the pi meson. But they had not! Of course it wasn’t necessarily a mistake to update toward the best available hypothesis, but it was nonetheless incorrect. The relevant discovery here was not that Yukawa’s prediction fit the newly-observed particle better than any other available explanation; it was simply that the newly-observed particle could not be any previously-observed particle, and so something new had been discovered.