I think that, in this case, the underlying problem was not caused by the
way frequentist statistics are commonly taught and practiced by working
scientists:
In the present case, the null hypothesis is that the old method and the
new method produce data from the same distribution; the authors would
like to see data that do not lead to rejection of the null hypothesis.
I’m no statistician, but I’m pretty sure you’re not supposed to make
your favored hypothesis the null hypothesis. That’s a pretty simple
rule and I think it’s drilled into students and enforced in peer review.
I see that as the underlying problem because it reverses the burden of
proof. If they had done it the right way around, six data points would
not have been enough to support their method, rather than merely not
enough to reject it. Making your favored hypothesis the null hypothesis
can, in the extreme, let you rely on a single data point.
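To make the six-data-point worry concrete, here is a quick simulation (my own sketch, not from the thread; the effect size, sample sizes, and permutation test are all illustrative assumptions). Two "methods" whose outputs genuinely differ by a full standard deviation are compared with a two-sided permutation test, and we estimate how often a sample of six per group even detects the difference at α = 0.05:

```python
# Illustrative sketch: with n = 6 per group, a real difference of one
# standard deviation is usually NOT detected, so "we failed to reject
# the null" is very weak evidence that the methods agree.
import random

def perm_test_p(xs, ys, n_perm=300, rng=None):
    """Approximate two-sided permutation-test p-value for a
    difference of means."""
    rng = rng or random.Random(0)
    observed = abs(sum(xs) / len(xs) - sum(ys) / len(ys))
    pooled = xs + ys
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        a, b = pooled[:len(xs)], pooled[len(xs):]
        if abs(sum(a) / len(a) - sum(b) / len(b)) >= observed:
            hits += 1
    return hits / n_perm

def power(n, n_sims=200, alpha=0.05):
    """Fraction of simulated experiments (true mean difference = 1.0,
    sd = 1.0) in which the test rejects at the given alpha."""
    rng = random.Random(42)
    rejections = 0
    for _ in range(n_sims):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        ys = [rng.gauss(1.0, 1.0) for _ in range(n)]
        if perm_test_p(xs, ys, rng=rng) < alpha:
            rejections += 1
    return rejections / n_sims

print(power(6), power(30))  # power is much lower at n = 6 than n = 30
```

With six points per group the test misses a one-standard-deviation difference most of the time, which is exactly why non-rejection at that sample size should not count in the method's favor.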
Now, even from a frequentist perspective, this is wacky: hypothesis testing can reject a null hypothesis, but it cannot confirm one, as discussed in the first paragraph of the Wikipedia article on the null hypothesis.
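The "reject but never confirm" point can also be shown by simulation (again my own sketch, not from the thread; the z-test with known sd = 1 and the specific sample sizes are illustrative assumptions). With n = 6 per group, the test fails to reject about 95% of the time when the null is true, but it also fails to reject roughly half the time when the means actually differ by a full standard deviation, so non-rejection cannot distinguish the two situations:

```python
# Illustrative sketch: non-rejection is the default outcome of an
# underpowered test whether or not the null is true, so it cannot
# "confirm" the null.
import math
import random

def nonreject_rate(true_diff, n=6, sims=2000, seed=1):
    """Fraction of simulated experiments in which a two-sided z-test
    (known sd = 1, alpha = 0.05) fails to reject the null of equal
    means, when the true mean difference is true_diff."""
    rng = random.Random(seed)
    crit = 1.96 * math.sqrt(2.0 / n)  # critical value for the mean difference
    keep = 0
    for _ in range(sims):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        ys = [rng.gauss(true_diff, 1.0) for _ in range(n)]
        if abs(sum(ys) / n - sum(xs) / n) <= crit:
            keep += 1
    return keep / sims

print(nonreject_rate(0.0))  # null true: non-rejection rate near 0.95
print(nonreject_rate(1.0))  # null false: non-rejection still common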
You wrote:
That’s a pretty simple rule and I think it’s drilled into students and enforced in peer review.
Not all papers are reviewed by people who know the rule. I was taught that rule over ten years ago, and I didn’t remember it when my colleague showed me the analysis. (I did recall it eventually, just after I ran the sanity check. Evidence against my competence!) My colleague whose job it was to review the paper didn’t know/recall the rule either.