I’ve just come across a more technical explanation than usual of “The Mendel-Fisher Controversy” which frames it as having been about formalizing an intuition of data “too good to be true” using chi-squared.
It is less well known, however, that in 1936, the great British statistician and biologist R. A. Fisher analyzed Mendel’s data and found that the fit to Mendel’s theoretical expectations was too good (Fisher 1936). Using χ² analysis, Fisher found that the probability of obtaining a fit as good as Mendel’s was only 7 in 100,000. (source)
Incidentally a very high P (>0.9) is suspicious, as it means that the results are just too good to be true! This suggests that there is some bias in the experiment, whether deliberate or accidental.
So, ISTM, gwern’s analysis here leads to the “too good to be true” conclusion.
And this PDF or this page says pretty much the same.
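To make the "too good to be true" intuition concrete, here is a toy simulation (not Fisher's actual calculation, and the 1000-plant 3:1 cross is a hypothetical of mine): even when the 3:1 hypothesis is exactly true, chance alone almost never produces counts that match the expectation perfectly, so a run of near-perfect fits is itself improbable.

```python
import random

def chi_sq(observed, expected):
    """Pearson chi-squared statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def prob_fit_as_good(n_plants=1000, n_trials=20000, seed=0):
    """Estimate how often random 3:1 Mendelian data fits the
    expectation as well as a suspiciously perfect 750/250 count."""
    rng = random.Random(seed)
    expected = [n_plants * 0.75, n_plants * 0.25]
    # A "perfect" observed result: exactly the 3:1 expectation, chi-sq = 0.
    perfect_stat = chi_sq([750, 250], expected)
    as_good = 0
    for _ in range(n_trials):
        dominant = sum(rng.random() < 0.75 for _ in range(n_plants))
        stat = chi_sq([dominant, n_plants - dominant], expected)
        if stat <= perfect_stat:
            as_good += 1
    return as_good / n_trials
```

Under these assumptions the exact 3:1 split turns up in only a small fraction of trials, and the probability of hitting it (or near-misses) again and again across many experiments shrinks multiplicatively, which is the spirit of Fisher's aggregate 7-in-100,000 figure.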