The Statistician’s Fallacy

[Epistemic status: contains generalization based on like three data points.]

In grad school, I took a philosophy of science class that was based around looking for examples of bad reasoning in the scientific literature. The kinds of objections to published scientific studies we talked about were not stupid ones. The professor had a background in statistics, and as far as I could tell knew her stuff in that area (though she dismissed Bayesianism in favor of frequentism). And no, unlike some of the professors in the department, she wasn’t an anti-evolutionist or anything like that.

Instead she was convinced that cellphones cause cancer, in spite of the fact that there’s scant evidence for that claim and no plausible physical mechanism for how it could happen. This came along with a number of other borderline-fringe beliefs that I won’t get into here, but the cellphone claim was the big screaming red flag.*

Over the course of the semester, I got a pretty good idea of what was going on. She had an agenda—it happened to be an environmentalist, populist, pro-“natural”-things agenda, but that’s incidental. The problem was that when she saw a scientific study that seemed at odds with her agenda, she went looking for flaws. And often she could find them! Real flaws, not ones she was imagining! But people who’ve read the rationalization sequence will see a problem here...

In my last post, I quoted Robin Hanson on the tendency of some physicists to be unduly dismissive of other fields. But based on the above case and a couple of others like it, I’ve come to suspect statistics may be even worse than physics in that way: that fluency in statistics sometimes causes a supercharged sophistication effect.

For example, some anthropogenic global warming skeptics make a big deal of alleged statistical errors in global warming research, but as I wrote in my post Trusting Expert Consensus:

Michael Mann et al’s so-called “hockey stick” graph has come under a lot of fire from skeptics, but (a) many other reconstructions have reached the same conclusion and (b) a panel formed by the National Research Council concluded that, while there were some problems with Mann et al’s statistical analysis, these problems did not affect the conclusion. Furthermore, even if we didn’t have the pre-1800 reconstructions, I understand that given what we know about CO2’s heat-trapping properties, and given the increase in atmospheric CO2 levels due to burning fossil fuels, it would be surprising if humans hadn’t caused significant warming.

Most recently, I got into a Twitter argument with someone who claimed that “IQ is demonstrably statistically meaningless” and that this was widely accepted among statisticians. Not only did this set off my “academic clique!” alarm bells, but I’d just come off doing a spurt of reading about intelligence, including the excellent Intelligence: A Very Short Introduction. The claim that IQ is meaningless was wildly contrary to what I understood was the consensus among people who study intelligence for a living.

In response to my surprise, I got an article that contained lengthy and impressive-looking statistical arguments… but completely ignored a couple key points from the intelligence literature I’d read: first, that there’s a strong correlation between IQ and real-world performance, and second that correlations between the components of intelligence we know how to test for turn out to be really strong. If IQ is actually made up of several independent factors, we haven’t been able to find them. Maybe some people in intelligence research really did make the mistakes alleged, but there was more to intelligence research than the statistician who wrote the article let on.

It would be fair to shout a warning about correspondence bias before inferring anything from these cases. But consider two facts:

  1. Essentially all scientific fields rely heavily on statistics.

  2. There’s a lot more to mastering a scientific discipline than learning statistics, which limits how well most scientists will ever master statistics.

The first fact may make it tempting to think that if you know a lot of statistics, you’re in a privileged position to judge the validity of any scientific claim you come across. But the second fact means that if you’ve specialized in statistics, you’ll probably be better at it than most scientists, even good scientists. So if you go scrutinizing their papers, there’s a good chance you’ll find clear mistakes in their stats, and an even better chance you’ll find arguable ones.

Bayesians will realize that, since there’s a good chance of that happening even when the conclusion is correct and well-supported by the evidence, finding mistakes in the statistics is only weak evidence that the conclusion is wrong. Call it the statistician’s fallacy: thinking that finding a mistake in the statistics is sufficient grounds to dismiss a finding.
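To put the Bayesian point in numbers, here is a toy likelihood-ratio sketch. The 80% and 50% figures are made up purely for illustration; they are not estimates from any real study.

```latex
% Purely illustrative numbers: suppose a motivated statistician can
% find some flaw in a paper's statistics 80% of the time when its
% conclusion is wrong, and 50% of the time even when the conclusion
% is correct and well-supported.
\[
\frac{P(\text{flaw found} \mid \text{conclusion wrong})}
     {P(\text{flaw found} \mid \text{conclusion correct})}
  = \frac{0.8}{0.5} = 1.6
\]
% A likelihood ratio of 1.6 barely shifts the odds: if expert consensus
% justified prior odds of 10:1 in favor of the conclusion, finding a
% statistical flaw only moves them to 10:1.6, still about 6:1 in favor.
```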

Of course, if you’re dealing with a novel finding that experts in the field aren’t sure what to make of yet, and the statistics turn out to be wrong, then that may be enough. You may have better things to do than investigate further. But when a solid majority of the experts agree on a conclusion, and you see flaws in their statistics, I think the default assumption should be that they still know the issue better than you, and that very likely the sum total of the available evidence does support the conclusion. Even if the specific statistical arguments you’ve seen from them are wrong.

*Note: I’ve done some Googling to try to find rebuttals to this link, and most of what I found confirms it. I did find some people talking about multi-photon effects and heating, but I couldn’t find any defense of these suggestions that went beyond people saying, “well, there’s a chance.”