Essentially all scientific fields rely heavily on statistics.
This is true in a technical sense but misses a crucial distinction. Hard sciences (basically physics and its relatives) are far less vulnerable to statistical pitfalls because practitioners in those fields have the ability to generate effectively unlimited quantities of data by simply repeating experiments as many times as necessary. This makes statistical reasoning largely irrelevant: in the limit of infinite data, you don’t need to do Bayesian updates because the weight of the prior is insignificant compared to the weight of the observations. Rutherford, for example, did not bother to state a prior probability for the plum pudding model of the atom compared to the planetary model; he just amassed a bunch of experimental data and showed that the plum pudding model could not explain it. This ability to generate large amounts of data is largely why physics has succeeded in spite of continuing debates and confusion about the fundamentals of statistical philosophy. Researchers in fields like economics, nutrition, and medicine simply cannot obtain data on the same scale that physicists can.
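The claim that the prior washes out can be made concrete with a conjugate Beta-Binomial update. What follows is a minimal sketch in Python, assuming two hypothetical analysts with strongly opposed priors and idealized data whose observed frequency equals an assumed true rate of 0.3; none of these numbers come from the discussion itself.

```python
def posterior_mean(prior_a, prior_b, successes, trials):
    """Posterior mean of a Beta(prior_a, prior_b) prior after binomial data."""
    return (prior_a + successes) / (prior_a + prior_b + trials)


# Hypothetical priors, chosen only to be far apart.
SKEPTIC = (1.0, 19.0)    # prior mean 0.05
BELIEVER = (19.0, 1.0)   # prior mean 0.95
TRUE_RATE = 0.30         # assumed for illustration


def prior_gap(trials):
    """Difference between the two analysts' posterior means after `trials` observations."""
    successes = round(TRUE_RATE * trials)  # idealized: observed frequency equals the true rate
    return (posterior_mean(*BELIEVER, successes, trials)
            - posterior_mean(*SKEPTIC, successes, trials))


for n in (10, 1_000, 100_000):
    print(f"n={n:>6}: posterior means differ by {prior_gap(n):.5f}")
```

In this setup the disagreement works out to 18 / (20 + n), so it shrinks in proportion to the amount of data. The sketch only covers a single fixed model whose prior gives the truth nonzero weight; it does not address choosing between competing models.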
Hard sciences (basically physics and its relatives) are far less vulnerable to statistical pitfalls because practitioners in those fields have the ability to generate effectively unlimited quantities of data by simply repeating experiments as many times as necessary.
There are exceptions such as ultra-high-energy cosmic ray physics, where it’d take decades to collect enough data for naive frequentist statistics to be reliable.
Statistics also remains important at the frontier of high energy physics. Reasoning about which models are likely to replace the Standard Model is plagued by every issue in the philosophy of statistics that you can imagine. And the arguments about this affect where billions of dollars’ worth of research funding end up (build bigger colliders? more dark matter detectors? satellites?).
Sure; if we had enough data to conclusively answer a question it would no longer be at the frontier. :-)
(I disagree with several of the claims in the linked post, but that’s another story.)
I agree that hard sciences are far less vulnerable to statistical pitfalls. However, I’d point at three factors other than data generation to explain it:
The hard sciences have theories that define specific, quantitative models, which makes it far easier to test the theories. Fitting a misspecified model is much less of a risk, and a model may make such a specific prediction that fewer data are needed to falsify it.
Signal-to-noise ratios are often much higher in the hard sciences. Where that’s the case, you generally don’t need such advanced statistics to analyse results, and you’re more likely to notice when you do the statistics incorrectly and get a wrong answer. And even if a model doesn’t truly fit the data, it may still explain the vast majority of the variation in the data; you can get an R² of 0.999 in physics, while if you get an R² of 0.999 in the social sciences it means you did something stupid in Excel or SPSS and accidentally regressed something against itself.
In the hard sciences, one has a good chance of accounting for all of the important causes of an effect of interest. In the social sciences this is usually impossible; often one doesn’t even know the important causes of an effect, making it difficult to rule out confounding (unless one can sever unknown causal links via e.g. randomization).
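The second and third factors can be illustrated with a small simulation. This is only a sketch under made-up assumptions: a linear relationship with Gaussian noise, and a single hypothetical unobserved cause `u`. The coefficients, sample sizes, and function names are illustrative and not taken from any real study.

```python
import random


def ols_slope_and_r2(xs, ys):
    """Return (slope, R^2) for a simple least-squares fit of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx, sxy ** 2 / (sxx * syy)


def fit_with_noise(noise_sd, n=5000, seed=0):
    """Signal-to-noise: y = 2x + noise, so R^2 falls as noise_sd grows."""
    rng = random.Random(seed)
    xs = [rng.gauss(0, 1) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0, noise_sd) for x in xs]
    return ols_slope_and_r2(xs, ys)


def fit_with_hidden_cause(randomized, n=20000, seed=0):
    """Confounding: unobserved u drives y, and also x unless x is randomized."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        u = rng.gauss(0, 1)  # unobserved common cause
        # Randomization severs the link from u to x.
        x = rng.gauss(0, 1) if randomized else u + rng.gauss(0, 1)
        ys.append(1.0 * x + 2.0 * u + rng.gauss(0, 1))  # true effect of x is 1.0
        xs.append(x)
    return ols_slope_and_r2(xs, ys)


print("low noise  R^2:", fit_with_noise(0.05)[1])
print("high noise R^2:", fit_with_noise(5.0)[1])
print("observational slope:", fit_with_hidden_cause(randomized=False)[0])
print("randomized slope:   ", fit_with_hidden_cause(randomized=True)[0])
```

With these assumed coefficients, the low-noise fit has an R² above 0.999 while the high-noise fit explains only a small fraction of the variance. The observational slope is expected to land near 2 even though the true effect is 1, because `u` pushes `x` and `y` in the same direction; randomizing `x` brings the estimate back to about 1 without ever observing `u`.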
I suspect it’s not so much the amount of data as the fact that the underlying causal structure tends to be much simpler.
With, e.g., biology you have the problem of the Harvard law.