# Problems with p-values

I’m taking a really cool course right now on statistics education, which covers:

• replication

• Bayesian statistics

• meta-analyses

A lot of our readings are really fun, and I’ll likely write more posts based on them, maybe into a “why social science studies are bad” sequence.

Anyway, this is the first entry, which goes over misinterpretations of, and issues with, p-values and null hypothesis significance testing.

• The first link in the article is broken...

• Thanks! Should be fixed now.

• I hope you do keep updating the series. Will be interested in seeing it.

A couple of reactions.

If the p-values are really about the data given a true null hypothesis then we really want to use them in the standard hypothesis testing way—only good for refuting, not proving. In other words, the data is consistent with a model and so we might want to pay attention but we still don’t know if the model is all that good. Is that one way to take your thinking?

I am pretty sure you can expand “bad” to more than social sciences. My understanding is that just about all medical studies have been given some pretty bad grades in terms of using p-value analysis.

Might change the “why X is bad” to “how X can be much better”?

• Hmmm, I am still figuring this out as I take the course, but to respond to your thoughts:

• I think your formulation makes sense. p-values can tell us when to pay attention to our results, i.e. whether or not it would be “expected” to see a difference as large as the one we’ve observed, assuming the null hypothesis is true. (As I mention, there are theoretical reasons this breaks down in the limit of very large samples, because the null hypothesis is technically never exactly true, but I think this is the real-world use of p-values, as most sample sizes aren’t that big anyway.)

• Yeah, I know certain areas in biology and medicine have also been hit by these issues, but I don’t have the sources at hand to back up those claims, so I tried to make more defensible ones for now.
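To make the large-sample caveat from the first point concrete, here is a minimal Python sketch. It is only an illustration under assumptions I made up for the example: a one-sample, two-sided z-test with known standard deviation, a hypothetical true effect of 0.01, and an observed sample mean that happens to equal the true mean exactly (so there is no sampling noise). The function names are mine, not from any library.

```python
import math


def two_sided_z_p_value(sample_mean, null_mean, sigma, n):
    """Two-sided p-value for a one-sample z-test with known sigma."""
    z = (sample_mean - null_mean) / (sigma / math.sqrt(n))
    # For a standard normal Z, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical values chosen only for illustration.
TINY_EFFECT = 0.01  # true mean differs from the null value of 0 by a tiny amount
SIGMA = 1.0         # assumed known standard deviation


def p_values_by_sample_size(sample_sizes):
    """p-value at each sample size, assuming the observed mean equals the true mean."""
    return {
        n: two_sided_z_p_value(TINY_EFFECT, 0.0, SIGMA, n)
        for n in sample_sizes
    }


if __name__ == "__main__":
    for n, p in p_values_by_sample_size([100, 10_000, 1_000_000]).items():
        print(f"n = {n:>9,}: p = {p:.3g}")
```

With the effect held fixed, the p-value shrinks as n grows, so a practically negligible difference ends up “significant” once the sample is large enough. At ordinary sample sizes the same tiny effect is nowhere near the usual 0.05 cutoff.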

“why X is bad” to “how X can be much better”?

Is this about a specific sentence in the post, or more about the general framing of the issue?

• That specific statement. I think the general framing of the issue is good and (though I am perhaps not the right person to assess this) seems to get people viewing/interpreting/applying p-values better.

As for the other areas of science, it probably doesn’t really matter, as I don’t see this as a problem about what is studied (the underlying source of data) so much as a problem about how tools are applied to understand data.