Everyone knows that correlation is not causation. Many people don’t know that in scientific jargon, “predict” and “explain” are also not causation. They are forms of correlation.

(Technically, “association” might be a better term than “correlation”, which can have a narrower technical meaning in statistics. But since I’m writing this for non-experts, I’m going to use the term “correlation” in the colloquial, wider sense.)

These terms can cause extreme miscommunication:

In lay usage, “X predicts Y” implies that X comes before Y. Predictions are about the future. In statistics, there is no time implication at all. It is just a type of correlation. If I said that I could use 2020 data to “predict” things that happened in 2019 (or 1920), most people would laugh at me. But this is a perfectly legitimate usage of statistical “prediction”.
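To make the backwards-in-time case concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the data is made up (each 2020 value is just the 2019 value plus noise), and `fit_line` and `mean_squared_error` are hypothetical helper names, not functions from any library. The code fits a straight line that "predicts" 2019 values from 2020 values, then compares it against simply guessing the 2019 average.

```python
import random

def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mean_squared_error(actual, guessed):
    """Average squared gap between actual values and guesses."""
    return sum((a - g) ** 2 for a, g in zip(actual, guessed)) / len(actual)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Made-up data: each 2020 value is the 2019 value plus noise.
value_2019 = [rng.gauss(50, 10) for _ in range(500)]
value_2020 = [v + rng.gauss(0, 5) for v in value_2019]

# "Predict" the past: model 2019 as a function of 2020.
slope, intercept = fit_line(value_2020, value_2019)
predicted_2019 = [intercept + slope * v for v in value_2020]

# Baseline: guess the 2019 average for every data point.
average_2019 = sum(value_2019) / len(value_2019)
model_error = mean_squared_error(value_2019, predicted_2019)
naive_error = mean_squared_error(value_2019, [average_2019] * len(value_2019))

print(f"error using 2020 data: {model_error:.1f}")
print(f"error using the 2019 average only: {naive_error:.1f}")
```

The 2020 data gives a smaller error than the naive guess, so in the statistical sense it "predicts" 2019. Nothing in the calculation knows or cares which year came first.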

Similarly, the general sense of “explanation” means a conceptual understanding of a phenomenon. In statistics, “explanation” implies no such understanding, only a certain type of correlation.
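One common place this shows up is the phrase "variance explained" (the R-squared of a regression); I am assuming that is the kind of usage meant here. The sketch below uses made-up data in which temperature drives both ice cream sales and sunburn cases, and neither of those causes the other. The function name `share_of_variance_explained` and all the numbers are hypothetical. For a straight-line fit, the share of variance "explained" is the same in both directions.

```python
import random

def share_of_variance_explained(x, y):
    """R-squared of a straight-line fit of y on x (the squared correlation)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy ** 2 / (sxx * syy)

rng = random.Random(1)  # fixed seed so the run is repeatable

# Made-up data: temperature is a common cause of both quantities.
temperature = [rng.gauss(25, 5) for _ in range(500)]
ice_cream_sales = [3 * t + rng.gauss(0, 10) for t in temperature]
sunburn_cases = [2 * t + rng.gauss(0, 10) for t in temperature]

sales_explain_sunburn = share_of_variance_explained(ice_cream_sales, sunburn_cases)
sunburn_explains_sales = share_of_variance_explained(sunburn_cases, ice_cream_sales)

print(f"ice cream sales 'explain' sunburn: {sales_explain_sunburn:.2f}")
print(f"sunburn 'explains' ice cream sales: {sunburn_explains_sales:.2f}")
```

Both numbers come out identical, even though neither variable gives any conceptual account of the other. The statistic only measures how strongly the two move together.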

Because of the time-implication of “prediction” and the conceptual-understanding implication of “explanation”, most people are likely to interpret these as evidence for or even proof of causation. But in many cases, they merely mean correlation.

If you are reading scientific papers, be careful with how you interpret these terms. If you are reading reports about science, know that the reporter might not be clear on this point. If you are reporting science yourself, be responsible with what you write.

(By the way, without @NeuroStats I wouldn’t know this stuff either. Any value in this post is thanks to her; errors are mine alone.)
