I don’t think these are very good examples. Those lines hardly look correlated, let alone causally related. I once read an article with a much better example, but I can’t find it now. It first explained that if you look through enough data series you can find a correlation with almost anything, and then showed a very closely correlated graph of the stock market versus something about Venus, like its surface temperature or its distance from the sun.
My microeconometrics professor used to show off his ice cream consumption versus drownings dataset, which could pass all the significance tests he would be teaching that semester. That one always stuck with me.
Is that the best example to use, though? Ideally, to promote skepticism you want correlations that are the result of sifting through mountains of data for coincidences, or correlations where the only underlying causation is something grossly general like “things often change monotonically for decades as time advances”. With “ice cream consumption versus drownings”, I wouldn’t be surprised if there’s a real, specific common factor: high temperatures motivating people to eat more cold treats and go swimming more often.
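To make the “sifting through mountains of data” point concrete, here is a rough sketch rather than anything rigorous. It assumes the “data” are just independent Gaussian random walks, and the names and parameters (`random_walk`, `best_spurious_match`, 1000 candidates, 30 points) are made up for illustration. It picks one series as the target, then searches unrelated series for the one that correlates best with it.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def random_walk(rng, n):
    """A series of n points where each step adds independent Gaussian noise."""
    value, walk = 0.0, []
    for _ in range(n):
        value += rng.gauss(0, 1)
        walk.append(value)
    return walk


def best_spurious_match(n_candidates=1000, n_points=30, seed=0):
    """Return the strongest correlation found between one target series
    and n_candidates series that are unrelated to it by construction."""
    rng = random.Random(seed)
    target = random_walk(rng, n_points)
    best = 0.0
    for _ in range(n_candidates):
        r = pearson(target, random_walk(rng, n_points))
        if abs(r) > abs(best):
            best = r
    return best


if __name__ == "__main__":
    r = best_spurious_match()
    print(f"strongest correlation among 1000 unrelated series: {r:.2f}")
```

Because every series here drifts over time, this also touches on the “things change monotonically as time advances” case: two drifting series tend to line up far more often than two series of independent noise would. Raising `n_candidates` can only make the best match stronger, which is the whole trick.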
Those lines hardly look correlated, let alone causally related.
They don’t look that bad compared to the sorts of correlations one gets in messy data. The Facebook versus Greek debt one looks like something I wouldn’t be surprised to see for a genuine correlation in messy, real-world data.
You can easily generate correlation examples with Google Correlate, such as how AppleWorks is causing the decline of the Japanese language.