How are we to decide which partitions are useful? If someone tells us that women born under Aries, Leo or Sagittarius do better with treatment A, as do those born under the Earth, Air and Water signs, would we really be willing to switch treatments?
Assume I have not heard of Simpson’s Paradox, have no more time to research and must make a decision now.
Am I justified in not switching treatments using the reasoning that I don’t want Astrology to have any substance to it, and it must not be allowed to have any, so I’m going to wishful think this data away and therefore ignore the evidence I have (as I understand it)?
Or am I more rational to say “I will accept the evidence I have as far as I understand it and switch treatments, even though I expect there is something else going on which is nothing to do with Astrology, but I have no time to find out what that is”?
The second has a better thought process but leads to a worse conclusion, through lack of understanding or lack of information. The first rests on a worse thought process, which could lead to much worse outcomes in future if it is kept up.
Well, an important question to ask is how the data were generated. If the only thing we know about each patient is whether they were male or female and whether they were born under a Fire sign, and being born under a Fire sign seems to have some explanatory power, then by all means go for it. As Dave suggests below, it is perfectly possible that the astrological data are hiding some genuine phenomenon.
However, if someone collected every possible piece of astrological data, and tried splitting the patients along every one of the 2^11 possible ways of dividing the twelve star signs into two groups, you would not be surprised to find that at least one of them displayed this sort of behaviour.
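To put rough numbers on that, here is a small sketch. It counts the two-way splits of the twelve signs, then estimates how likely it is that at least one split looks "significant" by chance alone. The 5% false-positive rate and the independence of the tests are my assumptions, not anything from the study; real splits overlap heavily, so treat the probability as a crude illustration only.

```python
from itertools import combinations

SIGNS = [
    "Aries", "Taurus", "Gemini", "Cancer", "Leo", "Virgo",
    "Libra", "Scorpio", "Sagittarius", "Capricorn", "Aquarius", "Pisces",
]

def two_way_splits(items):
    """Yield each unordered split of items into two groups exactly once.

    The first item is pinned to group_a so that (X, Y) and (Y, X)
    are not both produced.
    """
    first, rest = items[0], items[1:]
    for k in range(len(rest) + 1):
        for chosen in combinations(rest, k):
            group_a = (first, *chosen)
            group_b = tuple(s for s in rest if s not in chosen)
            yield group_a, group_b

splits = list(two_way_splits(SIGNS))
n_splits = len(splits)                             # 2**11 = 2048
n_nontrivial = sum(1 for _, b in splits if b)      # 2047, drops the "all in one group" split

def chance_of_false_hit(n_tests, alpha=0.05):
    """P(at least one false positive) ASSUMING independent tests at level alpha."""
    return 1 - (1 - alpha) ** n_tests

print(n_splits, n_nontrivial)
print(chance_of_false_hit(n_nontrivial))
```

Even with only 20 independent looks at the data, the same formula gives about a 64% chance of at least one spurious hit, which is why it matters so much how many partitions were tried before this one was reported.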
I think the key message is that you shouldn’t be making causal inferences from correlational conclusions unless you have some good reason to do so.
The urge to infer causation from correlation must be powerful. We can easily spot errors of unwarranted causal inferences, apparently from overtraining the recognition of certain patterns, but as soon as the same caveat is expressed in a novel way, we have to work to apply the principle to novelties of form. Simpson’s Paradox is not just the bearer of the message that you shouldn’t make automatic causal inferences from mere correlation; it is an explanation of why that inference is invalid. A blind correlation 1) doesn’t screen out confounds, and 2) might screen out the causal factor.
It seems that we’ve learned part 1 well, but the complete explanation for the possibility that correlations hide causes includes part 2. It seems part 2 is harder. While we’ve all learned to spot instances of part 1, we still founder on part 2. We’re inclined to think partitioning the data can’t make the situation epistemically worse, but it can by screening out the wrong variable, that is, the causal variable.
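For anyone who wants to see the reversal in numbers, here is a minimal sketch. The counts are hypothetical, chosen only so that the reversal appears, and the group labels "fire" and "other" are just placeholders for whatever partition is being discussed.

```python
# Hypothetical (successes, patients) counts per treatment and group.
# These numbers are invented for illustration; they are not from any study
# mentioned in this thread.
counts = {
    "A": {"fire": (81, 87), "other": (192, 263)},
    "B": {"fire": (234, 270), "other": (55, 80)},
}

def rate(successes, total):
    """Success rate within one cell."""
    return successes / total

def pooled(treatment):
    """Success rate for a treatment with the groups merged."""
    successes = sum(s for s, _ in counts[treatment].values())
    total = sum(n for _, n in counts[treatment].values())
    return successes / total

# Treatment A has the higher success rate inside every group...
a_wins_in_each_group = all(
    rate(*counts["A"][g]) > rate(*counts["B"][g]) for g in ("fire", "other")
)
# ...yet treatment B has the higher success rate once the groups are pooled.
b_wins_pooled = pooled("B") > pooled("A")

for g in ("fire", "other"):
    print(g, round(rate(*counts["A"][g]), 3), round(rate(*counts["B"][g]), 3))
print("pooled", round(pooled("A"), 3), round(pooled("B"), 3))
print(a_wins_in_each_group, b_wins_pooled)
```

The arithmetic alone cannot say whether the partitioned or the pooled comparison is the one to act on. That depends on whether the partitioning variable is a confound or sits on the causal path, which is exactly the part 1 versus part 2 distinction.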
So in the real-life example, we don’t find it so counter-intuitive that data about the success rates of men and women fail to prove discrimination when you don’t control for the confounds. But we do stumble when it goes the other way. If we had the data that women do better than men for the competitive positions as well as the easy positions, we would still find it hard to see that this doesn’t prove that women do better than men overall.
I think the key message is that you shouldn’t be making causal inferences from correlational conclusions
I’ve sorted out what I was thinking a bit more. I was not saying “am I justified in believing that the alignment of stars and planets is the cause here”, what I was saying is:
If someone tells us that women born under Aries, Leo or Sagittarius do better with treatment A, as do those born under the Earth, Air and Water signs, would we really be willing to switch treatments?
Yes, we should be willing to act in a way that appears to support astrology. This paragraph is supporting wisdom as the opposite of stupidity, or decision making by fear of public embarrassment.
It might even lead to worse outcomes in the current case, if it turns out that the reason Water signs do better with treatment A in this data set is that the assignment of subjects to treatments in the study was in some way related to their date of birth.
If I have good reasons to believe that factor X doesn’t cause events of class Y, and I have data that seems to demonstrate that factor X is causing an event of class Y in one particular case, and I don’t have the time to explore that data further, I ought to take seriously the theory that the causation is not what it seems to be.