If the effect is so small that a sample of several thousand is not sufficient to reliably observe it, then it doesn’t even matter that it is positive.
I strongly disagree.
An old comment of mine gives us a counterexample. A couple of years ago, a meta-analysis of RCTs found that taking aspirin daily reduces the risk of dying from cancer by ~20% in middle-aged and older adults. This is very much a practically significant effect, and it’s probably an underestimate for reasons I’ll omit for brevity — look at the paper if you’re curious.
If you do look at the paper, notice figure 1, which summarizes the results of the 8 individual RCTs the meta-analysis used. Even though all of the RCTs had sample sizes in the thousands, 7 of them failed to show a statistically significant effect, including the 4 largest (sample sizes 5139, 5085, 3711 & 3310). The effect is therefore “so small that a sample of several thousand is not sufficient to reliably observe it”, but we would be absolutely wrong to infer that “it doesn’t even matter that it is positive”!
The heuristic that a hard-to-detect effect is probably too small to care about is a fair rule of thumb, but it’s only a heuristic. EHeller & Unnamed are quite right to point out that statistical significance and practical significance correlate only imperfectly.
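To make the aspirin point concrete, here is a rough power calculation. It is only a sketch under assumptions that are mine rather than the paper's: a hypothetical 3% risk of cancer death in the control arm over the trial, a true 20% relative reduction, trials of about 5,000 people split evenly into two arms, and a normal-approximation two-sided z-test. The function name `power_two_proportions` is made up for illustration, and treating the 8 trials as one big trial is a simplification of what a real meta-analysis does.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p_control, relative_reduction, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two proportions.

    Uses the normal approximation with unpooled variances, so treat the
    result as a ballpark figure rather than an exact one.
    """
    norm = NormalDist()
    p_treat = p_control * (1 - relative_reduction)
    se = sqrt(p_control * (1 - p_control) / n_per_arm
              + p_treat * (1 - p_treat) / n_per_arm)
    z_effect = (p_control - p_treat) / se
    z_crit = norm.inv_cdf(1 - alpha / 2)
    # Probability of landing past either critical value when the effect is real.
    return norm.cdf(z_effect - z_crit) + norm.cdf(-z_effect - z_crit)


# Hypothetical inputs: 3% baseline risk, 20% relative reduction.
single_trial_power = power_two_proportions(0.03, 0.20, n_per_arm=2500)
# Crude stand-in for pooling 8 equally sized trials into one large trial.
pooled_power = power_two_proportions(0.03, 0.20, n_per_arm=8 * 2500)

print(f"single trial: {single_trial_power:.2f}, pooled: {pooled_power:.2f}")
```

Under these assumed numbers a single trial of that size would miss p<0.05 most of the time even though the effect is real and large enough to care about, while the pooled sample would usually detect it.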
Does vitamin D reduce all-cause mortality in the elderly? The point-estimates from pretty much all of the various studies are around a 5% reduction in risk of dying for any reason—pretty nontrivial, one would say, no? Yet the results are almost all not ‘statistically significant’! So do we follow Rolf and say ‘fans of vitamin D ought to update on vitamin D not helping overall’… or do we, applying power considerations about the likelihood of making the hard cutoffs at p<0.05 given the small sample sizes & plausible effect sizes, note that the point-estimates are in favor of the hypothesis? (And how does this interact with two-sided tests—vitamin D could’ve increased mortality, after all. Positive point-estimates are consistent with vitamin D helping, and less consistent with no effect, and even less consistent with it harming; so why are we supposed to update in favor of no help or harm when we see a positive point-estimate?)
If we accept Rolf’s argument, then we’d be in the odd position of, as we read through one non-statistically-significant study after another, decreasing the probability of ‘non-zero reduction in mortality’… right up until we get the Autier or Cochrane data summarizing the exact same studies & plug it into a Bayesian meta-analysis like Salvatier did & abruptly flip to ‘92% chance of non-zero reduction in mortality’.
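The updating argument above can be sketched in a few lines. The numbers are invented for illustration: six hypothetical studies, each reporting a log relative risk near -0.05 with a standard error too large for significance on its own. They are not the Autier or Cochrane data, and this is not Salvatier's model, so it will not reproduce the 92% figure. The sketch uses a simple fixed-effect inverse-variance pool and a flat prior, under which the probability of benefit is just the normal tail area below zero.

```python
from math import sqrt
from statistics import NormalDist

norm = NormalDist()
z_crit = norm.inv_cdf(0.975)  # two-sided 5% cutoff, about 1.96

# Hypothetical (log relative risk, standard error) pairs; negative = fewer deaths.
studies = [(-0.06, 0.04), (-0.04, 0.05), (-0.07, 0.045),
           (-0.03, 0.04), (-0.05, 0.05), (-0.05, 0.035)]

# Each study on its own: none reaches |z| > 1.96.
individual_z = [est / se for est, se in studies]


def pool(subset):
    """Fixed-effect inverse-variance pooled estimate and its standard error."""
    weights = [1 / se ** 2 for _, se in subset]
    est = sum(w * e for w, (e, _) in zip(weights, subset)) / sum(weights)
    return est, 1 / sqrt(sum(weights))


# Probability of a real reduction (true effect < 0) after reading the first
# k studies, assuming a flat prior on the effect.
cumulative_prob_benefit = []
for k in range(1, len(studies) + 1):
    est, se = pool(studies[:k])
    cumulative_prob_benefit.append(norm.cdf(-est / se))

pooled_est, pooled_se = pool(studies)
pooled_z = pooled_est / pooled_se

print([round(z, 2) for z in individual_z])
print([round(p, 3) for p in cumulative_prob_benefit])
print(round(pooled_z, 2))
```

With these made-up inputs, every study is individually non-significant, yet the probability of benefit rises with each one read, and the pooled estimate clears the cutoff. There is no abrupt flip at the end because nothing was ever being updated downward.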
A couple of years ago, a meta-analysis of RCTs found that taking aspirin daily reduces the risk of dying from cancer by ~20% in middle-aged and older adults.
That’s a curious metric to choose. By that standard taking aspirin is about as healthy as playing a round of Russian Roulette.
It’s a fairly natural metric to choose if one wishes to gauge aspirin’s effect on cancer risk, as the study’s authors did.
By that standard taking aspirin is about as healthy as playing a round of Russian Roulette.
Fortunately, the study’s authors and I also interpreted the data by another standard. Daily aspirin reduced all-cause mortality, and didn’t increase non-cancer deaths (except for “a transient increase in risk of vascular death in the aspirin groups during the first year after completion of the trials”). These are not results we would see if aspirin effected its anti-cancer magic by a similar mechanism to Russian Roulette.
It’s a fairly natural metric to choose if one wishes to gauge aspirin’s effect on cancer risk, as the study’s authors did.
Pardon me. Mentioning only curiosity was politeness. The more significant meanings I would supplement with are ‘naive or suspicious’. By itself that metric really is worthless and reading this kind of health claim should set off warning bells. Lost purposes are a big problem when it comes to medicine. Partly because it is hard, mostly because there is more money in the area than nearly anywhere else.
Fortunately, the study’s authors and I also interpreted the data by another standard. Daily aspirin reduced all-cause mortality, and didn’t increase non-cancer deaths (except for “a transient increase in risk of vascular death in the aspirin groups during the first year after completion of the trials”).
And this is the reason low dose aspirin is part of my daily supplement regime (while statins are not).
And this is the reason low dose aspirin is part of my daily supplement regime (while statins are not).
I recently stopped with the low dose aspirin; the bleeding when I accidentally cut myself has proven to be too much of an inconvenience. For the time being, at least.
I’d assume they mean something like the per-year risk of dying from cancer conditional on previous survival—if they indeed mean the total lifetime risk of dying from cancer I agree it’s ridiculous.
Yeah, pretty much. There are other examples of this where something harmful appears to be helpful when you don’t take into account possible selection biases (like being put into the ‘non-cancer death’ category); for example, this is an issue in smoking—you can find various correlations where smokers are healthier than non-smokers, but this is just because the unhealthier smokers got pushed over the edge by smoking and died earlier.
tl;dr: NHST and Bayesian-style subjective probability do not mix easily.
Another example of this problem: http://slatestarcodex.com/2014/01/25/beware-mass-produced-medical-recommendations/
“All cause mortality” is a magical phrase.
Am I missing a subtlety here, or is it just that cancer is usually one of those things that you hope to live long enough to get?
Yeah, pretty much. There are other examples of this where something harmful appears to be helpful when you don’t take into account possible selection biases (like being put into the ‘non-cancer death’ category); for example, this is an issue in smoking—you can find various correlations where smokers are healthier than non-smokers, but this is just because the unhealthier smokers got pushed over the edge by smoking and died earlier.