There is no real question about whether most published research findings are false or not. We know that’s the case due to replication attempts. Ioannidis’s paper isn’t really _about_ plugging in specific numbers, or showing a priori that that must be the case, so I think you’re going at it from a slightly wrong angle.
From another of Ioannidis’s own papers:
Of 49 highly cited original clinical research studies, 45 claimed that the intervention was effective. Of these, 7 (16%) were contradicted by subsequent studies, 7 others (16%) had found effects that were stronger than those of subsequent studies, 20 (44%) were replicated, and 11 (24%) remained largely unchallenged.
If 44% of those unchallenged studies in turn replicated, then the total replication rate would be about 54% (44% + 44% × 24% ≈ 54.6%). Of course, Ioannidis himself gives a possible reason why some of these haven’t been replicated: “Sometimes the evidence from the original study may seem so overwhelming that further similar studies are deemed unethical to perform.” So perhaps we should think that more than 44% of the unchallenged studies would replicate.
If we count the 16% that found weaker but still statistically significant effects as replications rather than failures to replicate, and add in 16% of the 24% of unchallenged studies (about 4 more points), then we might expect that a total of 74% of highly cited biomedical papers (over 1,000 citations, in high-impact journals) have found a real effect. Is that legit? Well, it’s his binary, not mine, and in “Why Most Published Research Findings Are False” (WMPRFAF) he’s talking about the existence, not the strength, of relationships.
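For anyone who wants to check the arithmetic, here is a quick sketch in Python. The names (`projected_share` and the constants) are mine, not from the paper, and it uses the rounded percentages from the quote. It bakes in the same assumption as above: unchallenged studies would hold up at the same rate as the share already counted as replicated.

```python
# Shares of the 45 positive-claim studies, using the rounded percentages quoted above.
REPLICATED = 0.44      # 20 of 45
WEAKER_EFFECT = 0.16   # 7 of 45: effect found, but weaker in later studies
UNCHALLENGED = 0.24    # 11 of 45: no replication attempt to compare against


def projected_share(confirmed: float, untested: float) -> float:
    """Share expected to hold up if untested studies replicate at the
    same rate as the share already confirmed (an assumption, not a fact)."""
    return confirmed + confirmed * untested


# Strict reading: only the 44% count as replicated.
strict = projected_share(REPLICATED, UNCHALLENGED)

# Lenient reading: weaker-but-significant effects also count as replications.
lenient = projected_share(REPLICATED + WEAKER_EFFECT, UNCHALLENGED)

print(f"strict:  {strict:.1%}")
print(f"lenient: {lenient:.1%}")
```

The strict reading comes out at roughly 54.6% and the lenient one at roughly 74.4%, which is where the 54% and 74% figures come from.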
Although this paper looked at highly cited papers, Ioannidis also notes that “The current analysis found that matched studies that were not so highly cited had a greater proportion of ‘negative’ findings and similar or smaller proportions of contradicted results as the highly cited ones.” I.e., less highly cited findings were contradicted no more often, and possibly less often. So that 74% is, if anything, most likely a lower bound on replication rates in the biomedical literature more broadly.
Ioannidis has refuted himself.
I don’t think that paper allows any such estimate, because it’s based on published results, which are highly biased toward “significant” findings. That’s why, for example, meta-analyses in psychology have effect sizes 3x larger than those of registered replications. For an estimate of the replicability of a field you need something like the Many Labs project (~54% replication, median effect size 1/4 of the original study).
Just glancing at that Many Labs paper, it’s looking specifically at psych studies replicable through a web browser. Who knows to what extent that generalizes to psych studies more broadly, or to biomedical research?
So it sounds like you’re worried that a bunch of failed replication attempts got put in the file drawer, even after there was a published significant finding for the replication attempt to be pushing back against?
I think the OSC’s reproducibility project is much more of what you’re looking for, if you’re worried that Many Labs is selecting only for a specific type of effect.
They focus on selecting studies quasi-randomly and use a variety of reproducibility measures (confidence interval, p-value, effect size magnitude + direction, subjective assessment). They find that around 30-50% of effects replicate, depending on the criteria used. They looked at 100 studies, in total.
I don’t know enough about the biomedical field, but a brief search on the web yields the following links, which might be useful?
Science Forum: The Brazilian Reproducibility Initiative, which aims to reproduce 60-100 Brazilian studies, with results due in 2021.
Section 2 of this symposium report from 2015 which collects some studies (including the OSC one I list above)
This page references some studies from around 2010-2011 which find base rates of ~10% for replicating oncology-related stuff.