From another of Ioannidis’s own papers:
Of 49 highly cited original clinical research studies, 45 claimed that the intervention was effective. Of these, 7 (16%) were contradicted by subsequent studies, 7 others (16%) had found effects that were stronger than those of subsequent studies, 20 (44%) were replicated, and 11 (24%) remained largely unchallenged.
If 44% of those unchallenged studies in turn replicated, then the total replication rate would be roughly 55% (44% + 0.44 × 24% ≈ 54.6%). Of course, Ioannidis himself gives a possible reason why some of these haven’t been replicated: “Sometimes the evidence from the original study may seem so overwhelming that further similar studies are deemed unethical to perform.” So perhaps we should think that more than 44% of the unchallenged studies would replicate.
If we also count the 16% that found weaker but still statistically significant effects as replications rather than failures to replicate, and credit the unchallenged 24% with that same extra 16%, then the assumed success rate rises from 44% to 60%, and we might expect that a total of 74% (60% + 0.60 × 24% ≈ 74.4%) of biomedical papers in high-impact journals with over 1,000 citations have found a real effect. Is that legit? Well, it’s his binary, not mine, and in WMPRFAF he’s talking about the existence, not the strength, of relationships.
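For anyone who wants to check the arithmetic, here is a minimal sketch in Python. It uses the rounded percentages from the quoted abstract. The function name `projected_total` is mine, not Ioannidis's, and the key assumption (that unchallenged studies would hold up at the same rate as the share already observed) is the one made in this comment, not a result from the paper.

```python
# Rounded percentages of the 45 "effective" studies, as quoted above.
contradicted, weaker, replicated, unchallenged = 16, 16, 44, 24

def projected_total(observed_pct, unchallenged_pct):
    """Observed share plus the same share of the unchallenged studies.

    Assumption (from this comment, not the paper): unchallenged studies
    would succeed at the same rate as the observed share.
    """
    return observed_pct + observed_pct / 100 * unchallenged_pct

# Strict reading: only the "replicated" studies count as successes.
strict = projected_total(replicated, unchallenged)

# Lenient reading: weaker-but-still-significant effects also count.
lenient = projected_total(replicated + weaker, unchallenged)

print(f"strict:  {strict:.1f}%")
print(f"lenient: {lenient:.1f}%")
```

This gives about 54.6% on the strict reading and 74.4% on the lenient one. Working from the raw counts (20 and 27 out of 45) instead of the rounded percentages nudges both figures up by less than a point.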
Although this paper looked at highly cited papers, Ioannidis also notes that “The current analysis found that matched studies that were not so highly cited had a greater proportion of ‘negative’ findings and similar or smaller proportions of contradicted results as the highly cited ones.” I.e., less-highly-cited findings have fewer problems with lack of replication. So that 74% is, if anything, most likely a lower bound on replication rates in the biomedical literature more broadly.
Ioannidis has refuted himself.
I don’t think that paper allows any such estimate because it’s based on published results, which are highly biased toward “significant” findings. That’s why, for example, meta-analyses in psychology have effect sizes 3x larger than those of registered replications. For an estimate of the replicability of a field you need something like the Many Labs project (~54% replication, median effect size 1/4 of the original study).
Just glancing at that Many Labs paper, it’s looking specifically at psych studies replicable through a web browser. Who knows to what extent that generalizes to psych studies more broadly, or to biomedical research?
I don’t think that paper allows any such estimate because it’s based on published results, which are highly biased toward “significant” findings.
So it sounds like you’re worried that a bunch of failed replication attempts got put in the file drawer, even after there was a published significant finding for the replication attempt to be pushing back against?
I think the OSC’s reproducibility project is much more of what you’re looking for, if you’re worried that Many Labs is selecting only for a specific type of effect.
They focus on selecting studies quasi-randomly and use a variety of reproducibility measures (confidence interval, p-value, effect size magnitude + direction, subjective assessment). They find that around 30-50% of effects replicate, depending on the criteria used. They looked at 100 studies, in total.
I don’t know enough about the biomedical field, but a brief search on the web yields the following links, which might be useful?
Science Forum: The Brazilian Reproducibility Initiative, which aims to reproduce 60-100 Brazilian studies, with results due in 2021.
Section 2 of this symposium report from 2015, which collects some studies (including the OSC one I listed above).
This page references some studies from around 2010-2011 which find base rates of ~10% for replicating oncology-related stuff.