I liked the initial discussion, because broad heuristics are good for quickly evaluating things, but I think the second example really falls down. A poorly designed study shouldn’t be able to affect your odds as much as a well-designed study, which is basically what his scoring system implies. He goes from 1⁄10 odds to 1⁄160 odds based on a study design which should provide very little evidence. One could argue that a poorly designed study finding a small effect should lower your odds slightly (because of publication bias, for example), or that it should raise your odds slightly because there was at least a small effect, but I find it hard to believe that it could decrease your odds substantially. Suppose it were something you felt was extremely likely (perhaps because of previous medium-quality studies), and you found an extremely poorly designed study that supported the conclusion. His reasoning would suggest that you decrease your odds from, say, 4⁄1 to 1⁄4 based on the poorly designed study!
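To get a feel for how large these swings are, it may help to turn the odds above into probabilities. This is only a rough sketch: it assumes the odds are "odds in favour" (so 1⁄10 means 1 to 10 that the headline is true), and the helper name `odds_to_probability` is made up for illustration.

```python
from fractions import Fraction


def odds_to_probability(odds):
    """Convert odds in favour (1/10 means 1 to 10) into a probability."""
    return odds / (1 + odds)


# The four odds values quoted in the comment above.
quoted_odds = {
    "second example, before the study": Fraction(1, 10),
    "second example, after the study": Fraction(1, 160),
    "hypothetical, before the study": Fraction(4, 1),
    "hypothetical, after the study": Fraction(1, 4),
}

for label, odds in quoted_odds.items():
    p = odds_to_probability(odds)
    print(f"{label}: odds {odds} -> probability {p} (about {float(p):.3f})")
```

Read this way, the hypothetical case drops from a probability of 4/5 to 1/5 on the strength of one weak study, which is the swing the comment objects to.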
Yeah. This is an example where using the actual formula is helpful rather than just speaking heuristically. It’s actually somewhat difficult to translate from the author’s hand-wavy model to the real Bayes’ Theorem (and it would be totally opaque to someone who hadn’t seen Bayes before).
“Study support for headline” is supposed to be the Bayes factor P(study supports headline | headline is true) / P(study supports headline | headline is false). (Well actually, everything is also conditioned on you hearing about the study.) If you actually think about that, it’s clear that it should be very rare to find a study that is more likely to support its conclusion if that conclusion is not true.
EDIT: the author is not actually Nate Silver.
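For readers who have not seen it written out, here is one way to state the odds form of Bayes' Theorem that the comment above is appealing to. The symbols are my own shorthand, not the article's: H stands for "headline is true" and S for "study supports headline" (with everything implicitly conditioned on your having heard about the study).

```latex
\[
\underbrace{\frac{P(H \mid S)}{P(\neg H \mid S)}}_{\text{final opinion on headline}}
=
\underbrace{\frac{P(H)}{P(\neg H)}}_{\text{initial gut feeling}}
\times
\underbrace{\frac{P(S \mid H)}{P(S \mid \neg H)}}_{\text{study support for headline}}
\]
```

The last factor is the Bayes factor. It is below one only when the study's supporting result is more likely if the headline is false than if it is true.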
If you’re just looking at the study, then it’s quite difficult for the support ratio to be less than one. However, suppose we assume that on average, for every published study, there are 100 unpublished studies, and the one with the lowest p-value gets published. Then if a study has a p-value of .04, that particular study supports the headline. However, the fact that that study was published contradicts the headline: if the headline were true, we would expect the lowest p-value to be lower than .04.
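The selection model in the comment above can be sketched numerically. Two things here are assumptions added purely for illustration and are not from the thread: p-values are uniform on [0, 1] when the headline is false, and follow a Beta(a, 1) distribution with a < 1 (CDF p^a, so small p-values are more common) when it is true. The function names and the choice a = 0.2 are likewise hypothetical.

```python
def min_p_density(x, n, a):
    """Density at x of the smallest of n independent p-values,
    each with CDF F(p) = p**a (a = 1 gives the uniform case)."""
    cdf = x ** a
    pdf = a * x ** (a - 1)
    # Standard order-statistic density for the minimum of n draws.
    return n * pdf * (1 - cdf) ** (n - 1)


def support_ratio(x, n, a_if_true):
    """Likelihood ratio for seeing a published p-value of x:
    P(x | headline true) / P(x | headline false)."""
    return min_p_density(x, n, a_if_true) / min_p_density(x, n, 1.0)


# One study, no selection: p = .04 supports the headline (ratio above 1).
single_study = support_ratio(0.04, n=1, a_if_true=0.2)

# Best of 101 studies (1 published + 100 unpublished): the same p = .04
# now counts against the headline (ratio below 1).
best_of_101 = support_ratio(0.04, n=101, a_if_true=0.2)

print(f"single study:  {single_study:.3g}")
print(f"best of 101:   {best_of_101:.3g}")
```

How far below one the second ratio falls depends entirely on the assumed alternative and on how extreme the selection is, so the specific numbers should not be taken as realistic.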
Yes, that’s what I meant by “very rare”: there are situations where it happens, like the model that you gave, but I don’t think the ones that happen in real life are likely to contribute a very large effect. You need really insane publication bias to get a large effect there.
It is not the odds the headline is true, nor the odds the study is correct, but only the odds the study supports the headline. For that, I don’t find his rule of thumb inappropriate.
No. The odds that the study supports the headline in the second example are 1⁄16. The formula he gives is
(final opinion on headline) = (initial gut feeling) * (study support for headline)
where the latter two are odds ratios. From context, “final opinion on headline” is pretty clearly supposed to be “opinion on whether the headline is true.”
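As a quick check of the arithmetic, the formula quoted above can be applied to the numbers from this thread. This is a minimal sketch: `update_odds` is a name made up here, and the 9⁄10 factor at the end is a hypothetical "weak evidence" value for comparison, not something the article proposes.

```python
from fractions import Fraction


def update_odds(initial_gut_feeling, study_support):
    """(final opinion on headline) = (initial gut feeling) * (study support)."""
    return initial_gut_feeling * study_support


# Study support from the article's second example.
article_factor = Fraction(1, 16)

print(update_odds(Fraction(1, 10), article_factor))  # 1/160
print(update_odds(Fraction(4, 1), article_factor))   # 1/4

# Hypothetical factor close to 1, i.e. a study that is only weak evidence.
weak_factor = Fraction(9, 10)
print(update_odds(Fraction(4, 1), weak_factor))      # 18/5
```

The first two lines reproduce the 1⁄10 to 1⁄160 and 4⁄1 to 1⁄4 moves discussed earlier. A factor near one, which is what the first commenter argues a poorly designed study deserves, barely moves the odds.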