Joseph Felsenstein is a pioneer in the use of maximum likelihood methods in evolutionary biology. In his book, “Inferring Phylogenies”, he has a chapter on Bayesian methods, and a section in that chapter on controversies over Bayesian inference. He discusses a toy example of a space probe to Mars which looks for little green men and doesn’t find them. He wonders whether a scientist whose prior for little green men involved odds of 1⁄4, and who, based on the evidence of the space probe, now assigns odds of 1⁄12, should publish those revised odds. He writes:
It might be argued that the correct thing to do in such a case is to publish the likelihood ratio 1⁄3 and let the reader provide their own prior. This is the likelihoodist position. A Bayesian is defined, not by using a prior, but by being willing to use a controversial prior.
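The arithmetic behind the Mars example is Bayes' rule in odds form: posterior odds equal prior odds times the likelihood ratio. Here is a minimal sketch of that calculation. The function name `posterior_odds` and the two extra readers are hypothetical illustrations, not anything from the book. Only the figures 1⁄4, 1⁄3 and 1⁄12 come from the example.

```python
from fractions import Fraction


def posterior_odds(prior_odds: Fraction, likelihood_ratio: Fraction) -> Fraction:
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio."""
    return prior_odds * likelihood_ratio


# Figures from the Mars example: prior odds 1/4, likelihood ratio 1/3.
likelihood_ratio = Fraction(1, 3)
scientist = posterior_odds(Fraction(1, 4), likelihood_ratio)

# Hypothetical readers who bring their own priors (assumed values, for illustration only).
reader_priors = {"skeptic": Fraction(1, 100), "enthusiast": Fraction(2, 1)}
reader_posteriors = {
    name: posterior_odds(prior, likelihood_ratio)
    for name, prior in reader_priors.items()
}

print(scientist)          # the scientist's revised odds
print(reader_posteriors)  # what each hypothetical reader would conclude
```

The published likelihood ratio is the same for everyone. Only the prior, and therefore the posterior, differs from reader to reader.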
Felsenstein apparently defines himself as a “likelihoodist” rather than a “frequentist” or “Bayesian”.
“Likelihoodist” is so clunky and dull. I prefer “likelihoodlum”—it’s just as clunky, but at least it’s somewhat inflammatory.
There are two slightly different meanings of what it is to be a “Bayesian”: philosophically, there is a Bayesian interpretation of probability theory, and practically, there are Bayesian methods in statistics. I see Felsenstein as saying that, even if one is a Bayesian philosophically, one ought to practise as a likelihoodist.
In original research, I agree; there is not much point in reporting posteriors. Certainly there’s no point in reporting them without also reporting the original priors, but better just to report the likelihoods and let readers supply their own priors.
On the other hand, in summaries for a broad readership, the posteriors are the most important result to report. Now most readers don’t have the expertise to bring their own priors, so you have to give them yours. And then do the calculation for them.
Good point. It would be irresponsible to publish a news item that “the Prime Minister’s support for this bill is three times more likely if he is, in fact, a lizard alien than if he is a human” without noting that the prior probability for him being a lizard alien is pretty low.
And yet they do this all the frigging time in medical stories, as documented extensively on, for instance, Bad Science.
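To make the lizard-alien headline concrete, here is a rough sketch of how little a likelihood ratio of 3 moves a very low prior. The prior of one in a billion is purely an assumed number for illustration, and the helper names are hypothetical. Only the factor of three comes from the headline.

```python
def odds_from_probability(p: float) -> float:
    """Convert a probability (strictly between 0 and 1) to odds in favour."""
    return p / (1.0 - p)


def probability_from_odds(odds: float) -> float:
    """Convert odds in favour back to a probability."""
    return odds / (1.0 + odds)


def update(prior_probability: float, likelihood_ratio: float) -> float:
    """Posterior probability after multiplying the prior odds by the likelihood ratio."""
    prior_odds = odds_from_probability(prior_probability)
    return probability_from_odds(prior_odds * likelihood_ratio)


LIKELIHOOD_RATIO = 3.0  # "three times more likely", from the headline
ASSUMED_PRIOR = 1e-9    # hypothetical prior probability of a lizard alien

posterior = update(ASSUMED_PRIOR, LIKELIHOOD_RATIO)
print(f"prior {ASSUMED_PRIOR:.1e} -> posterior {posterior:.1e}")
```

With a prior that small, the posterior is still tiny, which is why reporting the ratio alone to a general audience is misleading.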
I’m curious, though, as to what all you are giving up by not talking about priors. In Felsenstein’s field—roughly, constructing best estimates of the “tree of life”—you very frequently have prior information which you want to bring to the problem, but of course you don’t want to bring in any prior information which is not neutral on the controversial issue that your study is supposed to shed light on.
One of the advantages of a Bayesian methodology is supposed to be the ability to combine information from sources with different qualities and coverages. To what extent are you prevented from doing that if you insist on doing all of your likelihood ratio work behind a “veil of ignorance”?
Well, let’s be very explicit about that then. A good report will:
state the priors used for everything that the study is not directly about but that is still relevant to it,
remain quiet about priors for what the study is directly about, giving likelihood ratios instead.
More mathematically, suppose that you make certain assumptions A which, in full completeness, are not just things like “I assume that a certain sample has been dated correctly.” but “I put the following probability distribution on the dates of this sample.” This is very lengthy, which is the inconvenient part; although if you make simplified assumptions for purposes of your calculations, then you would put simplified assumptions in your text too. So it shouldn’t really be any more inconvenient than whatever goes into your analysis.
But what you are testing is not A, but some hypothesis H (that the ancestors of Homo and Pan split after they split from Gorilla, for example; notice that this only makes sense if A includes that these three genera are really clades and that evolution of these animals really is a branching tree, although these are pretty common assumptions). And you have some evidence E.
Then in addition to A (which goes into your introduction, or maybe your appendix; anyway, it’s logically prior to examining E), you also report the likelihood ratio P(E|A&H)/P(E|A&!H), which goes into your conclusion. Then maybe you also state P(H|A) and calculate P(H|A&E), just in case people want to read about that, but that is not really your result.
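For readers who want to do the optional last step themselves, the relation between the reported likelihood ratio and the posterior can be written out as below. This is a sketch in standard notation, where \neg H stands for the !H used above and every probability is conditioned on the background assumptions A.

```latex
% Bayes' theorem for H and for \neg H, both conditioned on A:
\[
P(H \mid A \,\&\, E) = \frac{P(E \mid A \,\&\, H)\, P(H \mid A)}{P(E \mid A)},
\qquad
P(\neg H \mid A \,\&\, E) = \frac{P(E \mid A \,\&\, \neg H)\, P(\neg H \mid A)}{P(E \mid A)}.
\]
% Dividing the first by the second cancels P(E | A):
\[
\frac{P(H \mid A \,\&\, E)}{P(\neg H \mid A \,\&\, E)}
=
\underbrace{\frac{P(E \mid A \,\&\, H)}{P(E \mid A \,\&\, \neg H)}}_{\text{reported likelihood ratio}}
\times
\underbrace{\frac{P(H \mid A)}{P(\neg H \mid A)}}_{\text{reader's prior odds}}.
\]
% Writing O for the posterior odds on the left-hand side:
\[
P(H \mid A \,\&\, E) = \frac{O}{1 + O}.
\]
```

The first factor on the right is what the study reports, and the second is what each reader supplies.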
“Maximum likelihood” totally != “report likelihood ratios”.
Yes, I know, as I’m sure does Felsenstein. The book covered much more than maximum likelihood. The recommendation to report likelihood ratios came in the first of two chapters on Bayesian methods. The second involved hidden Markov models.
The book begins (as does the field) with a tree-building method called ‘maximum parsimony’. Maximum likelihood is a step up in sophistication from that, and Felsenstein is largely responsible for that step forward. I’m not really sure why he is not an enthusiastic Bayesian. My guess would be that it is because he is a professional statistician and the whole discipline of statistics traditionally consists of ways of drawing totally objective conclusions from data.
This position has also been expressed here.
(Should this article be migrated from OB to LW?)