IlyaShpitser comments on One-Magisterium Bayes

IlyaShpitser 5 Jul 2017 20:45 UTC
0 points

You can find countless examples of academics weighing in on matters they aren’t really qualified for.

Yes, absolutely. See also SMBC’s “send in the bishops, they can move diagonally” (chess masters on the Iraq war).

is there any good way for me to know that he represents a minority or obsolete position.

I don’t know if Jaynes represents a minority position (there are a lot of Bayesian statisticians). It’s more like the field moved on from this argument to more interesting arguments. Basically smart Bayesians and frequentists mostly understood each other’s arguments, and considered them mostly valid.

This is the type of B vs F argument people have these days (I linked this here before):

https://normaldeviate.wordpress.com/2012/08/28/robins-and-wasserman-respond-to-a-nobel-prize-winner/

If you really want the gory details, you can also read the Robins/Ritov paper. But it’s a hard paper.

Full disclosure: Robins was my former boss, and I am probably predisposed to liking his stuff.

Re: “what’s a good way to know”: I would say ask experts. Stat profs love talking about this stuff, you can email your local one, and try to go for coffee or something.

Re: “freshman level,” this was perhaps uncharitable phrasing. I just perceive, perhaps incorrectly, a lot of LW discussions as the type of discussion that takes place in dorms everywhere.
- Wei Dai 6 Jul 2017 23:10 UTC
  0 points
  
  This is the type of B vs F argument people have these days (I linked this here before):
  
  I skimmed this a bit, and it seems like the argument went several rounds but was never actually resolved in the end? See Chris Sims’s last comment here, which Robins and Wasserman apparently never responded to. Also, besides this type of highly technical discussion, can you point us to some texts that explain the overall history and current state of the F vs B debate in the professional stats community? I’d like to understand how and why they moved on from the kinds of discussion that LW is still having.
  - Lumifer 7 Jul 2017 0:57 UTC
    0 points
    There is a recent book, Computer Age Statistical Inference, by Efron and Hastie (who are well-respected statisticians). They start by distinguishing three kinds of statistics: frequentist (by which they mean Neyman and Pearson with some reliance on Fisher); Bayesian, which everybody here knows well; and Fisherian, by which they mean mostly maximum likelihood and derivatives. They say that Fisher, though he was dismissive of the Bayesian approach, didn’t fully embrace frequentism either and blazed his own path somewhere in the middle.
    
    The book is downloadable as a PDF via the link.
  - IlyaShpitser 6 Jul 2017 23:31 UTC
    0 points
    We can ask Chris and Larry (I can if/when I see them).
    
    My take on the way this argument got resolved is that Chris and Larry/Jamie agree on the math—namely that to “solve” the example using B methods we need to have a prior that depends on pi. The possible source of disagreement is interpretational.
    
    Larry and Jamie think that this is Bayesians doing “frequentist pursuit”, that is using B machinery to mimic a fundamentally F behavior. As they say, there is nothing wrong with this, but the B here seems extraneous. Chris probably doesn’t see it that way, he probably thinks this is the natural way to do this problem in a B way.
    
    The weird thing about (what I think) Chris’ position here is that this example violates the “likelihood principle” some Bayesians like. The likelihood principle states that all information lives in the likelihood. Of course here the example is set up in such a way that the assignment probability pi(X) (a) is not a part of the likelihood and (b) is highly informative. The natural way for a Bayesian to deal with this is to stick pi(X) in the prior. This is formally ok, but kind of weird and unnatural.
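To make "not part of the likelihood, yet highly informative" concrete, here is a minimal simulation sketch. It is not the Robins/Ritov construction itself: the covariate is one-dimensional (so it does not reproduce the high-dimensional difficulty that drives the actual argument), and the functions `pi_of` and `theta_of`, the sample size, and the seed are all assumptions chosen purely for illustration. It compares an estimator that uses the known pi(X) (the Horvitz-Thompson form, inverse-probability weighting) with a complete-case mean that ignores pi(X), when the target is the population mean of Y and Y is only observed when R = 1.

```python
import random


def pi_of(x):
    """Known assignment probability pi(X); a hypothetical choice for this sketch."""
    return 0.1 + 0.8 * x


def theta_of(x):
    """P(Y = 1 | X); unknown to the analyst, a hypothetical choice for this sketch."""
    return x


def simulate(n, seed=0):
    """Return (horvitz_thompson, complete_case) estimates of E[Y]."""
    rng = random.Random(seed)
    ht_sum = 0.0          # running sum of R * Y / pi(X)
    observed_sum = 0      # sum of Y among units with R = 1
    observed_count = 0    # number of units with R = 1
    for _ in range(n):
        x = rng.random()                          # X ~ Uniform(0, 1)
        r = 1 if rng.random() < pi_of(x) else 0   # R = 1 means Y is observed
        y = 1 if rng.random() < theta_of(x) else 0
        if r:
            ht_sum += y / pi_of(x)                # weight by 1 / pi(X)
            observed_sum += y
            observed_count += 1
    horvitz_thompson = ht_sum / n
    complete_case = observed_sum / observed_count
    return horvitz_thompson, complete_case


if __name__ == "__main__":
    ht, cc = simulate(200_000)
    print(f"Horvitz-Thompson (uses pi): {ht:.3f}")
    print(f"Complete-case (ignores pi): {cc:.3f}")
```

Under these assumed functions the true value is E[theta(X)] = 0.5. The weighted estimate should land near that, while the complete-case mean converges to E[theta(X) pi(X)] / E[pi(X)], about 0.63, because units with larger X are both more likely to be observed and more likely to have Y = 1. The likelihood for theta is the same whatever pi is, so the only way this information enters a Bayesian analysis is through the prior.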
    
    How weird and unnatural it is is a matter of interpretation, I suppose.
    
    This example is very simple, there are much more complicated versions of this. For example, what if we don’t know pi(X), but have to model it? Does pi(X) still go into the prior? That way lie dragons...
    
    I guess my point is, these types of highly technical discussions are the discussions that professionals have if B vs F comes up. If this is too technical, may I ask why even get into this? Maybe this level of technicality is the natural point of technicality for this argument in this, the year of our Lord 2017? This is kind of my point, if you aren’t a professional, why are you even talking about this?
    
    It’s a good question about a history text on B vs F. Let me ask around.
    
    edit: re: dragons, I guess what I mean is, it seems most things in life can be phrased in F or B ways. But there are a lot of phenomena for which the B phrasing, though it exists, isn’t really very clarifying. These might include identification and model misspecification issues. In such cases the B phrasing just feels like carrying around ideological baggage.
    
    My philosophy is inherently multiparadigm—you use the style of kung fu that yields the most benefit or the most clarity for the problem. Sometimes that’s B and sometimes that’s F and sometimes that’s something else. I guess in your language that would be “instrumental rationality in data analysis.”