Bayesian Flame

There once lived a great man named E.T. Jaynes. He knew that Bayesian inference is the only way to do statistics logically and consistently, standing on the shoulders of misunderstood giants Laplace and Gibbs. On numerous occasions he vanquished traditional “frequentist” statisticians with his superior math, demonstrating to anyone with half a brain how the Bayesian way gives faster and more correct results in each example. The weight of evidence falls so heavily on one side that it makes no sense to argue anymore. The fight is over. Bayes wins. The universe runs on Bayes-structure.

Or at least that’s what you believe if you learned this stuff from Overcoming Bias.

Like I was until two days ago, when Cyan hit me over the head with something utterly incomprehensible. I suddenly had to go out and understand this stuff, not just believe it. (The original intention, if I remember it correctly, was to impress you all by pulling a Jaynes.) Now I’ve come back and intend to provoke a full-on flame war on the topic. Because if we can have thoughtful flame wars about gender but not math, we’re a bad community. Bad, bad community.

If you’re like me two days ago, you kinda “understand” what Bayesians do: assume a prior probability distribution over hypotheses, use evidence to morph it into a posterior distribution over same, and bless the resulting numbers as your “degrees of belief”. But chances are that you have a very vague idea of what frequentists do, apart from deriving half-assed results with their ad hoc tools.
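To make that move concrete, here’s a minimal sketch of the prior-to-posterior update, using a coin-flip model with a conjugate Beta prior. Every number below is made up for illustration; none of it comes from the post itself.

```python
# Minimal sketch of a Bayesian update: Beta prior over a coin's bias,
# binomial evidence, conjugate Beta posterior. Numbers are illustrative.
from scipy import stats

prior_a, prior_b = 1, 1        # Beta(1, 1): uniform prior over the bias
heads, tails = 7, 3            # the observed evidence

# Conjugacy: Beta(a, b) prior + binomial data -> Beta(a + heads, b + tails)
posterior = stats.beta(prior_a + heads, prior_b + tails)

print(posterior.mean())        # posterior "degree of belief" in heads, ~0.67
print(posterior.interval(0.9)) # central 90% credible interval
```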

Well, here’s the ultra-short version: frequentist statistics is the art of drawing true conclusions about the real world, instead of assuming prior degrees of belief and coherently adjusting them to avoid Dutch books.

And here’s an ultra-short example of what frequentists can do: estimate 100 independent unknown parameters from 100 different sample data sets and have 90 of the estimates turn out to be true to fact afterward. Like, fo’real. Always 90% in the long run, truly, irrevocably and forever. No Bayesian method known today can reliably do the same: the outcome will depend on the priors you assume for each parameter. I don’t believe you’re going to get lucky with all 100. And even if I believed you a priori (ahem) that don’t make it true.
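If you want to watch that guarantee happen, here’s a quick simulation sketch with every number invented by me: 100 unrelated means scattered arbitrarily (the uniform draw below is just my way of picking adversarial truths, not a prior the procedure gets to use), each estimated with a standard 90% confidence interval.

```python
# Sketch of the frequentist guarantee: 100 unrelated unknown means, each
# estimated with a 90% confidence interval from its own sample of size 25.
# About 90 of the 100 intervals should cover their true parameter.
import numpy as np

rng = np.random.default_rng(0)
true_means = rng.uniform(-1000, 1000, size=100)  # arbitrary truths, not a prior
n, sigma, z90 = 25, 1.0, 1.645                   # known noise; 1.645 = 95th normal percentile

covered = 0
for mu in true_means:
    sample = rng.normal(mu, sigma, size=n)
    half_width = z90 * sigma / np.sqrt(n)
    covered += abs(sample.mean() - mu) <= half_width

print(covered, "of 100 intervals cover their parameter")  # roughly 90
```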

(That’s what Jaynes did to achieve his awesome victories: use trained intuition to pick good priors by hand on a per-sample basis. Maybe you can learn this skill somewhere, but not from the Intuitive Explanation.)

How in the world do you do inference without a prior? Well, the characterization of frequentist statistics as “trickery” is totally justified: it has no single coherent approach, and the tricks often give conflicting results. Most everybody agrees that you can’t do better than Bayes if you have a clear-cut prior; but if you don’t, no one is going to kick you out. We sympathize with your predicament and will gladly sell you some twisted technology!

Confidence intervals: imagine you somehow process some sample data to get an interval. Further imagine that, hypothetically, for any given hidden parameter value, this calculation algorithm applied to data sampled under that parameter value yields an interval that covers it with probability 90%. Believe it or not, this perverse trick works 90% of the time without requiring any prior distribution on parameter values.
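In symbols (my notation, not anything from the post): the procedure maps data $X$ to an interval $[L(X), U(X)]$ such that

$$\forall\,\theta:\quad P_\theta\bigl(L(X) \le \theta \le U(X)\bigr) = 0.9,$$

where the probability is over the sampling distribution of $X$ under $\theta$. The quantifier ranges over every possible parameter value, which is why no distribution over $\theta$ ever enters the picture.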

Unbiased estimators: you process the sample data to get a number whose expectation magically coincides with the true parameter value.
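The textbook instances are the sample mean for the population mean and the $n-1$ sample variance for the variance. Here’s a throwaway simulation sketch (all numbers mine) showing the expectations lining up:

```python
# Sketch of unbiasedness: average each estimator over many repeated samples;
# its empirical expectation should sit on top of the true parameter.
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, n, trials = 3.7, 2.0, 10, 100_000

samples = rng.normal(mu, sigma, size=(trials, n))
mean_hat = samples.mean(axis=1)        # E[sample mean] = mu
var_hat = samples.var(axis=1, ddof=1)  # Bessel's n-1 correction makes this unbiased

print(mean_hat.mean())  # ~3.7
print(var_hat.mean())   # ~4.0 = sigma**2
```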

Hypothesis testing: I give you a black-box random distribution and claim it obeys a specified formula. You sample some data from the box and inspect it. Frequentism allows you to ~~call me a liar and be wrong no more than 10% of the time~~ reject truthful claims no more than 10% of the time, guaranteed, no prior in sight. (Thanks Eliezer for calling out the mistake, and conchis for the correction!)
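As a sketch of that guarantee (the setup and all numbers are mine): suppose the box really does obey the claimed formula, say a standard normal, and you test it at level 0.1 with a Kolmogorov–Smirnov test. Repeat the whole exercise many times and you reject the truthful claim about 10% of the time:

```python
# Sketch of the hypothesis-testing guarantee: when the claimed distribution
# is in fact the true one, a level-0.1 test rejects ~10% of the time.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
trials, n, alpha = 5000, 50, 0.10

false_alarms = 0
for _ in range(trials):
    data = rng.normal(0, 1, size=n)          # the box really is standard normal
    _, p_value = stats.kstest(data, "norm")  # test the claimed formula
    false_alarms += p_value < alpha          # rejecting a truthful claim

print(false_alarms / trials)  # ~0.10, and no prior anywhere in sight
```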

But this is getting too academic. I ought to throw you dry wood, good flame material. This hilarious PDF from Andrew Gelman should do the trick. Choice quote:

Well, let me tell you something. The 50 states aren’t exchangeable. I’ve lived in a few of them and visited nearly all the others, and calling them exchangeable is just silly. Calling it a hierarchical or multilevel model doesn’t change things—it’s an additional level of modeling that I’d rather not do. Call me old-fashioned, but I’d rather let the data speak without applying a probability distribution to something like the 50 states which are neither random nor a sample.

As a bonus, the bibliography to that article contains such marvelous titles as “Why Isn’t Everyone a Bayesian?” And Larry Wasserman’s followup is also quite disturbing.

Another stick for the fire is provided by Shalizi, who (among other things) makes the correct point that a good Bayesian must never be uncertain about the probability of any future event. That’s why he calls Bayesians “Often Wrong, Never In Doubt”:

The Bayesian, by definition, believes in a joint distribution of the random sequence X and of the hypothesis M. (Otherwise, Bayes’s rule makes no sense.) This means that by integrating over M, we get an unconditional, marginal probability for f.
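Spelled out in my own notation: if each hypothesis $M$ assigns a definite probability to a future event $f$, the prior $P(M)$ pins down a single marginal number,

$$P(f) \;=\; \sum_M P(f \mid M)\, P(M),$$

with nothing left to be uncertain about. That is exactly the “never in doubt” part.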

For my final quote it seems only fair to add one more polemical summary of Cyan’s point that made me sit up and look around in a bewildered manner. Credit to Wasserman again:

Pennypacker: You see, physics has really advanced. All those quantities I estimated have now been measured to great precision. Of those thousands of 95 percent intervals, only 3 percent contained the true values! They concluded I was a fraud.

van Nostrand: Pennypacker you fool. I never said those intervals would contain the truth 95 percent of the time. I guaranteed coherence not coverage!

Pennypacker: A lot of good that did me. I should have gone to that objective Bayesian statistician. At least he cares about the frequentist properties of his procedures.

van Nostrand: Well I’m sorry you feel that way Pennypacker. But I can’t be responsible for your incoherent colleagues. I’ve had enough now. Be on your way.

There’s often good reason to advocate a correct theory over a wrong one. But all this evidence (ahem) shows that switching to Guardian of Truth mode was, at the very least, premature for me. Bayes isn’t the correct theory for drawing conclusions about the world. As of today, we have no coherent theory for drawing conclusions about the world. Both perspectives have serious problems. So do yourself a favor and switch to truth-seeker mode.