Interpretations of “probability”

(Writ­ten for Ar­bital in 2016.)

What does it mean to say that a flipped coin has a 50% prob­a­bil­ity of land­ing heads?

Historically, there have been two popular types of answers to this question, the “frequentist” and “subjective” (aka “Bayesian”) answers, which give rise to radically different approaches to experimental statistics. A third, “propensity” viewpoint is largely discredited, at least for deterministic events such as a coin flip. Roughly, the three approaches answer the above question as follows:

  • The propensity interpretation: Some probabilities are just out there in the world. It’s a brute fact about coins that they come up heads half the time. A flipped coin has a fundamental propensity of 0.5 to show heads. When we say the coin has a 50% probability of being heads, we’re talking directly about this propensity.

  • The fre­quen­tist in­ter­pre­ta­tion: When we say the coin has a 50% prob­a­bil­ity of be­ing heads af­ter this flip, we mean that there’s a class of events similar to this coin flip, and across that class, coins come up heads about half the time. That is, the fre­quency of the coin com­ing up heads is 50% in­side the event class, which might be “all other times this par­tic­u­lar coin has been tossed” or “all times that a similar coin has been tossed”, and so on.

  • The sub­jec­tive in­ter­pre­ta­tion: Uncer­tainty is in the mind, not the en­vi­ron­ment. If I flip a coin and slap it against my wrist, it’s already landed ei­ther heads or tails. The fact that I don’t know whether it landed heads or tails is a fact about me, not a fact about the coin. The claim “I think this coin is heads with prob­a­bil­ity 50%” is an ex­pres­sion of my own ig­no­rance, and 50% prob­a­bil­ity means that I’d bet at 1 : 1 odds (or bet­ter) that the coin came up heads.
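As a rough illustration of the frequentist reading, the sketch below simulates an event class of coin tosses and reports the fraction that land heads. The function name `heads_frequency`, the seed, and the class sizes are assumptions chosen for illustration; the "coin" is just a pseudo-random number generator, not a model of a physical coin.

```python
import random


def heads_frequency(n_flips: int, p_heads: float = 0.5, seed: int = 0) -> float:
    """Return the fraction of heads across a simulated class of similar tosses."""
    if n_flips <= 0:
        raise ValueError("n_flips must be positive")
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.random() < p_heads for _ in range(n_flips))
    return heads / n_flips


if __name__ == "__main__":
    # The frequency in a small class can be far from 0.5;
    # it settles near 0.5 only as the class grows.
    for n in (10, 1_000, 100_000):
        print(n, heads_frequency(n))
```

The frequentist claim is about the fraction this function returns for a large class, not about any single toss inside it.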

For a vi­su­al­iza­tion of the differ­ences be­tween these three view­points, see Cor­re­spon­dence vi­su­al­iza­tions for differ­ent in­ter­pre­ta­tions of “prob­a­bil­ity”. For ex­am­ples of the differ­ence, see Prob­a­bil­ity in­ter­pre­ta­tions: Ex­am­ples. See also the Stan­ford En­cy­clo­pe­dia of Philos­o­phy ar­ti­cle on in­ter­pre­ta­tions of prob­a­bil­ity.

The propen­sity view is per­haps the most in­tu­itive view, as for many peo­ple, it just feels like the coin is in­trin­si­cally ran­dom. How­ever, this view is difficult to rec­on­cile with the idea that once we’ve flipped the coin, it has already landed heads or tails. If the event in ques­tion is de­cided de­ter­minis­ti­cally, the propen­sity view can be seen as an in­stance of the mind pro­jec­tion fal­lacy: When we men­tally con­sider the coin flip, it feels 50% likely to be heads, so we find it very easy to imag­ine a world in which the coin is fun­da­men­tally 50%-heads-ish. But that feel­ing is ac­tu­ally a fact about us, not a fact about the coin; and the coin has no phys­i­cal 0.5-heads-propen­sity hid­den in there some­where — it’s just a coin.

The other two interpretations are both self-consistent and give rise to pragmatically different statistical techniques; there has been much debate as to which is preferable. The subjective interpretation is more generally applicable, as it allows one to assign probabilities (interpreted as betting odds) to one-off events.

Fre­quen­tism vs subjectivism

As an ex­am­ple of the differ­ence be­tween fre­quen­tism and sub­jec­tivism, con­sider the ques­tion: “What is the prob­a­bil­ity that Hillary Clin­ton will win the 2016 US pres­i­den­tial elec­tion?”, as an­a­lyzed in the sum­mer of 2016.

A stereo­typ­i­cal (straw) fre­quen­tist would say, “The 2016 pres­i­den­tial elec­tion only hap­pens once. We can’t ob­serve a fre­quency with which Clin­ton wins pres­i­den­tial elec­tions. So we can’t do any statis­tics or as­sign any prob­a­bil­ities here.”

A stereo­typ­i­cal sub­jec­tivist would say: “Well, pre­dic­tion mar­kets tend to be pretty well-cal­ibrated about this sort of thing, in the sense that when pre­dic­tion mar­kets as­sign 20% prob­a­bil­ity to an event, it hap­pens around 1 time in 5. And the pre­dic­tion mar­kets are cur­rently bet­ting on Hillary at about 3 : 1 odds. Thus, I’m com­fortable say­ing she has about a 75% chance of win­ning. If some­one offered me 20 : 1 odds against Clin­ton — they get $1 if she loses, I get $20 if she wins — then I’d take the bet. I sup­pose you could re­fuse to take that bet on the grounds that you Just Can’t Talk About Prob­a­bil­ities of One-off Events, but then you’d be pointlessly pass­ing up a re­ally good bet.”
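The subjectivist's arithmetic can be sketched as follows. The helper names (`odds_to_probability`, `expected_value`, `break_even_probability`) are made up for this illustration; the numbers are the ones quoted above (3 : 1 market odds, a bet that pays $20 on a win and costs $1 on a loss).

```python
def odds_to_probability(in_favor: float, against: float) -> float:
    """Convert odds of in_favor : against into a probability."""
    return in_favor / (in_favor + against)


def expected_value(p_win: float, payout_if_win: float, loss_if_lose: float) -> float:
    """Expected dollar gain of a bet, given a subjective win probability."""
    return p_win * payout_if_win - (1.0 - p_win) * loss_if_lose


def break_even_probability(payout_if_win: float, loss_if_lose: float) -> float:
    """Smallest win probability at which the bet stops losing money on average."""
    return loss_if_lose / (payout_if_win + loss_if_lose)


p_clinton = odds_to_probability(3, 1)             # 3 : 1 odds -> 0.75
ev = expected_value(p_clinton, 20.0, 1.0)         # 0.75 * 20 - 0.25 * 1 = 14.75
threshold = break_even_probability(20.0, 1.0)     # 1 / 21, a little under 5%

if __name__ == "__main__":
    print(p_clinton, ev, threshold)
```

Under these assumptions the bet is favorable for anyone whose probability for a Clinton win exceeds roughly 5%, which is why even the non-straw frequentist below would take it.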

A stereotypical (non-straw) frequentist would reply: “I’d take that bet too, of course. But my taking that bet is not based on rigorous epistemology, and we shouldn’t allow that sort of thinking in experimental science and other important venues. You can do subjective reasoning about probabilities when making bets, but we should exclude subjective reasoning in our scientific journals, and that’s what frequentist statistics is designed for. Your paper should not conclude ‘and therefore, having observed thus-and-such data about carbon dioxide levels, I’d personally bet at 9 : 1 odds that anthropogenic global warming is real,’ because you can’t build scientific consensus on opinions.”

...and then it starts getting complicated. The subjectivist responds, “First of all, I agree you shouldn’t put posterior odds into papers, and second of all, it’s not like your method is truly objective: the choice of ‘similar events’ is arbitrary, abusable, and has given rise to p-hacking and the replication crisis.” The frequentist replies, “Well, your choice of prior is even more subjective, and I’d like to see you do better in an environment where peer pressure pushes people to abuse statistics and exaggerate their results,” and then down the rabbit hole we go.

The subjectivist interpretation of probability is common among artificial intelligence researchers (who often design computer systems that manipulate subjective probability distributions) and Wall Street traders (who need to be able to make bets even in relatively unique situations), and it matches common intuition (where people feel like they can say there’s a 30% chance of rain tomorrow without worrying about the fact that tomorrow only happens once). Nevertheless, the frequentist interpretation is commonly taught in introductory statistics classes, and is the gold standard for most scientific journals.

A com­mon fre­quen­tist stance is that it is vir­tu­ous to have a large toolbox of statis­ti­cal tools at your dis­posal. Sub­jec­tivist tools have their place in that toolbox, but they don’t de­serve any par­tic­u­lar pri­macy (and they aren’t gen­er­ally ac­cepted when it comes time to pub­lish in a sci­en­tific jour­nal).

An aggressive subjectivist stance is that frequentists have invented some interesting tools, and many of them are useful, but that refusing to consider subjective probabilities is toxic. Frequentist statistics were invented in a (failed) attempt to keep subjectivity out of science in a time before humanity really understood the laws of probability theory. Now we have theorems about how to manage subjective probabilities correctly, and how to factor personal beliefs out from the objective evidence provided by the data, and if you ignore these theorems you’ll get in trouble. The frequentist interpretation is broken, and that’s why science has p-hacking and a replication crisis even as all the Wall Street traders and AI scientists use the Bayesian interpretation. This “let’s compromise and agree that everyone’s viewpoint is valid” thing is all well and good, but how much worse do things need to get before we say “oops” and start acknowledging the subjective probability interpretation across all fields of science?

The most com­mon stance among sci­en­tists and re­searchers is much more ag­nos­tic, along the lines of “use what­ever statis­ti­cal tech­niques work best at the time, and use fre­quen­tist tech­niques when pub­lish­ing in jour­nals be­cause that’s what ev­ery­one’s been do­ing for decades upon decades upon decades, and that’s what ev­ery­one’s ex­pect­ing.”

See also Sub­jec­tive prob­a­bil­ity and Like­li­hood func­tions, p-val­ues, and the repli­ca­tion crisis.

Which in­ter­pre­ta­tion is most use­ful?

Prob­a­bly the sub­jec­tive in­ter­pre­ta­tion, be­cause it sub­sumes the propen­sity and fre­quen­tist in­ter­pre­ta­tions as spe­cial cases, while be­ing more flex­ible than both.

When the fre­quen­tist “similar event” class is clear, the sub­jec­tivist can take those fre­quen­cies (of­ten called base rates in this con­text) into ac­count. But un­like the fre­quen­tist, she can also com­bine those base rates with other ev­i­dence that she’s seen, and as­sign prob­a­bil­ities to one-off events, and make money in pre­dic­tion mar­kets and/​or stock mar­kets (when she knows some­thing that the mar­ket doesn’t).
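One standard way to combine a base rate with further evidence is Bayes' rule in odds form: posterior odds equal prior odds times the likelihood ratio of the evidence. The sketch below assumes that form; the function name `update_on_evidence` and the example numbers (a 20% base rate and evidence four times likelier if the event is true) are hypothetical and not taken from the text.

```python
def update_on_evidence(base_rate: float, likelihood_ratio: float) -> float:
    """Combine a base rate with evidence using Bayes' rule in odds form.

    likelihood_ratio = P(evidence | event) / P(evidence | no event).
    """
    if not 0.0 < base_rate < 1.0:
        raise ValueError("base_rate must be strictly between 0 and 1")
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood_ratio must be non-negative")
    prior_odds = base_rate / (1.0 - base_rate)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


if __name__ == "__main__":
    # Hypothetical numbers: the reference class says 20%, and the extra
    # evidence is 4 times as likely if the event is going to happen.
    print(update_on_evidence(0.20, 4.0))  # prior odds 1 : 4 become 1 : 1
```

A likelihood ratio of 1 leaves the base rate unchanged, which is the sense in which the frequentist's number is recovered as a special case when there is no other evidence.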

When the laws of physics actually do “contain uncertainty”, such as when they say that there are multiple different observations you might make next with differing likelihoods (as the Schrödinger equation often will), a subjectivist can combine her propensity-style uncertainty with her personal uncertainty in order to generate her aggregate subjective probabilities. But unlike a propensity theorist, she’s not forced to think that all uncertainty is physical uncertainty: She can act like a propensity theorist with respect to Schrödinger-equation-induced uncertainty, while still believing that her uncertainty about a coin that has already been flipped and slapped against her wrist is in her head, rather than in the coin.
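One way to read "combine" here is the law of total probability: weight the physical chance under each hypothesis by one's personal credence in that hypothesis. The sketch below assumes that reading; the function `aggregate_probability`, the hypothesis labels, and all numbers are hypothetical.

```python
import math


def aggregate_probability(credences: dict[str, float], chances: dict[str, float]) -> float:
    """Mix physical chances using personal credences over hypotheses.

    credences[h]: subjective probability that hypothesis h describes the setup.
    chances[h]:   physical chance of the outcome if h is true.
    """
    if credences.keys() != chances.keys():
        raise ValueError("credences and chances must cover the same hypotheses")
    if not math.isclose(sum(credences.values()), 1.0):
        raise ValueError("credences must sum to 1")
    return sum(credences[h] * chances[h] for h in credences)


if __name__ == "__main__":
    # Hypothetical: she is 70% sure the apparatus is in setup A, where physics
    # gives the outcome a chance of 0.5, and 30% sure it is in setup B (0.9).
    credences = {"setup_A": 0.7, "setup_B": 0.3}
    chances = {"setup_A": 0.5, "setup_B": 0.9}
    print(aggregate_probability(credences, chances))  # 0.7*0.5 + 0.3*0.9
```

If she is certain which setup she is in, the result is just the physical chance; if every chance is 0 or 1, the result is purely her personal uncertainty.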

This fully gen­eral stance is con­sis­tent with the be­lief that fre­quen­tist tools are use­ful for an­swer­ing fre­quen­tist ques­tions: The fact that you can per­son­ally as­sign prob­a­bil­ities to one-off events (and, e.g., eval­u­ate how good a cer­tain trade is on a pre­dic­tion mar­ket or a stock mar­ket) does not mean that tools la­beled “Bayesian” are always bet­ter than tools la­beled “fre­quen­tist”. What­ever in­ter­pre­ta­tion of “prob­a­bil­ity” you use, you’re en­couraged to use what­ever statis­ti­cal tool works best for you at any given time, re­gard­less of what “camp” the tool comes from. Don’t let the fact that you think it’s pos­si­ble to as­sign prob­a­bil­ities to one-off events pre­vent you from us­ing use­ful fre­quen­tist tools!