Heads I Win, Tails?—Never Heard of Her; Or, Selective Reporting and the Tragedy of the Green Rationalists

Followup to: What Evidence Filtered Evidence?

In “What Evidence Filtered Evidence?”, we are asked to consider a scenario involving a coin that is either biased to land Heads 2/3rds of the time, or Tails 2/3rds of the time. Observing Heads is 1 bit of evidence for the coin being Heads-biased (because the Heads-biased coin lands Heads with probability 2/3, the Tails-biased coin does so with probability 1/3, the likelihood ratio of these is (2/3)/(1/3) = 2, and log₂ 2 = 1), and analogously and respectively for Tails.

If such a coin is flipped ten times by someone who doesn’t make literally false statements, who then reports that the 4th, 6th, and 9th flips came up Heads, then the update to our beliefs about the coin depends on what algorithm the not-lying[1] reporter used to decide to report those flips in particular. If they always report the 4th, 6th, and 9th flips independently of the flip outcomes—if there’s no evidential entanglement between the flip outcomes and the choice of which flips get reported—then reported flip-outcomes can be treated the same as flips you observed yourself: three Headses is 3 * 1 = 3 bits of evidence in favor of the hypothesis that the coin is Heads-biased. (So if we were initially 50:50 on the question of which way the coin is biased, our posterior odds after collecting 3 bits of evidence for a Heads-biased coin would be 2³:1 = 8:1, or a probability of 8/(1 + 8) ≈ 0.89 that the coin is Heads-biased.)

On the other hand, if the reporter mentions only and exactly the flips that came out Heads, then we can infer that the other 7 flips came out Tails (if they didn’t, the reporter would have mentioned them), giving us posterior odds of 2³:2⁷ = 8:128 = 1:16, or a probability of around 0.06 that the coin is Heads-biased.
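To make the contrast concrete, here is a minimal sketch of both updates in Python, assuming the same 1:1 prior and ten-flip setup as above (the variable names are just illustrative choices):

```python
# Prior odds (Heads-biased : Tails-biased) are 1:1.
prior_odds = 1.0

# Unfiltered reporting: the 4th, 6th, and 9th flips would have been reported
# however they landed, so three reported Headses are 3 bits of evidence for
# the Heads-biased hypothesis, and the other 7 flips tell us nothing.
odds_unfiltered = prior_odds * 2**3
print(odds_unfiltered / (1 + odds_unfiltered))  # ≈ 0.89

# Filtered reporting: only Heads get mentioned, so the 7 unmentioned flips
# must have come up Tails: 3 bits for Heads-biased, 7 bits against.
odds_filtered = prior_odds * 2**3 / 2**7
print(odds_filtered / (1 + odds_filtered))  # ≈ 0.06
```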

So far, so standard. (You did read the Sequences, right??) What I’d like to emphasize about this scenario today, however, is that while a Bayesian reasoner who knows the non-lying reporter’s algorithm of what flips to report will never be misled by the selective reporting of flips, a Bayesian with mistaken beliefs about the reporter’s decision algorithm can be misled quite badly: compare the 0.89 and 0.06 probabilities we just derived given the same reported outcomes, but different assumptions about the reporting algorithm.

If the coin gets flipped a sufficiently large number of times, a reporter whom you trust to be impartial (but isn’t), can make you believe anything she wants without ever telling a single lie, just with appropriate selective reporting. Imagine a very biased coin that comes up Heads 99% of the time. If it gets flipped ten thousand times, 100 of those flips will be Tails (in expectation), giving a selective reporter plenty of examples to point to if she wants to convince you that the coin is extremely Tails-biased.
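A minimal simulation sketch of that 99%-Heads coin (the seed and flip count are arbitrary choices for illustration):

```python
import random

random.seed(2019)  # arbitrary seed, for reproducibility only
flips = ["Heads" if random.random() < 0.99 else "Tails" for _ in range(10_000)]
tails_flips = [i for i, outcome in enumerate(flips) if outcome == "Tails"]

# A selective reporter can truthfully cite each of these ~100 Tails flips
# (by index) while never mentioning the ~9,900 Heads flips.
print(len(tails_flips))
```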

Toy models about biased coins are instructive for constructing examples with explicitly calculable probabilities, but the same structure applies to any real-world situation where you’re receiving evidence from other agents, and you have uncertainty about what algorithm is being used to determine what reports get to you. Reality is like the coin’s bias; evidence and arguments are like the outcome of a particular flip. Wrong theories will still have some valid arguments and evidence supporting them (as even a very Heads-biased coin will come up Tails sometimes), but theories that are less wrong will have more.

If selective reporting is mostly due to the idiosyncratic bad intent of rare malicious actors, then you might hope for safety in (the law of large) numbers: if Helga in particular is systematically more likely to report Headses than Tailses that she sees, then her flip reports will diverge from everyone else’s, and you can take that into account when reading Helga’s reports. On the other hand, if selective reporting is mostly due to systemic structural factors that result in correlated selective reporting even among well-intentioned people who are being honest as best they know how,[2] then you might have a more serious problem.

“A Fable of Science and Politics” depicts a fictional underground Society polarized between two partisan factions, the Blues and the Greens. “[T]here is a ‘Blue’ and a ‘Green’ position on almost every contemporary issue of political or cultural importance.” If human brains consistently understood the is/ought distinction, then political or cultural alignment with the Blue or Green agenda wouldn’t distort people’s beliefs about reality. Unfortunately … humans. (I’m not even going to finish the sentence.)

Reality itself isn’t on anyone’s side, but any particular fact, argument, sign, or portent might just so happen to be more easily construed as “supporting” the Blues or the Greens. The Blues want stronger marriage laws; the Greens want no-fault divorce. An evolutionary psychologist investigating effects of kin-recognition mechanisms on child abuse by stepparents might aspire to scientific objectivity, but being objective and staying objective is difficult when you’re embedded in an intelligent social web in which your work is going to be predictably championed by Blues and reviled by Greens.

Let’s make another toy model to try to understand the resulting distortions on the Undergrounders’ collective epistemology. Suppose Reality is a coin—no, not a coin, a three-sided die,[3] with faces colored blue, green, and gray. One-third of the time it comes up blue (representing a fact that is more easily construed as supporting the Blue narrative), one-third of the time it comes up green (representing a fact that is more easily construed as supporting the Green narrative), and one-third of the time it comes up gray (representing a fact that not even the worst ideologues know how to spin as “supporting” their side).

Suppose each faction has social-punishment mechanisms enforcing consensus internally. Without loss of generality, take the Greens (with the understanding that everything that follows goes just the same if you swap “Green” for “Blue” and vice versa).[4] People observe rolls of the die of Reality, and can freely choose what rolls to report—except a resident of a Green city who reports more than 1 blue roll for every 3 green rolls is assumed to be a secret Blue Bad Guy, and faces increasing social punishment as their ratio of reported green to blue rolls falls below 3:1. (Reporting gray rolls is always safe.)

The punishment is typically informal: there’s no official censorship from Green-controlled local governments, just a visible incentive gradient made out of social-media pile-ons, denied promotions, lost friends and mating opportunities, increased risk of being involuntarily committed to psychiatric prison,[5] &c. Even people who privately agree with dissident speech might participate in punishing it, the better to evade punishment themselves.

This scenario presents a problem for people who live in Green cities who want to make and share accurate models of reality. It’s impossible to report every die roll (the only 1:1 scale map of the territory, is the territory itself), but it seems clear that the most generally useful models—the ones you would expect arbitrary AIs to come up with—aren’t going to be sensitive to which facts are “blue” or “green”. The reports of aspiring epistemic rationalists who are just trying to make sense of the world will end up being about one-third blue, one-third green, and one-third gray, matching the distribution of the Reality die.

From the perspective of ordinary nice smart Green citizens who have not been trained in the Way, these reports look unthinkably Blue. Aspiring epistemic rationalists who are actually paying attention can easily distinguish Blue partisans from actual truthseekers,[6] but the social-punishment machinery can’t process more than five words at a time. The social consequences of being an actual Blue Bad Guy, or just an honest nerd who doesn’t know when to keep her stupid trap shut, are the same.

In this scenario,[7] public opinion within a subculture or community in a Green area is constrained by the 3:1 (green:blue) “Overton ratio.” In particular, under these conditions, it’s impossible to have a rationalist community—at least the most naïve conception of such. If your marketing literature says, “Speak the truth, even if your voice trembles,” but all the savvy high-status people’s actual reporting algorithm is, “Speak the truth, except when that would cause the local social-punishment machinery to mark me as a Blue Bad Guy and hurt me and any people or institutions I’m associated with—in which case, tell the most convenient lie-of-omission”, then smart sincere idealists who have internalized your marketing literature as a moral ideal and trust the community to implement that ideal, are going to be misled by the community’s stated beliefs—and confused at some of the pushback they get when submitting reports with a 1:1:1 blue:green:gray ratio.

Well, misled to some extent—maybe not much! In the absence of an Oracle AI (or a competing rationalist community in Blue territory) to compare notes with, it’s not clear how one could get a better map than trusting what the “green rationalists” say. With a few more made-up modeling assumptions, we can quantify the distortion introduced by the Overton-ratio constraint, which will hopefully help develop an intuition for how large of a problem this sort of thing might be in real life.

Imagine that Society needs to make a decision about an Issue (like a question about divorce law or merchant taxes). Suppose that the facts relevant to making optimal decisions about an Issue are represented by nine rolls of the Reality die, and that the quality (utility) of Society’s decision is proportional to the (base-two) entropy of the distribution of what facts get heard and discussed.[8]

The maximum achievable decision quality is log₂ 9 ≈ 3.17.

On average, Green partisans will find 3 “green” facts[9] and 3 “gray” facts to report, and mercilessly stonewall anyone who tries to report any “blue” facts, for a decision quality of log₂ 6 ≈ 2.58.

On average, the Overton-constrained rationalists will report the same 3 “green” and 3 “gray” facts, but something interesting happens with “blue” facts: each individual can only afford to report one “blue” fact without blowing their Overton budget—but it doesn’t have to be the same fact for each person. Reports of all 3 (on average) blue rolls get to enter the public discussion, but get mentioned (cited, retweeted, &c.) 1/3 as often as green or gray rolls, in accordance with the Overton ratio. So it turns out that the constrained rationalists end up with a decision quality of (6/7)·log₂ 7 + (1/7)·log₂ 21 ≈ 3.03, significantly better than the Green partisans—but still falling short of the theoretical ideal where all the relevant facts get their due attention.
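Under the stated made-up assumptions (nine equally relevant facts, decision quality equal to the base-two entropy of the attention distribution), the three numbers above can be checked with a short Python sketch; the decision_quality helper is just a name I’m inventing for this illustration:

```python
from math import log2

def decision_quality(weights):
    """Shannon entropy (in bits) of the normalized attention distribution."""
    total = sum(weights)
    return -sum(w / total * log2(w / total) for w in weights)

# Ideal: all nine facts get equal attention.
print(decision_quality([1] * 9))                # log2(9) ≈ 3.17

# Green partisans: 3 green + 3 gray facts; blue facts are stonewalled entirely.
print(decision_quality([1] * 6))                # log2(6) ≈ 2.58

# Overton-constrained rationalists: the 3 blue facts are mentioned 1/3 as often.
print(decision_quality([1] * 6 + [1 / 3] * 3))  # ≈ 3.03
```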

If it’s just not pragmatic to expect people to defy their incentives, is this the best we can do? Accept a somewhat distorted state of discourse, forever?

At least one partial remedy seems apparent. Recall from our original coin-flipping example that a Bayesian who knows what the filtering process looks like, can take it into account and make the correct update. If you’re filtering your evidence to avoid social punishment, but it’s possible to clue in your fellow rationalists to your filtering algorithm without triggering the social-punishment machinery—you mustn’t assume that everyone already knows!—that’s potentially a big win. In other words, blatant cherry-picking is the best kind!


  1. I don’t quite want to use the word honest here. ↩︎

  2. And it turns out that knowing how to be honest is much more work than one might initially think. You have read the Sequences, right?! ↩︎

  3. For lack of an appropriate Platonic solid in three-dimensional space, maybe imagine tossing a triangle in two-dimensional space?? ↩︎

  4. As an author, I’m facing some conflicting desiderata in my color choices here. I want to say “Blues and Greens” in that order for consistency with “A Fable of Science and Politics” (and other classics from the Sequences). Then when making an arbitrary choice to talk in terms of one of the factions in order to avoid cluttering the exposition, you might have expected me to say “Without loss of generality, take the Blues,” because the first item in a sequence (“Blues” in “Blues and Greens”) is more of a Schelling point than the second, or last, item. But I don’t want to take the Blues, because that color choice has other associations that I’m trying to avoid right now: if I said “take the Blues”, I fear many readers would assume that I’m trying to directly push a partisan point about soft censorship and preference-falsification social pressures in liberal/left-leaning subcultures in the contemporary United States. To be fair, it’s true that soft censorship and preference-falsification social pressures in liberal/left-leaning subcultures in the contemporary United States are, historically, what inspired me, personally, to write this post. It’s okay for you to notice that! But I’m trying to talk about the general mechanisms that generate this class of distortions on a Society’s collective epistemology, independently of which faction or which ideology happens to be “on top” in a particular place and time. If I’m doing my job right, then my analogue in a “nearby” Everett branch whose local subculture was as “right-polarized” as my Berkeley environment is “left-polarized”, would have written a post making the same arguments. ↩︎

  5. Okay, they market themselves as psychiatric “hospitals”, but let’s not be confused by misleading labels. ↩︎

  6. Or rather, aspiring epistemic rationalists can do a decent job of assessing the extent to which someone is exhibiting truth-tracking behavior, or Blue-partisan behavior. Obviously, people who are consciously trying to seek truth, are not necessarily going to succeed at overcoming bias, and attempts to correct for the “pro-Green” distortionary forces being discussed in this parable could easily veer into “pro-Blue” over-correction. ↩︎

  7. Please be appropriately skeptical about the real-world relevance of my made-up modeling assumptions! If it turned out that my choice of assumptions were (subconsciously) selected for the resulting conclusions about how bad evidence-filtering is, that would be really bad for the same reason that I’m claiming that evidence-filtering is really bad! ↩︎

  8. The entropy of a discrete probability distribution is maximized by the uniform distribution, in which all outcomes receive equal probability-mass. I only chose these “exactly nine equally-relevant facts/rolls” and “entropic utility” assumptions to make the arithmetic easy on me; a more realistic model might admit arbitrarily many facts into discussion of the Issue, but posit a distribution of facts/rolls with diminishing marginal relevance to Society’s decision quality. ↩︎

  9. The scare quotes around the adjective “‘green’” (&c.) when applied to the word “fact” (as opposed to a die roll outcome representing a fact in our toy model) are significant! The facts aren’t actually on anyone’s side! We’re trying to model the distortions that arise from stupid humans thinking that the facts are on someone’s side! This is sufficiently important—and difficult to remember—that I should probably repeat it until it becomes obnoxious! ↩︎