# Techniques for probability estimates

Utility maximization often requires determining a probability of a particular statement being true. But humans are not utility maximizers and often refuse to give precise numerical probabilities. Nevertheless, their actions reflect a "hidden" probability. For example, even someone who refused to give a precise probability for Barack Obama's re-election would probably jump at the chance to take a bet in which ey lost \$5 if Obama wasn't re-elected but won \$5 million if he was; such a decision demands that the decider covertly be working from at least a vague probability.

When untrained people try to translate vague feelings like "It seems Obama will probably be re-elected" into a precise numerical probability, they commonly fall into certain traps and pitfalls that make their probability estimates inaccurate. Calling a probability estimate "inaccurate" causes philosophical problems, but these problems can be resolved by remembering that probability is "subjectively objective": although a mind "hosts" a probability estimate, that mind does not arbitrarily determine the estimate, but rather calculates it according to mathematical laws from the available evidence. These calculations require too much computational power to use outside the simplest hypothetical examples, but they provide a standard by which to judge real probability estimates. They also suggest tests by which one can judge probabilities as well-calibrated or poorly calibrated: for example, a person who constantly assigns 90% confidence to eir guesses but only guesses the right answer half the time is poorly calibrated. So calling a probability estimate "accurate" or "inaccurate" has a real philosophical grounding.
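The calibration test described here can be sketched in a few lines. This is an illustrative sketch with made-up records, not a real dataset:

```python
# Calibration check: for each stated confidence level, compare the
# fraction of guesses that were actually correct. The records below
# are invented for illustration.
from collections import defaultdict

def calibration(records):
    """records: list of (stated_confidence, was_correct) pairs."""
    buckets = defaultdict(list)
    for conf, correct in records:
        buckets[conf].append(correct)
    return {conf: sum(hits) / len(hits) for conf, hits in buckets.items()}

# A guesser who says "90% sure" but is right only half the time:
records = [(0.9, True)] * 5 + [(0.9, False)] * 5
print(calibration(records))  # {0.9: 0.5} -> poorly calibrated
```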

There exist several techniques that help people translate vague feelings of probability into more accurate numerical estimates. Most of them translate probabilities from forms without immediate consequences (which the brain supposedly processes for signaling purposes) to forms with immediate consequences (which the brain supposedly processes while focusing on those consequences).

## Prepare for Revelation

What would you expect if you believed the answer to your question were about to be revealed to you?

In Belief in Belief, a man acts as if there is a dragon in his garage, but every time his neighbor comes up with an idea to test it, he has a reason why the test wouldn't work. If he imagined that Omega (the superintelligence who is always right) offered to reveal the answer to him, he might realize he was expecting Omega to reveal the answer "No, there's no dragon". At the very least, he might realize he was worried that Omega would reveal this, and so re-think exactly how certain he was about the dragon issue.

This is a simple technique with relatively few pitfalls.

## Bet on It

At what odds would you be willing to bet on a proposition?

Suppose someone offers you a bet at even odds that Obama will be re-elected. Would you take it? What about two-to-one odds? Ten-to-one? In theory, the knowledge that money is at stake should make you consider the problem in "near mode" and maximize your chances of winning.

The problem with this method is that it only works when utility is linear with respect to money and you're not risk-averse. In the simplest case, I should be indifferent to a \$100,000 bet at 50% odds that a fair coin will come up tails, but in fact I would refuse it; winning \$100,000 would be moderately good, but losing \$100,000 would put me deeply in debt and completely screw up my life. When these sorts of considerations become paramount, imagining wagers will tend to give inaccurate results.
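The refusal above is exactly what a risk-averse utility function predicts. A minimal sketch, assuming log utility and a hypothetical net worth (both are illustrative choices, not claims about anyone's actual utility function):

```python
import math

def expected_log_utility(wealth, stake, p_win):
    # Log utility: a standard illustrative model of risk aversion.
    return p_win * math.log(wealth + stake) + (1 - p_win) * math.log(wealth - stake)

wealth = 120_000  # hypothetical net worth
baseline = math.log(wealth)           # utility of not betting
bet = expected_log_utility(wealth, 100_000, 0.5)
print(bet < baseline)  # True: a fair $100,000 coin flip lowers expected utility
```

With stakes small relative to wealth the same formula is nearly linear, which is why small bets avoid this distortion.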

## Convert to a Frequency

How many situations would it take before you expected an event to occur?

Suppose you need to give a probability that the sun will rise tomorrow. "999,999 in a million" doesn't immediately sound wrong; the sun seems likely to rise, and a million is a very high number. But if tomorrow is an average day, then your probability is linked to the number of days you expect to pass before the sun fails to rise at least once. A million days is about three thousand years; the Earth has existed far longer than three thousand years without the sun failing to rise. Therefore, 999,999 in a million is too low a probability for this occurrence. If you think the sort of astronomical event that might prevent the sun from rising happens only once every three billion years, then you might consider a probability more like 999,999,999,999 in a trillion.
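The conversion here is just the mean of a geometric distribution: if the sun fails to rise with probability p on each independent day, the expected wait until the first failure is 1/p days. A one-function sketch:

```python
# If an event occurs with probability p per trial (independently),
# the expected number of trials until it first occurs is 1/p.
def expected_trials(p_failure):
    return 1 / p_failure

p = 1 / 1_000_000            # "999,999 in a million" chance of sunrise
days = expected_trials(p)    # one failure expected per million days
print(days / 365.25)         # roughly 2,700 years -- far too short a horizon
```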

In addition to converting to a frequency across time, you can also convert to a frequency across places or people. What's the probability that you will be murdered tomorrow? The best guess would come from checking the murder rate for your area. What's the probability there will be a major fire in your city this year? Check how many cities per year have major fires.

This method fails if your case is not typical: for example, if your city is on the losing side of a war against an enemy known to use fire-bombing, the probability of a fire there has nothing to do with the average probability across cities. And if you think the reason the sun might not rise is a supervillain building a high-tech sun-destroying machine, then consistent sunrises over the past three thousand years of low technology will provide little consolation.

A special case of the above failure is converting to frequency across time when considering an event that is known to take place at a certain distance from the present. For example, if today is April 10th, then the probability that we hold a Christmas celebration tomorrow is much lower than the 1/365 you get by checking what percentage of days we celebrate Christmas. In the same way, although we know that the sun will fail to rise in a few billion years when it burns out its nuclear fuel, this shouldn't affect its chance of rising tomorrow.

## Find a Reference Class

How often have similar statements been true?

What is the probability that the latest crisis in Korea escalates to a full-blown war? If there have been twenty crisis-level standoffs on the Korean peninsula in the past 60 years, and only one of them has resulted in a major war, then P(war|crisis) = 0.05, so long as this crisis is equivalent to the twenty crises you're using as your reference class.
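The arithmetic is a one-liner; and since one war in twenty crises is a small sample, one might also hedge with Laplace's rule of succession, which is my addition here rather than something the post proposes:

```python
def frequency_estimate(successes, trials):
    # Raw reference-class frequency.
    return successes / trials

def laplace_estimate(successes, trials):
    # Laplace's rule of succession: (k + 1) / (n + 2). It avoids assigning
    # probability exactly 0 or 1 on the basis of a small sample.
    return (successes + 1) / (trials + 2)

print(frequency_estimate(1, 20))  # 0.05
print(laplace_estimate(1, 20))    # ~0.091
```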

But finding the reference class is itself a hard problem. What is the probability Bigfoot exists? If one builds a reference class by saying that the yeti doesn't exist, the Loch Ness monster doesn't exist, and so on, then the Bigfoot partisan might accuse you of assuming the conclusion; after all, the likelihood of these creatures existing is probably similar to, and correlated with, Bigfoot's. The partisan might suggest asking how many creatures previously believed not to exist later turned out to exist (a list which includes real animals like the orangutan and platypus), but then one will have to debate whether to include creatures like dragons, orcs, and Pokemon on the list.

This works best when the reference class is more obvious, as in the Korea example.

## Make Multiple Statements

How many statements of about the same uncertainty as a given statement could you make without being wrong once?

Suppose you believe France is larger than Italy. With what confidence should you believe it? If you made ten similar statements (Germany is larger than Austria, Britain is larger than Ireland, Spain is larger than Portugal, et cetera), how many times do you think you would be wrong? What about a hundred similar statements? If you think you'd be wrong only one time out of a hundred, you can give the statement 99% confidence.

This is the most controversial probability assessment technique; it tends to give lower levels of confidence than the others. For example, Eliezer wants to say there's a less than one in a million chance the LHC would destroy the world, but doubts he could make a million similar statements and be wrong only once. Komponisto thinks this is a failure of imagination: we imagine ourselves gradually growing tired and making mistakes, whereas the method only works if the accuracy of the millionth statement is exactly the same as the first.

In any case, the technique is only as good as the ability to judge which statements are equally difficult to a given statement. If I start saying things like "Russia is larger than Vatican City! Canada is larger than a speck of dust!" then I may get all the statements right, but it won't mean much for my Italy-France example; and if I get bogged down in difficult questions like "Burundi is larger than Equatorial Guinea" then I might end up underconfident. In cases where there is an obvious comparison ("Bob didn't cheat on his test", "Sue didn't cheat on her test", "Alice didn't cheat on her test") this problem disappears somewhat.

## Imagine Hypothetical Evidence

Suppose one day all the religious people and all the atheists get tired of arguing and decide to settle the matter by experiment once and for all. The plan is to roll an n-sided numbered die and have the faithful of all religions pray for the die to land on "1". The experiment will be done once, with great pomp and ceremony, and never repeated, lest the losers try for a better result. All the resources of the world's skeptics and security forces will be deployed to prevent any tampering with the die, and we assume their success is guaranteed.

If the experimenters used a twenty-sided die, and the die came up 1, would this convince you that God probably did it, or would you dismiss the result as a coincidence? What about a hundred-sided die? Million-sided? If a successful result on a hundred-sided die wouldn't convince you, your probability of God's existence must be less than one in a hundred; if a million-sided die would convince you, it must be more than one in a million.
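The implicit Bayesian update can be sketched as follows, under the simplifying assumption that a successful prayer would guarantee a 1 if God exists, versus a 1/n chance by luck:

```python
def posterior(prior, n_sides):
    # P(God | die lands on 1), assuming P(1 | God) = 1 and
    # P(1 | no God) = 1/n_sides.
    p_evidence = prior * 1.0 + (1 - prior) * (1 / n_sides)
    return prior * 1.0 / p_evidence

# With an illustrative prior of one in a thousand:
print(posterior(0.001, 100))        # ~0.09: suggestive but not conclusive
print(posterior(0.001, 1_000_000))  # ~0.999: near-conclusive
```

The crossover behavior is what the thought experiment exploits: the die size at which the result would start to convince you brackets your prior.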

This technique has also been denounced as inaccurate, on the grounds that our coincidence detectors are overactive and therefore in no state to be calibrating anything else. It would feel very hard to dismiss a successful result on a thousand-sided die, no matter how low the probability of God is. It might also be difficult to visualize a hypothetical where the experiment can't possibly be rigged, and it may be unfair to force subjects to imagine a hypothetical that would practically never happen (like the million-sided die landing on one in a world where God doesn't exist).

These techniques should be experimentally testable; any disagreement over which do or do not work (at least for a specific individual) can be resolved by going through a list of difficult questions, declaring confidence levels, and scoring the results with log odds. Steven's blog has some good sets of test questions (which I deliberately do not link here so as not to contaminate a possible pool of test subjects); if many people are interested in participating and there's a general consensus that an experiment would be useful, we can try to design one.
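Scoring with log odds can be sketched with the logarithmic scoring rule: your score for each question is the log of the probability you assigned to what actually happened (higher, i.e. closer to zero, is better):

```python
import math

def log_score(p_assigned, outcome):
    # Logarithmic scoring rule for a binary prediction: reward is
    # ln(probability assigned to the actual outcome).
    return math.log(p_assigned if outcome else 1 - p_assigned)

# A confident correct answer beats a hedged one, but a confident
# wrong answer is punished severely:
print(log_score(0.9, True))   # ~ -0.105
print(log_score(0.6, True))   # ~ -0.511
print(log_score(0.9, False))  # ~ -2.303
```

Summing this score over a question list is one simple way to run the experiment the paragraph describes.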

• From Spetzler and Stael von Holstein (1975), there is a variation of Bet on It that doesn't require risk neutrality.

Say we are going to flip a thumbtack, and it can land heads (so you can see the head of the tack) or tails (so that the point sticks up like a tail). If we want to assess your probability of heads, we can construct two deals.

Deal 1: You win \$10,000 if we flip a thumbtack and it comes up heads (\$0 otherwise; you won't lose anything). Deal 2: You win \$10,000 if we spin a roulette-like wheel labeled with the numbers 1, 2, 3, ..., 100, and the wheel comes up between 1 and 50 (\$0 otherwise; you won't lose anything).

Which deal would you prefer? If you prefer deal 1, then you are assessing a probability of heads greater than 50%; otherwise, you are assessing a probability of heads less than 50%.

Then ask the question many times, using a different number than 50 for deal 2. For example, if you first say you would prefer deal 2, then change it to winning on 1-25 instead, and see if you still prefer deal 2. Keep adjusting until you are indifferent between deals 1 and 2. If you are indifferent between the two deals when deal 2 wins on 1-37, then you have assessed a probability of 37%.
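The adjustment procedure is essentially a binary search for the indifference point. A sketch, where `prefers_deal_1` is a hypothetical stand-in for actually asking the subject:

```python
def assess_probability(prefers_deal_1, lo=0.0, hi=1.0, tol=0.005):
    # Binary search for the wheel setting at which the subject is
    # indifferent between the thumbtack deal and the wheel deal.
    # prefers_deal_1(q) stands in for asking: "do you prefer deal 1
    # when the wheel wins with probability q?"
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if prefers_deal_1(mid):
            lo = mid   # subject thinks heads is likelier than mid
        else:
            hi = mid   # subject thinks heads is less likely than mid
    return (lo + hi) / 2

# Simulated subject whose hidden probability of heads is 37%:
subject = lambda q: 0.37 > q
print(round(assess_probability(subject), 2))  # 0.37
```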

The above describes one procedure used by professional decision analysts; they usually use a physical wheel with a "winning area" that is continuously adjustable, rather than using numbers as above.

• Who are "professional decision analysts"? Where do they come from, and who are their clients/employers? Do they go by any other names? This sounds fascinating.

• A major problem with these approaches is that for the majority of real-life questions, the circuits in your brain that are best capable of analyzing the situation and giving you an answer along with a vague feeling of certainty are altogether different from those that you can use to run these heuristics. This is why, in my opinion, attempts to assign numerical probabilities to common-sense judgments usually don't make sense.

If your brain has the ability to make a common-sense judgment about some real-world phenomenon, this ability will typically be implemented in the form of a black-box module that will output the answer along with some coarsely graded intuitive feeling of certainty. You cannot open this black box and analyze its algorithms in order to upgrade this vague feeling into a precise numerical probability estimate. If you instead use heuristics that yield numerical probabilities, such as finding reference classes, this means side-stepping your black-box module and using an altogether different algorithm instead, and the probability estimate you'll arrive at this way won't be pertinent to your best analysis, which uses the black-box intuition module.

• Surely you can teach yourself to compare intuitive certainty to probabilities, though. I mean, if you come up with rough labels for levels of intuitive certainty, and record how often each label is right or wrong, you'd already get a really rough corresponding probability.

Edit: Oh, this is PredictionBook's raison d'être.

• Inspired by your final paragraph, I sought out a variety of test questions on the web, both on Steven's blog and elsewhere. I was expecting systematic overconfidence, with a smaller chance of systematic underconfidence, throughout the probability spectrum.

Instead I found a very interesting pattern.

When I was 90% or 95% certain of a fact, I was slightly overconfident. My 90% estimates shook out at about 80%, and my 95% estimates shook out around 90%. When I was completely uncertain of a fact, I was also slightly overconfident, but within the realm of experimental error.

But when I was just 50% confident of a fact, I was almost always wrong. Far more often than anyone could achieve by random guessing: my wrongness was thorough and integrated and systematic.

Clearly, that feeling of slight concern which I've always interpreted as "I think I remember X, but it could go either way" actually means something closer to "X is not true; my beliefs are inconsistent."

If I'm sure I know something, I probably do. If I'm sure I'm clueless, I probably am. But if I think I might know something, then I almost certainly have it backwards.

Is this a common bias which I should have read about by now?

• Interesting!

By the way, HTML tags don't work here; click "Help" to the lower right of the edit window to see the markup syntax rules.

• Thanks; edited for proper italics.

• I think most of these have the same limitation. When the numbers are too big, such as 1,000 comparable cases or a 1 in 1,000 chance, the human brain cannot intuitively grasp what to do. We are really only optimized for things in a central range (and, obviously, not even that under many circumstances). Rarer events, at least ones that do not occur in aggregate, do not produce sufficient optimization pressures. At some point, all the hard parts must be reduced to purely mathematical questions. If you can actually think of 10 corresponding situations, or remember the average of 100 past situations, you can use that, but picturing yourself dealing with 10,000 of something does not feel very different from picturing 100,000.

• That's definitely a problem; a million is a statistic. I think we can try to work around it in some cases, though. You mentioned the numbers 10,000 and 100,000; one might convert these into a car and a house, respectively, by estimating costs. By interpreting such large numbers in terms of these real concepts, we get a more concrete sense of the difference between them. You can then think of the issue in terms of how often you use the car vs. the house, or even how much time you're going to spend paying them off. That reduces the difference to something manageable. Obviously, this won't work in all cases, and the weight or cost of even a real concept can vary based on the person and their location (spatial and temporal), but it can be worth trying.

For another example, consider the way people sometimes talk about government budgets. Someone might be outraged at \$100 million going to a certain area, out of an overall budget of \$50 billion. "Million" and "billion" are usually processed by our brains as just "big," so we focus on the 100 and the 50, and 100 is bigger than 50, so... outrage! But if we divide by a million, we have \$100 (a new cell phone) vs. \$50,000 (a year of college tuition, or an expensive car). The difference is much clearer.

• A technique I use to get around this problem is to think in terms of orders of magnitude. What you can do is ask yourself (for example) about being in ten corresponding situations, then ask yourself about that (i.e., the set of ten situations) happening ten times, then about that happening ten times. This is also, with a little practice, an effective way to develop a visceral (and accordingly mind-blowing) sense of cosmic/microscopic scales, long periods of time, and so forth; cf. the Powers of Ten video.

• I just found this with Google. I spent much time in 2005-2007 trying to get experts to assign a subjective probability to a severe (H5N1) pandemic with >100M deaths. This was a strange experience. Experts didn't want to give probabilities, but painted a somewhat dark picture in interviews. Economists ignored the problem in their models (mortality bond ratings). Among the few who gave estimates were Bob Gleeson and Michael Steele, with ~20% per year. The same problem occurs in other fields: ask your surgeon for the probability that you'll die, or your lawyer for the probability of winning the case, or your teacher for the probability that you'll pass the exam, or the candidate for his probability of winning the election, or the president for his probability of a nuclear war or global recession, etc. These would be useful pieces of information, even if only subjective and informal. Yet people usually won't give them. Make a better society with people giving probability estimates!

• Nice. Much of this has been covered implicitly in various comments, but having it all in one place is lovely.

Something that is perhaps obvious but seems worth saying explicitly is that these techniques aren't mutually exclusive, and using several of them can help contrast their relative weaknesses and strengths.

For example, if I want to put a number on my estimate that I will die tomorrow (D), I can prepare for revelation and observe that, if I imagine Omega appearing and offering to tell me whether D, I'd become very anxious. So I might conclude that my estimated probability of D is significant enough to worry about... on the order of 10%, say.

But I can also look at the reference class of people like me and see how many of them die on any given day. Ideally I'd look up my age and other factors on an actuarial table, but I don't feel like bothering, so instead I take the less reliable reference class of humans... ~150,000 deaths per day out of 6.8 billion people is something on the order of 2e-5.

And I can also convert to a frequency... I've lived ~15,000 days without dying, which gets me ~7e-5.
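Both rough estimates reduce to one-line calculations (same round, approximate figures as above):

```python
# Reference class of humans: rough worldwide deaths per day over population.
deaths_per_day = 150_000
population = 6_800_000_000
print(deaths_per_day / population)  # ~2.2e-05

# Frequency across my own days: one death in ~15,000 observed days.
days_lived = 15_000
print(1 / days_lived)               # ~6.7e-05
```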

And now I can compare these techniques. [EDIT: I mean, compare them in the specific context of this question.]

Converting to a frequency has the bizarre property that my dying becomes less and less likely as I get older, when common sense tells me the opposite is true. The reference class of humans probably has the same property, but to a lesser degree. The actuarial reference class doesn't have this property, which is a point in its favor (no surprise there). Preparing for revelation relies on some very unreliable intuitions. And so forth.

(I conclude, incidentally, that I have a poorly calibrated fear of dying. This is unsurprising; after my recent stroke it was a major post-traumatic problem. I've worked it down to a tolerable level, but it does not surprise me at all that it's still orders of magnitude higher than it should be.)

• My largest problem with this post comes right at the beginning:

"humans are not utility maximizers and often refuse to give precise numerical probabilities. Nevertheless, their actions reflect a 'hidden' probability."

"although a mind 'hosts' a probability estimate, that mind does not arbitrarily determine the estimate, but rather calculates it according to mathematical laws from available evidence."

If our basic brain processes are irrational, there is no reason why there has to be a well-defined probability in there somewhere. You might extract a probability by some method or another, but you might also extract method-dependent gobbledygook.

• Congrats, this is an excellent post. Why hasn't it been promoted?

I totally use #1 + #2 by imagining betting, and then imagining my reaction when the outcome is revealed. I try to think whether I would say:

• "Shit, I should have bet with lower confidence"

or

• "Wow, that is truly surprising"

I would just add a note that assigning really small probabilities is sometimes necessary but often fraught with danger. For example, I would not bet my life that the sun will rise tomorrow in exchange for \$1, even though my life is not worth as much to me as 1 trillion times the utility of \$1.

• I would take that bet, but only because an event that caused the sun not to rise tomorrow would almost certainly kill me anyway.

• I'm looking forward to using this kind of reasoning to profit off end-of-the-worlders in late 2012.

Well, that kind of reasoning and just my run-of-the-mill "no, I don't think the world is ending" reasoning.

• Actually, it's not analogous, because you don't have any non-zero-sumness with the doomsday bettor beyond that which is always present when two parties have differing predictions.

Imagine two bettors who each try to maximize U_expected = P(e)U(e) + (1-P(e))U(~e).

Typically the bettors have the same U(e) and U(~e), and only disagree on P(e).

If you analyze the doomsday bet with e = "the world ends", then it's just a standard bet situation, because both bettors set U(the world ends) = 0.

If you analyze the doomsday bet with e = "whatever happens in the year 2013", then it's seemingly unusual in that both bettors set P(e) to the same value (1), but it's really not unusual, because you can factor their respective probabilities of doomsday out of their U(e) values.

So why isn't my/WrongBot's bet analogous? Let's say Omega offered me \$1 today in exchange for getting to kill me if the sun doesn't rise tomorrow. Let e = "sun doesn't rise tomorrow".

My bet with Omega has two properties that are not true of a typical zero-sum bet:

1. Since my U(e) is 0, Omega's U(e) must be positive for it to make that bet. Any time there's a contract that the parties enter into because of differing U(e) values, and the U(e) difference doesn't factor into a P(e_subevent) like in the 2012 doomsday bet, the contract is not so much a bet as a non-zero-sum trade.

2. I'd bet on ~e regardless of how high my P(e) is, because there's no P(e) that can make P(e)U(e) + (1-P(e))U(~e) < 0 for me. That's a general property of contracts which are guaranteed to make my life better than before, i.e., non-zero-sum trades.

• I have to admit... I'm mostly confused by this comment. Not by the math, but by exactly what you're getting at/disagreeing with.

If you're just saying that the doomsday scenario isn't perfectly analogous to the Omega scenario, I accept this, and never meant to imply that it was. I was only pointing out that the "if I lose I'll be dead anyway" general type of reasoning could be applied to the other situation (and not necessarily through explicitly betting against the other party). If you're saying that it couldn't, then I confess that I still don't understand why from your comment.

• I was only pointing out that the "if I lose I'll be dead anyway" general type of reasoning could be applied to the other situation (and not necessarily through explicitly betting against the other party).

My point is that actually, you don't get any extra expected value from the doomsayer's "if I lose I'll be dead anyway" reasoning. You get exactly as much expected value from them as you would get from anyone with any kind of prediction whose accuracy is lower than your own by the same amount.

In contrast, WrongBot did get to capitalize on a special "if I lose I'm dead" property of his bet, and my previous post details the important properties that make WrongBot's bet atypical (properties that your own bet does not have).

• Ah, I see then where we miscommunicated. I meant that I, not he, would be applying that reasoning. I strongly anticipate not being dead, and for the purposes of this bet (and only for this bet) don't care if I'm wrong about it. He would strongly anticipate being dead, and might therefore neglect the possibility that he'll have to suffer the consequences of whatever we're doing. My losing the bet is "protected" (in a rather dreary way); his isn't.

Obviously, I haven't worked out the details, and probably won't actually go around taking advantage of these people, but it occurred to me the other day while I was pondering how one should almost always be able to turn better-calibrated expectations into utility.

• Obviously, I haven't worked out the details, and probably won't actually go around taking advantage of these people

Hey, they'd be happy enough to still be alive, and you could donate the proceeds to eradicating polio. But unfortunately you'd also be encouraging people to take existential threats less seriously in general, which may be a bad idea. I can't decide.

Anyway, good luck finding a believer in any kind of woo who is prepared to make a cash wager on a testable outcome. Think how quickly we would have eradicated homeopathy and astrology by now! :)

• Regarding betting, one way of handling the issue of diminishing marginal returns on money is to use bets that are small compared to your net worth. When the bet size is small, the utility should be close to linear. This doesn't work perfectly, but it does help reduce the problem somewhat. Of course, this only works for issues where the expected probability is not extreme (by the time one gets to more than 10 to 1, this starts to break down).

• This doesn't work great even if you deal with moderate probabilities, because you need high fractions of net worth to get people to stop signaling... if I am a Yankees fan who earns \$50,000 a year, I will bet \$10 at even odds that the Yankees will win, even if my available data would only predict a 40% chance for the Yankees to win. The expected loss of \$1 doesn't even come close to the expected loss of appearing not to love the pinstriped sluggers with all my heart.

• Mass_Driver:

This doesn't work great even if you deal with moderate probabilities, because you need high fractions of net worth to get people to stop signaling...

Yes, and there's also the issue of transaction costs. Especially since transaction costs are basically the opportunity costs of the time spent arranging the bet and the payment, and for people with higher net worth, these opportunity costs are typically also higher.

• I wonder if there's a reasonably straightforward way to find a function for yourself such that your subjective utility is roughly linear over f(\$). That'd make the betting approach a lot more widely applicable.

log((net worth + payoff) / net worth) times some constant seems like a good start, but I already see some possible flaws.
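A quick sketch of that candidate function, checking that it behaves linearly for small payoffs and sub-linearly for large ones (the net worth figure is arbitrary, chosen only for illustration):

```python
import math

def utility_change(net_worth, payoff):
    # The suggested form: log((net worth + payoff) / net worth).
    # For payoffs small relative to net worth this is ~payoff/net_worth,
    # i.e. nearly linear; for large payoffs it flattens out.
    return math.log((net_worth + payoff) / net_worth)

w = 50_000  # arbitrary illustrative net worth
print(utility_change(w, 100) / utility_change(w, 50))        # ~2.0: near-linear
print(utility_change(w, 50_000) / utility_change(w, 25_000))  # <2: diminishing
```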

• I believe there is a straightforward way: consider bets on events with known probability!

• I think you could perform the die-rolling experiment without any need for security against tampering. To generate a random number from 0 to N-1, have every interested party generate their own number (roll their own die), then everybody reveals their numbers together, and the group adds them all up and takes the remainder after dividing by N.

With that procedure, everybody should be convinced that the result is at least as random as their own number.
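As a sketch, the whole protocol is one line of arithmetic: the sum mod N is uniform as long as at least one contribution is uniform and independent of the rest (the contributions below are made up):

```python
def group_roll(contributions, n):
    # Each party contributes a number in [0, n); the shared result is
    # the sum mod n. If any single contribution is uniformly random and
    # independent of the others, the result is uniformly random.
    return sum(contributions) % n

# Three parties jointly generating a result in 0..19:
print(group_roll([13, 7, 19], 20))  # 19
```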

• then everybody reveals their numbers together

That is not so easily done.

• It is if you use a commitment scheme. Such a thing allows you to commit to a value before revealing it. So you go in two steps: everybody commits, then everybody reveals. Nobody can change their value after committing, so nobody can base their values on others' values.
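A minimal sketch of such a scheme, hashing the value with a random salt via SHA-256 (one simple construction for illustration, not a vetted protocol):

```python
import hashlib
import secrets

def commit(value):
    # Publish the digest now; keep (salt, value) secret until reveal.
    # The random salt blocks a dictionary attack over the tiny space
    # of possible die rolls.
    salt = secrets.token_hex(16)
    digest = hashlib.sha256(f"{salt}:{value}".encode()).hexdigest()
    return digest, (salt, value)

def verify(digest, salt, value):
    return hashlib.sha256(f"{salt}:{value}".encode()).hexdigest() == digest

digest, (salt, value) = commit(4)    # commit phase: publish digest
print(verify(digest, salt, value))   # True: honest reveal checks out
print(verify(digest, salt, 5))       # False: can't claim a different roll
```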

• A com­mit­ment scheme sounds like “se­cu­rity against tam­per­ing”.

• But there’s no para­noia in­volved. It’s cryp­to­graph­i­cally quite sim­ple. All you need is a hash func­tion.

Con­trast with all of the gov­ern­ments and all of their se­cu­rity agents and such and no­body re­ally trusts that it’s se­cure.

• All you need is a hash func­tion.

A hash func­tion on a die roll is quite vuln­er­a­ble to a dic­tio­nary at­tack. You could add a salt, but this makes hash col­li­sions eas­ier to take ad­van­tage of.

• You wouldn’t use a hash func­tion that peo­ple could gen­er­ate col­li­sions with, any more than you would use ROT-13.

• Of course a salt. Not sure why that would make hash collisions easier to take advantage of, though. Presumably you use a good hash function.

• The point is there are people who would not realize that you need a salt, or a hash function not vulnerable to collisions. Yes, there are existing solutions for this problem, but even choosing an existing solution from the space of security solutions to different problems is not trivial.

• Why does “some people don’t know how this works” make it less trivial?

• This provides an excellent demonstration of E. T. Jaynes’s point that making something more random really means making it more complicated.

• The point isn’t to make it more random, the point is to make it more trustworthy. You can participate in the process and be confident that the result is random without having to put any trust in the other participants.

• Regardless, your original language claiming it made it more random was correct, because it does make the result harder to predict while keeping a clear symmetry, a.k.a. random.

• Thanks, nice article! I think the calibration of our probability-estimation organs is one of the topics that deserves a lot more attention on LW.

• This isn’t hugely relevant to the post, but LessWrong doesn’t really provide a means for a time-sensitive link dump, and it seems a shame to miss the opportunity to promote an excellent site over a slight lack of functionality.

For any cricket fans who have been enjoying the Ashes, here is a very readable description of Bayesian statistics applied to cricket batting averages.

• That seems like a good thing to post in Discussion. Also, to the extent that it’s about the math more than the particular matches, it isn’t all that time sensitive.

• I’d be more interested to know what LW thinks of creating a probability distribution for a continuous outcome. This seems to be cumbersome with all of the above tools, which I’ll admit are quite helpful for binary events; but when you’re purchasing a new computer, what’s relevant is how many months the computer will last before breaking, not whether it breaks in the first two years.

If taken to inanity, one could construct a large number of binary outcomes and try to smash them together to get a probability distribution for a continuous variable. But that’s pretty annoying; surely there are better ways.

• For this, I would use the ‘smash-together’ method. “How many months have contained an experience of a computer breaking on me?” over “How many months have I owned computers?” will give me the probability of the computer breaking in any given month, and then the graph y = (1 − pr(break))^x represents the continuous variable “my computer is not broken”. This takes about five minutes: it’s worth it for cars, computers, homes, smartphones, etc. But you’re right, too annoying for smaller cases.
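A sketch of that estimate (the counts of months are hypothetical, and independence of months is an assumption):

```python
def monthly_break_prob(months_with_break, months_owned):
    """Empirical per-month probability of a computer breaking."""
    return months_with_break / months_owned

def prob_still_working(p_break, months):
    """P(computer survives `months` months), treating each month as an
    independent trial with the same break probability."""
    return (1 - p_break) ** months

# Say 3 of 120 computer-owning months contained a breakage:
p = monthly_break_prob(3, 120)           # 0.025 per month
survival_2y = prob_still_working(p, 24)  # chance of surviving two years
```

Evaluating the curve at every month of interest turns the binary per-month estimate into the continuous survival distribution the parent comment asked about.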

• That’s actually a pretty good idea, shokwave. Thanks!

“The plan is to roll an n-sided numbered die and have the faithful of all religions pray for the die to land on ‘1’.”

Elijah did this; from what I can tell, it was insufficient to end disagreement :) Read 1 Kings 18 (alternatively, Wikipedia).

• I don’t think experiments performed prior to the invention of video recording ought to count.

• The god of the one true religion will refuse to intervene, to punish its believers for cooperating with the members of all the other religions in the experiment.

• I’d bet good money the god of the one true religion is permeable to flour too.

• Well, if I were omnipotent, I’m sure I’d use my powers to avoid getting covered with flour.

• Upvoted for precise use of anthropomorphic bias (less certain: to highlight the common anthropocentric distortions of God concepts).

• Prediction: The die comes up 666.

Confidence: If N > 665, slightly higher than 1/N due to ironic gods. If N < 666, 0.

• When you edit this comment, click the “Help” link to the lower right of the text box for more information on the Markdown syntax. It doesn’t accept HTML, alas.

• I think your link to “Near Mode” is supposed to be a link to “Far Mode”, and that you meant to link the following phrase to “Near Mode”.

• The link redirected to “near/far thinking”, but I’ve changed it to reflect that.