Probability, knowledge, and meta-probability

This article is the first in a sequence that will consider situations where probability estimates are not, by themselves, adequate to make rational decisions. This one introduces a “meta-probability” approach, borrowed from E. T. Jaynes, and uses it to analyze a gambling problem. This situation is one in which reasonably straightforward decision-theoretic methods suffice. Later articles introduce increasingly problematic cases.

A surprising decision anomaly

Let’s say I’ve recruited you as a subject in my thought experiment. I show you three cubical plastic boxes, about eight inches on a side. There are two green ones—identical as far as you can see—and a brown one. I explain that they are gambling machines: each has a faceplate with a slot that accepts a dollar coin, and an output slot that will return either two or zero dollars.

I unscrew the faceplates to show you the mechanisms inside. They are quite simple. When you put a coin in, a wheel spins. It has a hundred holes around the rim. Each can be blocked, or not, with a teeny rubber plug. When the wheel slows to a halt, a sensor checks the nearest hole, and dispenses either zero or two coins.

The brown box has 45 holes open, so it has probability p=0.45 of returning two coins. One green box has 90 holes open (p=0.9) and the other has none (p=0). I let you experiment with the boxes until you are satisfied these probabilities are accurate (or very nearly so).

Then, I screw the faceplates back on, and put all the boxes in a black cloth sack with an elastic closure. I squidge the sack around, to mix up the boxes inside, and you reach in and pull one out at random.

I give you a hundred one-dollar coins. You can put as many into the box as you like. You can keep as many coins as you don’t gamble, plus whatever comes out of the box.

If you pulled out the brown box, there’s a 45% chance of getting $2 back, and the expected value of putting a dollar in is $0.90. Rationally, you should keep the hundred coins I gave you, and not gamble.

If you pulled out a green box, there’s a 50% chance that it’s the one that pays two dollars 90% of the time, and a 50% chance that it’s the one that never pays out. So, overall, there’s a 45% chance of getting $2 back.
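The arithmetic in the last two paragraphs can be checked directly (a minimal sketch; the $2 payout and $1 stake are from the setup above):

```python
# Expected net gain from feeding one coin into a box with payout probability p.
def expected_gain(p_payout, payout=2.0, cost=1.0):
    return p_payout * payout - cost

brown = expected_gain(0.45)                          # brown box: p = 0.45
green_first = expected_gain(0.5 * 0.9 + 0.5 * 0.0)   # first coin, random green box

print(round(brown, 2))        # each coin loses ten cents on average
print(round(green_first, 2))  # identical expectation on the first coin
```

Both come out at −$0.10 per coin, which is why the single-number view cannot tell the two situations apart.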

Still, rationally, you should put some coins in the box. If it pays out at least once, you should gamble all the coins I gave you, because you know that you got the 90% box, and you’ll nearly double your money.

If you get nothing out after a few tries, you’ve probably got the never-pay box, and you should hold onto the rest of your money. (Exercise for readers: how many no-payouts in a row should you accept before quitting?)
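A start on the exercise (a sketch, not the full answer, which must also weigh the value of the information another try buys): Bayes’ rule gives the chance you hold the 90% box after k straight non-payouts.

```python
def p_good_box(k):
    """P(we hold the 90% box | k non-payouts in a row), with a 50/50 prior.
    The 90% box fails to pay with probability 0.1; the dead box always fails."""
    good = 0.5 * 0.1 ** k   # prior * likelihood of k failures from the 90% box
    dead = 0.5 * 1.0        # prior * likelihood of k failures from the dead box
    return good / (good + dead)

for k in range(4):
    print(k, round(p_good_box(k), 4))
```

The posterior collapses very fast: after a single failure it is already about 9%, and after two it is under 1%.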

What’s interesting is that, when you have to decide whether or not to gamble your first coin, the probability is exactly the same in the two cases (p=0.45 of a $2 payout). However, the rational course of action is different. What’s up with that?

Here, a single probability value fails to capture everything you know about an uncertain event. And, it’s a case in which that failure matters.

Such limitations have been recognized almost since the beginning of probability theory. Dozens of solutions have been proposed. In the rest of this article, I’ll explore one. In subsequent articles, I’ll look at the problem more generally.


Meta-probability

To think about the green box, we have to reason about the probabilities of probabilities. We could call this meta-probability, although that’s not a standard term. Let’s develop a method for it.

Pull a penny out of your pocket. If you flip it, what’s the probability it will come up heads? 0.5. Are you sure? Pretty darn sure.

What’s the probability that my local junior high school sportsball team will win its next game? I haven’t a ghost of a clue. I don’t know anything even about professional sportsball, and certainly nothing about “my” team. In a match between two teams, I’d have to say the probability is 0.5.

My girlfriend asked me today: “Do you think Raley’s will have dolmades?” Raley’s is our local supermarket. “I don’t know,” I said. “I guess it’s about 50/50.” But unlike sportsball, I know something about supermarkets. A fancy Whole Foods is very likely to have dolmades; a 7-11 almost certainly won’t; Raley’s is somewhere in between.

How can we model these three cases? One way is by assigning probabilities to each possible probability between 0 and 1. In the case of a coin flip, 0.5 is much more probable than any other probability:

Tight Gaussian centered around 0.5

We can’t be absolutely sure the probability is 0.5. In fact, it’s almost certainly not exactly that, because coins aren’t perfectly symmetrical. And, there’s a very small probability that you’ve been given a tricky penny that comes up tails only 10% of the time. So I’ve illustrated this with a tight Gaussian centered around 0.5.

In the sportsball case, I have no clue what the odds are. They might be anything between 0 and 1:

Flat line from 0 to 1

In the Raley’s case, I have some knowledge, and extremely high and extremely low probabilities seem unlikely. So the curve looks something like this:

Wide Gaussian centered on 0.5

Each of these curves averages to a probability of 0.5, but they express different degrees of confidence in that probability.
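One standard way to encode curves like these (my choice of parameterization here, not something the article commits to) is a Beta distribution over p, where larger parameters mean a tighter curve:

```python
import math

def beta_mean_sd(a, b):
    """Mean and standard deviation of a Beta(a, b) distribution over p."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, math.sqrt(var)

# Three curves with the same mean 0.5 but very different confidence
# (the parameter values are illustrative choices):
for label, a, b in [("coin", 500, 500), ("sportsball", 1, 1), ("Raley's", 5, 5)]:
    mean, sd = beta_mean_sd(a, b)
    print(label, round(mean, 3), round(sd, 3))
```

Beta(1, 1) is exactly the flat line; large equal parameters give the tight Gaussian-looking hump; modest equal parameters give the wide hump.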

Now let’s consider the gambling machines in my thought experiment. The brown box has a curve like this:

Tight Gaussian around 0.45

Whereas, when you’ve chosen one of the two green boxes at random, the curve looks like this:

Bimodal distribution with sharp peaks at 0 and 0.9

Both these curves give an average probability of 0.45. However, a rational decision theory has to distinguish between them. Your optimal strategy in the two cases is quite different.
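A quick Monte Carlo makes the divergence concrete (a sketch; the three-probe strategy and trial count are my illustrative choices, not claimed optimal). With the brown box the rational move is to keep your $100; with the green box, probing and then committing does much better on average:

```python
import random

def play_green_box(coins=100, probes=3, rng=random):
    """One trial: draw a green box at random, probe up to `probes` coins,
    and on the first payout gamble every remaining coin once."""
    p = 0.9 if rng.random() < 0.5 else 0.0   # which green box did we pull?
    money = coins
    for _ in range(probes):
        money -= 1
        if rng.random() < p:                 # payout! we must hold the 90% box
            money += 2
            stake, money = money, 0
            for _ in range(stake):           # gamble everything that's left
                if rng.random() < p:
                    money += 2
            break
    return money

rng = random.Random(0)
trials = [play_green_box(rng=rng) for _ in range(20000)]
print(round(sum(trials) / len(trials), 1))   # well above the $100 of not gambling
```

Half the time you lose only the few probe coins; the other half you nearly double your money, so the average ends up far above $100.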

With this framework, we can consider another box—a blue one. It has a fixed payout probability somewhere between 0 and 0.9. I put a random number of plugs in the holes in the spinning disk—leaving between 0 and 90 holes open. I used a noise diode to choose; but you don’t get to see what the odds are. Here the probability-of-probability curve looks rather like this:

Flat line from 0 to 0.9, then zero above

This isn’t quite right, because 0.23 and 0.24 are much more likely than 0.235—the plot should look like a comb—but for strategy choice the difference doesn’t matter.

What is your optimal strategy in this case?

As with the green box, you ought to spend some coins gathering information about what the odds are. If your estimate of the probability is less than 0.5, when you get confident enough in that estimate, you should stop. If you’re confident enough that it’s more than 0.5, you should continue gambling.
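The estimate-as-you-go idea can be sketched as a Bayesian update over the 91 possible plug counts (a sketch under the setup’s assumptions; nothing here is the exact optimal algorithm):

```python
def blue_posterior(wins, losses):
    """Posterior over the blue box's payout probability p = k/100,
    with k (open holes) uniform on 0..90, given observed wins and losses."""
    weights = [(k / 100) ** wins * (1 - k / 100) ** losses for k in range(91)]
    total = sum(weights)
    return [w / total for w in weights]

def posterior_mean(wins, losses):
    return sum((k / 100) * w
               for k, w in enumerate(blue_posterior(wins, losses)))

print(round(posterior_mean(0, 0), 3))   # prior mean: 0.45
print(round(posterior_mean(8, 2), 3))   # eight payouts in ten tries push it up
```

A simple (myopic, not optimal) rule would then be: keep gambling while your posterior mean stays above 0.5 and you remain uncertain, and stop once you are confident it is below.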

If you enjoy this sort of thing, you might like to work out what the exact optimal algorithm is.

In the next article in this sequence, we’ll look at some more complicated and interesting cases.

Further reading

The “meta-probability” approach I’ve taken here is the A_p distribution of E. T. Jaynes. I find it highly intuitive, but it seems to have had almost no influence or application in practice. We’ll see later that it has some problems, which might explain this.

The green and blue boxes are related to “multi-armed bandit problems.” A “one-armed bandit” is a casino slot machine, which has defined odds of payout. A multi-armed bandit is a hypothetical generalization with several arms, each of which may have different, unknown odds. In general, you ought to pull each arm several times, to gain information. The question is: what is the optimal algorithm for deciding which arms to pull how many times, given the payments you have received so far?
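For a taste of the bandit literature, here is a minimal Thompson-sampling sketch (one standard bandit algorithm, not claimed optimal for the boxes above): keep a Beta posterior per arm, sample a plausible payout rate from each, and pull the arm with the highest sample.

```python
import random

def thompson_pulls(true_probs, rounds=5000, rng=None):
    """Thompson sampling with a Beta(wins+1, losses+1) posterior per arm."""
    rng = rng or random.Random(0)
    wins = [0] * len(true_probs)
    losses = [0] * len(true_probs)
    pulls = [0] * len(true_probs)
    for _ in range(rounds):
        # Sample one plausible payout rate per arm, then pull the best-looking arm.
        samples = [rng.betavariate(w + 1, l + 1) for w, l in zip(wins, losses)]
        arm = samples.index(max(samples))
        pulls[arm] += 1
        if rng.random() < true_probs[arm]:
            wins[arm] += 1
        else:
            losses[arm] += 1
    return pulls

print(thompson_pulls([0.45, 0.9, 0.0]))  # the 0.9 arm collects most of the pulls
```

The algorithm naturally spends a few pulls exploring every arm, then concentrates on the one that keeps paying, which is exactly the explore-then-commit behavior the green and blue boxes call for.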

If you read the Wikipedia article and follow some links, you’ll find the concepts you need to find the optimal green and blue box strategies. But it might be more fun to try on your own first! The green box is simple. The blue box is harder, but the same general approach applies.

Wikipedia also has an accidental list of formal approaches for problems where ordinary probability theory fails. This is far from complete, but a good starting point for a browser tab explosion.


Thanks to Rin’dzin Pamo, St. Rev., Matt_Simpson, Kaj_Sotala, and Vaniver for helpful comments on drafts. Of course, they may disagree with my analyses, and aren’t responsible for my mistakes!