Probability space has 2 metrics

A metric is technically defined as a function $d$ from pairs of points to the non-negative reals, with the properties that $d(x,y) = 0 \iff x = y$ and $d(x,y) = d(y,x)$ and $d(x,z) \le d(x,y) + d(y,z)$.

Intuitively, a metric is a way of measuring how similar points are: which points are nearby which others. Probabilities can be represented in several different ways, including the standard range $p \in [0,1]$ and the log odds $l \in (-\infty, \infty)$. They are related by $l = \log\frac{p}{1-p}$ and $p = \frac{e^l}{e^l + 1}$ and $1 - p = \frac{1}{e^l + 1}$ (equations algebraically equivalent).
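The conversion between the two representations is easy to check numerically. A minimal Python sketch (the function names here are illustrative, not from the post):

```python
import math

def log_odds(p):
    """Convert a probability in (0, 1) to log odds."""
    return math.log(p / (1 - p))

def probability(l):
    """Convert log odds back to a probability in (0, 1)."""
    return math.exp(l) / (math.exp(l) + 1)
```

The two functions are inverses of each other, and $p = 0.5$ sits at log odds $0$.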

The two metrics of importance are the Bayesian metric $d_B(p_1, p_2) = |l_1 - l_2|$ (where $l_i$ is the log odds of $p_i$) and the probability metric $d_P(p_1, p_2) = |p_1 - p_2|$.

Suppose you have a prior, $l_0$ in log odds, for some proposition. Suppose you update on some evidence that is twice as likely to appear if the proposition is true, to get a posterior, $l_1$ in log odds. Then $d_B = |l_1 - l_0| = \log 2$. The $d_B$ metric measures how much evidence you need to move between probabilities.
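In code, a Bayesian update is just addition in log odds, which is what makes $d_B$ the natural metric for evidence. A sketch (function names are my own):

```python
import math

def bayes_update(prior_log_odds, likelihood_ratio):
    """Updating on evidence adds the log likelihood ratio to the log odds."""
    return prior_log_odds + math.log(likelihood_ratio)

def d_bayes(l0, l1):
    """Bayesian metric: distance measured in log odds."""
    return abs(l1 - l0)

# Evidence twice as likely if the proposition is true moves ANY prior by
# about log(2), no matter where on the probability line it starts:
weak_prior = math.log(0.01 / 0.99)
strong_prior = math.log(0.9 / 0.1)
# d_bayes(weak_prior, bayes_update(weak_prior, 2)) ~ log(2)
# d_bayes(strong_prior, bayes_update(strong_prior, 2)) ~ log(2)
```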

Suppose you have a choice of actions: the first action will make an event of utility $u$ happen with probability $p_1$; the other will cause the probability of the event to be $p_2$. How much should you care? $|p_1 - p_2| \times u = d_P(p_1, p_2) \times u$.
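The corresponding sketch for the probability metric (again with illustrative names): the expected-utility gap between the two actions is just the probability difference times the utility.

```python
def d_prob(p1, p2):
    """Probability metric: absolute difference of probabilities."""
    return abs(p1 - p2)

def utility_gap(p1, p2, utility):
    """How much the choice between the two actions matters in expectation."""
    return d_prob(p1, p2) * utility

# Under d_P, a 0.01 shift matters the same anywhere on the line:
# utility_gap(0.00, 0.01, 100) and utility_gap(0.46, 0.47, 100)
# are both about 1 unit of utility.
```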

The first metric stretches probabilities near 0 or 1 and is uniform in log odds. The second squashes all log odds with large absolute value together, and is uniform in probabilities. The first is used for Bayesian updates, the second for expected utility calculations.
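The stretching and squashing can be seen numerically: the pairs $(0.50, 0.51)$ and $(0.98, 0.99)$ are the same distance apart in the probability metric, but very different distances apart in the Bayesian metric. A self-contained check:

```python
import math

def log_odds(p):
    """Probability in (0, 1) to log odds."""
    return math.log(p / (1 - p))

# Same probability-metric distance...
d_p_mid  = abs(0.51 - 0.50)   # 0.01
d_p_edge = abs(0.99 - 0.98)   # 0.01

# ...but very different Bayesian-metric distance:
d_b_mid  = abs(log_odds(0.51) - log_odds(0.50))  # roughly 0.04
d_b_edge = abs(log_odds(0.99) - log_odds(0.98))  # roughly 0.70
```

Near the middle the two metrics roughly agree; near the ends the Bayesian metric sees the same 0.01 of probability as more than an order of magnitude larger.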

Suppose an imperfect agent reasoned using a single metric $d$, something in between these two: some metric function less squashed up than $d_P$ but more squashed than $d_B$ around the ends. Suppose it crudely substituted this new metric into its reasoning processes whenever one of the other two metrics was required.

In decision theory problems, such an agent would rate small differences in probability as more important than they really were when facing probabilities near 0 or 1. From the inside, the difference between no chance and 0.01 would feel far larger than the distance between probabilities 0.46 and 0.47.

The Allais Paradox

However, the $d$ metric is more squashed than $d_B$, so moving from 10000:1 odds to 1000:1 odds seems to require less evidence than moving from 10:1 to 1:1. When facing small probabilities, such an agent would perform larger Bayesian updates than really necessary, based on weak evidence.
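The arithmetic behind this: in the Bayesian metric, moving from 10000:1 to 1000:1 and moving from 10:1 to 1:1 are both a factor-of-10 likelihood ratio, so they cost exactly the same evidence; it is only the squashed intermediate metric that makes the first move look cheaper.

```python
import math

# Distance in log odds between odds a:1 and b:1 is |log(a) - log(b)|.
move_extreme = abs(math.log(10000) - math.log(1000))  # 10000:1 -> 1000:1
move_middle  = abs(math.log(10) - math.log(1))        # 10:1 -> 1:1
# Both come out to log(10): the Bayesian metric charges the same
# amount of evidence for each move.
```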

Privileging the Hypothesis

As both of these behaviors correspond to known human biases, could humans be using only a single metric on probability space?