# P(X = exact value) = 0: Is it really counterintuitive?

I’m prob­a­bly not go­ing to say any­thing new here. Some­one must have pon­dered over this already. How­ever, hope­fully it will in­vite dis­cus­sion and clear things up.

Let X be a ran­dom vari­able with a con­tin­u­ous dis­tri­bu­tion over the in­ter­val [0, 10]. Then, by the defi­ni­tion of prob­a­bil­ity over con­tin­u­ous do­mains, P(X = 1) = 0. The same is true for P(X = 10), P(X = sqrt(2)), P(X = π), and in gen­eral, the prob­a­bil­ity that X is equal to any ex­act num­ber is always zero, as an in­te­gral over a sin­gle point.
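To make the "integral over a single point" concrete, here is a minimal sketch. It assumes X is uniform on [0, 10] (the post only says "continuous", so the uniform density is an assumption), and `interval_probability` is a name invented for this illustration. It computes P(|X − a| ≤ ε) exactly and shows it shrinking toward 0 as ε does.

```python
from fractions import Fraction

LO, HI = Fraction(0), Fraction(10)  # assumed support of X (uniform density)


def interval_probability(a, eps):
    """P(a - eps <= X <= a + eps) for X uniform on [LO, HI]."""
    left = max(LO, Fraction(a) - Fraction(eps))
    right = min(HI, Fraction(a) + Fraction(eps))
    width = max(Fraction(0), right - left)
    return width / (HI - LO)


# Shrinking intervals around the point 1.
for k in range(1, 6):
    eps = Fraction(1, 10**k)
    print(f"eps = 1e-{k}: P(|X - 1| <= eps) = {float(interval_probability(1, eps))}")

# With eps = 0 the interval is the single point {1}; its width is 0.
print(interval_probability(1, 0))
```

Any other continuous density would give the same limit, only with a different rate of shrinkage.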

This is sometimes described as counterintuitive: surely, at any measurement, X must be equal to something, and thus its probability cannot be zero, since it clearly happened. It can, of course, be argued that mathematical probability is an abstract function that does not exactly map to our intuitive understanding of probability, but in this case, I would argue that it does.

What if X is the x-co­or­di­nate of a phys­i­cal ob­ject? If clas­si­cal physics are in ques­tion—for ex­am­ple, we pointed a nee­dle at a ran­dom point on a 10 cm ruler—then it can­not be a point ob­ject, and must have a nonzero size. Thus, we can mea­sure the prob­a­bil­ity of the 1 cm point ly­ing within the space the end of the nee­dle oc­cu­pies, a prob­a­bil­ity that is clearly defined and nonzero.

But even if we’re talking about a point object, while it may well occupy a definite and exact coordinate in classical physics, we’ll never know what exactly it is. For one, our measuring tools are not that precise. But even if they had infinite precision, statements like “X equals exactly 2.(0)” or “X equals exactly π” contain infinite information, since they specify all the decimal digits of the coordinate into infinity. We would need an infinite number of measurements to confirm them. So while X may objectively equal exactly 2 or π (again, under classical physics), measurers would never know it. At any given point, to measurers, X would lie in an interval.

Then of course there is quan­tum physics, where it is liter­ally im­pos­si­ble for any phys­i­cal ob­ject, in­clud­ing point ob­jects, to have a definite co­or­di­nate with ar­bi­trary pre­ci­sion. In this case, the purely math­e­mat­i­cal no­tion that any ex­act value is an im­pos­si­ble event turns out (by co­in­ci­dence?) to match how the uni­verse ac­tu­ally works.

• This actually gets even worse. Consider, for example, a hypothetical Bayesian version of Isaac Newton, trying to estimate what exponent k the radius is raised to in F = GMm/R^k. There’s an intuition that mathematically simple numbers, such as “2”, should be more likely. A while ago jimrandomh and benelliiot discussed this with me. Ben suggested that in this sort of context you might just have a complicated distribution where part of the distribution arose from something continuous and the other part arose from discrete probabilities for simple numbers. This seems to do a decent job capturing our intuition, but it seems to be very hard to actually use that sort of distribution.
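A toy version of such a mixed prior can be sketched as follows. Everything specific here is an assumption made for illustration: a single point mass at k = 2, a uniform continuous part on [1, 3], Gaussian measurement noise, and the function names. It does not address the harder question of which "simple" numbers deserve discrete mass.

```python
import math


def normal_pdf(x, mean, sigma):
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mean, sigma):
    return 0.5 * (1.0 + math.erf((x - mean) / (sigma * math.sqrt(2.0))))


def posterior_spike_mass(k_obs, sigma, p_spike=0.5, spike=2.0, lo=1.0, hi=3.0):
    """Posterior probability that k equals `spike` exactly, after one measurement.

    Assumed prior: mass p_spike at `spike`, the rest uniform on [lo, hi].
    Assumed likelihood: k_obs ~ Normal(k, sigma).
    """
    spike_evidence = p_spike * normal_pdf(k_obs, spike, sigma)
    # Marginal likelihood of k_obs under the uniform part, integrated over k.
    slab_marginal = (normal_cdf(hi, k_obs, sigma) - normal_cdf(lo, k_obs, sigma)) / (hi - lo)
    slab_evidence = (1.0 - p_spike) * slab_marginal
    return spike_evidence / (spike_evidence + slab_evidence)


for sigma in (0.1, 0.01, 0.001):
    print(sigma, posterior_spike_mass(k_obs=2.0, sigma=sigma))
```

Under these assumptions, a more precise measurement that still lands on 2 pushes more posterior mass onto the exact value, which a purely continuous prior could never do.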

• If New­ton tried to de­rive his law purely from em­piri­cal mea­sure­ments, then yes, he would never be ex­actly sure (ig­nor­ing gen­eral rel­a­tivity for a mo­ment) that the ex­po­nent is ex­actly 2. For all he would know, it could ac­tu­ally be 2.00000145...

But that would be like try­ing to de­rive the value of pi or the ex­po­nents in the Pythagorean the­o­rem by mea­sur­ing phys­i­cal cir­cles and tri­an­gles. If the law of grav­ity is de­rived from more gen­eral ax­ioms, then its form can be com­puted ex­actly pro­vided that these ax­ioms are cor­rect.

• Do you think that the Dirich­let Pro­cesses mod­els that ma­chine learn­ing peo­ple use might be rele­vant here? As I un­der­stand it, a DP prior says that the true prob­a­bil­ity dis­tri­bu­tion is a dis­crete prob­a­bil­ity dis­tri­bu­tion over some countable set of points, but you don’t know which set in ad­vance. So in the pos­te­rior, this can con­sis­tently as­sign some nonzero prob­a­bil­ity on a sin­gle point—in fact, if you do the math the pos­te­rior is very sim­ple, it’s a mix be­tween a DP and some finite prob­a­bil­ity mass on the val­ues that you did see.

• My min­i­mal knowl­edge base says that sounds po­ten­tially rele­vant. Un­for­tu­nately, I don’t know nearly enough about this sort of thing other than to make very vague, non-com­mit­tal re­marks.

• In summary, Newton should assign probability 0 to the statement that his theory of gravity is exactly correct. This turns out to be the right thing to do.

• Huh? No. The prob­a­bil­ity shouldn’t be zero that he’s cor­rect. Even now there’s some very tiny prob­a­bil­ity that New­ton’s laws are ex­actly cor­rect. This chance is van­ish­ingly small but non-zero. More­over, your ar­gu­ment im­plies too much be­cause one could use the ex­act same logic for gen­eral rel­a­tivity.

• More­over, your ar­gu­ment im­plies too much be­cause one could use the ex­act same logic for gen­eral rel­a­tivity.

And it would be equally cor­rect.

• Ok. But even if you had a theory of quantum gravity that seemed to explain all observed data, your argument would still go through. If your argument is accepted, then any theory of everything would have to be assigned zero probability of being correct, no matter how well it predicted things. This seems wrong.

• “Should”? I would much rather be log­i­cally in­con­sis­tent, or bet that the ax­ioms of prob­a­bil­ity are mean­ingless or ir­rele­vant—which in rele­vant de­ci­sion the­o­retic prob­lems they tend to be—than give odds of in­finity to one.

• It may or may not be helpful to re­al­ize that in­fini­ties (in­clud­ing in­finites­i­mals) are merely a math­e­mat­i­cal ab­strac­tion. Every­thing you en­counter in the phys­i­cal world is finite. Thus, it’s not overly sur­pris­ing that some­thing ac­tu­ally hap­pens, even though a given math­e­mat­i­cal model of that some­thing as­signs it a zero prob­a­bil­ity.

That said, math­e­mat­i­cal de­scrip­tions that in­clude con­ti­nu­ity are ex­tremely con­ve­nient (life would be rather cum­ber­some if we had to use finite differ­ence calcu­lus in­stead of deriva­tives in all ap­pli­ca­tions).

It is a very com­mon ten­dency to iden­tify a phys­i­cal phe­nomenon with a par­tic­u­lar math­e­mat­i­cal model of it (one of the most abused mod­els is that of vir­tual par­ti­cles in par­ti­cle physics), but one would be rather less wrong by keep­ing in mind that an ab­strac­tion of an ob­ject is not the ob­ject it­self.

A nice (if fan­tas­ti­cal) de­scrip­tion of ob­jects vs mod­els can be found in the HPMoR chap­ter on par­tial trans­figu­ra­tion.

• Let X be a ran­dom vari­able over the in­ter­val [0, 10]. Then, by the defi­ni­tion of prob­a­bil­ity over con­tin­u­ous do­mains, P(X = 1) = 0.

Only if you have a con­tin­u­ous prob­a­bil­ity dis­tri­bu­tion over that do­main. It’s quite pos­si­ble to have a prob­a­bil­ity dis­tri­bu­tion with, for ex­am­ple, a point mass at 5 such that p(X=5)=0.5.
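A distribution like the one described can be simulated directly. This is a sketch under assumptions: the continuous half is taken to be uniform on [0, 10], and `sample_mixed` is a made-up helper name.

```python
import random


def sample_mixed(rng, p_atom=0.5, atom=5.0, lo=0.0, hi=10.0):
    """Draw from a mixture: point mass at `atom`, otherwise uniform on [lo, hi]."""
    if rng.random() < p_atom:
        return atom
    return rng.uniform(lo, hi)


rng = random.Random(0)
draws = [sample_mixed(rng) for _ in range(20_000)]

# Share of draws equal to exactly 5 versus exactly 1.
share_at_atom = sum(d == 5.0 for d in draws) / len(draws)
share_at_one = sum(d == 1.0 for d in draws) / len(draws)
print(share_at_atom, share_at_one)
```

The exact value 5 turns up in roughly half the draws, while any other exact value essentially never does.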

This is some­times de­scribed as coun­ter­in­tu­itive: surely, at any mea­sure­ment, X must be equal to some­thing, and thus its prob­a­bil­ity can­not be zero since its clearly hap­pened.

Others have an­swered this be­low, but there is an­other as­pect to this I’d like to dis­cuss. All data are dis­crete. When you mea­sure some­thing, your mea­sure­ment ap­para­tus is only ever go­ing to give you one of a dis­crete, finite set of val­ues. (I’m pretty sure about finite, but will­ing to be cor­rected). Any prob­a­bil­ity dis­tri­bu­tion over the pos­si­ble val­ues that you might mea­sure with your ap­para­tus can eas­ily satisfy p(X=x)>0 for all x.

Con­cretely, if you’re mea­sur­ing the length of some­thing with a ruler, you prob­a­bly just round to the near­est 1/​16th of an inch. This means there are only 12*16=192 pos­si­ble mea­sure­ments you can make, so you can cre­ate any num­ber of prob­a­bil­ity dis­tri­bu­tions of these points where each point has p(X=x)>0.
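The ruler example can be worked out exactly. The sketch below assumes a 12-inch ruler and a true length that is uniform over it (both assumptions; the comment fixes only the 1/16-inch rounding). Note that counting both end marks gives 193 possible readings, while 12*16 = 192 is the number of gaps between them.

```python
from fractions import Fraction

RULER_INCHES = 12    # assumed ruler length
TICKS_PER_INCH = 16  # readings are rounded to the nearest 1/16 inch


def reading_pmf():
    """P(reading = tick) when the true length is uniform on [0, 12] inches."""
    n_cells = RULER_INCHES * TICKS_PER_INCH  # 192 gaps between marks
    half_cell = Fraction(1, 2 * n_cells)
    pmf = {}
    for i in range(n_cells + 1):             # 193 marks, counting both ends
        tick = Fraction(i, TICKS_PER_INCH)
        # End marks only collect lengths from one side.
        pmf[tick] = half_cell if i in (0, n_cells) else 2 * half_cell
    return pmf


pmf = reading_pmf()
print(len(pmf), sum(pmf.values()), min(pmf.values()))
```

Every possible reading gets strictly positive probability even though the underlying length is continuous.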

• I im­plic­itly meant a con­tin­u­ous dis­tri­bu­tion. Clar­ified that in the post now.

Con­cretely, if you’re mea­sur­ing the length of some­thing with a ruler, you prob­a­bly just round to the near­est 1/​16th of an inch.

As someone who lives in the dangerous and uncharted part of the world called “outside the US”, I prefer centimeters. ;)

• This one isn’t even a mat­ter of ne­glect­ing to con­vert; it’s a cul­tural di­vide—while I ex­pect you knew what Matt meant, it’s en­tirely pos­si­ble he didn’t know how to trans­late it for you. Pre­sum­ably you don’t round to the near­est 1.5875 mil­lime­ters. What do met­ric users round to when mea­sur­ing lengths? Millime­ters? Those are lit­tle—even lit­tler than six­teenths of an inch! Do most met­ric rulers even mark them, or do they just mark halfway points be­tween cen­time­ter lines? I don’t know.

• Yes, mil­lime­ters are typ­i­cally marked, with a spe­cial mark half-way at 5mm. Once you’re be­yond 1m in length one might skip them, but even then rulers of­ten have them. Small things are nor­mally mea­sured in mil­lime­ters as well, though usu­ally some tol­er­ance is ex­pected. For ex­am­ple, one of my rings has a di­ame­ter of 21.7mm and was ad­ver­tised as such. Of course, if you don’t need this pre­ci­sion, you round to what­ever dec­i­mal place you care about and use the near­est unit (like in any sys­tem). I don’t think of mil­lime­ters as par­tic­u­larly tiny, more like the ba­sic unit of “smal­l­ness”.

(And I fully agree with lu­cid­fox. Im­pe­rial units are in­sane.)

• Huh, re­ally, that’s a cul­tural di­vide? I was taught how to do met­ric mea­sure­ments in ev­ery sci­ence class I took, and I knew how to use mil­lime­tres be­fore then be­cause I’ve never seen a ruler that didn’t have them marked. Is this truly un­com­mon knowl­edge in the US? o.o

• I’ve used met­ric rulers, in sci­ence classes mostly, but I don’t think I’ve used one in years. When I have to mea­sure things, I use a tape mea­sure, which only has inches marked.

• Huh, fas­ci­nat­ing. Even my cheap “gift from a job” tape mea­sure does met­ric, so this is news to me :)

• A lot of Amer­i­can rulers are marked in both inches and cen­time­ters, though I don’t know what the pro­por­tion is com­pared to rulers which are just marked in inches.

• What do met­ric users round to when mea­sur­ing lengths? Millime­ters?

Depends. In ca­sual use, typ­i­cally cen­time­ters. But yes, as mu­flax said, met­ric rulers have in­di­vi­d­ual mil­lime­ters marked, and typ­i­cally they mark half-cen­time­ters with slightly longer bars.

• As some­one who lives in the dan­ger­ous and un­charted part of the world called “out­side the US’, I pre­fer cen­time­ters.

Feel free to use cen­time­ters in your own ex­am­ples, then. But you’re not en­ti­tled to de­mand that US users do so.

• Feel free to use cen­time­ters in your own ex­am­ples, then. But you’re not en­ti­tled to de­mand that US users do so.

She didn’t. Matt said, in re­ply to lu­cid­fox, “you prob­a­bly just round to the near­est 1/​16ths of an inch”. Since she, in fact, would not round to such an ab­surd met­ric she pointed out what she would ac­tu­ally use. It is rather rude to de­clare or im­ply lu­cid­fox is ex­ceed­ing the bounds of what her sta­tus per­mits to cor­rect a false claim about her­self.

• You would have a point if lu­cid­fox had not writ­ten this post (in which a poster’s use of “miles per hour” is cited as one of the offenses), but in that case I wouldn’t have writ­ten the grand­par­ent ei­ther.

Con­text.

• “I pre­fer” with a smiley and some mild snark isn’t ex­actly a de­mand.

In any case, peo­ple are en­ti­tled to de­mand what­ever they want, they just aren’t en­ti­tled to get com­pli­ance.

Would it be worth hav­ing a con­ven­tion at LW that mea­sure­ments should be given in English and met­ric units?

• Would it be worth hav­ing a con­ven­tion at LW that mea­sure­ments should be given in English and met­ric units?

Make the con­ven­tion the use of the ex­ist­ing In­ter­na­tional Sys­tem of Units with other units hu­mored as parochial ec­cen­tric­i­ties. If folks par­tic­u­larly care they can re­ply with a con­ver­sion to the con­ven­tional unit.

• “I pre­fer” with a smiley and some mild snark isn’t ex­actly a de­mand.

The con­text of other com­ments and posts by the same user caused me to read it as hos­tile pas­sive-ag­gres­sion.

Would it be worth hav­ing a con­ven­tion at LW that mea­sure­ments should be given in English and met­ric units?

Seems un­nec­es­sary. The gen­eral con­ven­tion should be that peo­ple are en­ti­tled to em­ploy the con­ven­tions and ter­minol­ogy in use where they live, or that they them­selves are most fa­mil­iar with. I wouldn’t think of de­mand­ing that some­one in an­other coun­try talk about their pur­chas­ing habits in terms of US dol­lars, for ex­am­ple.

• Where did I de­mand any­thing?

• You are misunderstanding what probability means. A probability of 0 does not mean an event will never happen; it means it will almost surely not happen. That is, for any positive probability one can think of, the probability of that event occurring is less than that. This does not mean that the event can never occur; as you surmise, otherwise we could never observe any result! Basically, infinity is a bit weird.

• Interestingly, the phrase “Almost surely” also has a Wikipedia article that covers some of these mathematical concepts, and there are also related articles on “Almost All” and “Almost Everywhere.”

• When I read thakll’s post, I thought they in­deed meant the math­e­mat­i­cal defi­ni­tion of “al­most surely”. The do­main of an event with prob­a­bil­ity zero is in­deed “al­most nowhere” in the rigor­ous sense, since it is a mea­sure-zero set.

• Yes, that’s the concept to which I am referring. The concept comes from measure theory. If you’re familiar with it, I’m not sure why you’re confused about probability 0 events. Or are you? Perhaps I’m misreading your article.

• Yes, that’s the concept to which I am referring. The concept comes from measure theory. If you’re familiar with it, I’m not sure why you’re confused about probability 0 events.

I think her confusion comes from the fact that if your prior probability that an event happened is 0, no amount of evidence will convince you that it did happen. Suppose your prior probability that some random variable X is equal to 1 is P(X=1)=0. Now suppose you find out that actually, X=1. Then using Bayes’ rule:

P(X=1|X=1) = P(X=1|X=1)*P(X=1) /​ denominator

I’ll leave the de­nom­i­na­tor out be­cause the nu­mer­a­tor is 0 (the de­nom­i­na­tor won’t be 0), so P(X=1|X=1)=0, which makes no sense.

I don’t claim the calculation I did above is correct. I realize conditional probabilities are fraught with difficulties, and I probably violated some rule I don’t know about or have forgotten from my measure theory class. However, this does give you intuition for why lucidfox or perhaps someone else would be confused despite having knowledge of measure theory (if this is in fact why it was confusing to him/her).

• No finite amount of ev­i­dence will con­vince you. I can be con­vinced of in­finitely un­likely things by an in­finite amount of ev­i­dence just fine.

And if we’re talk­ing about a situ­a­tion (like real life!) where you can’t ex­pect to re­ceive an in­finite amount of ev­i­dence, then we shouldn’t be us­ing prob­a­bil­ities of 0 or 1, ei­ther.
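The point about finite evidence can be checked with the ordinary form of Bayes' rule for a hypothesis H and evidence E. This is a minimal sketch; `update` is an invented name, and summarizing the evidence by a single likelihood ratio is a simplifying assumption.

```python
def update(prior, likelihood_ratio):
    """Posterior P(H | E) from prior P(H) and the ratio P(E | H) / P(E | not H).

    Assumes we are not in the degenerate case prior == 1 with likelihood_ratio == 0.
    """
    numerator = prior * likelihood_ratio
    return numerator / (numerator + (1.0 - prior))


# A prior of exactly 0 never moves, however strong the (finite) evidence.
for lr in (10.0, 1e6, 1e300):
    print("prior 0    , LR", lr, "->", update(0.0, lr))

# A tiny but nonzero prior can be overcome by strong enough evidence.
for lr in (10.0, 1e6, 1e18):
    print("prior 1e-12, LR", lr, "->", update(1e-12, lr))
```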

• Her con­fu­sion.

• My in­tu­ition, for what it’s worth, works more eas­ily with the bi­nary ex­pan­sion of the num­bers than their in­ter­pre­ta­tion as phys­i­cal quan­tities.

From that perspective, “X equals exactly pi” would normally be assigned nonzero probability, because pi is a computable number with finite Kolmogorov complexity; there is a nonzero chance that two processes will generate the same infinite but computable bit stream.

But “X equals ex­actly Y” where Y is a ran­dom in­com­putable num­ber, is in­deed in­finitely im­prob­a­ble, be­cause it amounts to a state­ment that in­finitely many coin flips will come out a par­tic­u­lar way; the prob­a­bil­ity is 0.5^in­finity, which clearly con­verges to zero.

• A uniform distribution over a real interval, e.g. [0,1], is possible. One algorithm is to toss a fair coin for each binary place, giving 0 or 1. In the case of all tails you get 0; in the case of all heads, 1. This is a uniform probability distribution with P(x)=0 for every x. The individual values are not impossible, only of probability 0.

But there is no uniform probability distribution over only the rational numbers in this interval, or in any other, for that matter. Nor is there a uniform probability distribution over all the naturals, or over any infinite subset of the naturals.
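The coin-tossing construction can be sketched with a finite number of flips. The helper names are invented for illustration, and truncating at a finite number of bits is of course an approximation to the infinite sequence being discussed.

```python
from fractions import Fraction
import random


def number_from_flips(flips):
    """Value in [0, 1] whose binary expansion starts with the given flips (1 = heads)."""
    return sum(Fraction(bit, 2 ** (i + 1)) for i, bit in enumerate(flips))


def prefix_match_probability(n_bits):
    """Probability that n fair flips reproduce one specific n-bit prefix."""
    return Fraction(1, 2 ** n_bits)


rng = random.Random(0)
flips = [rng.randint(0, 1) for _ in range(32)]
print(float(number_from_flips(flips)))

# The chance of hitting any one particular prefix halves with every extra bit.
for n in (1, 8, 32, 64):
    print(n, float(prefix_match_probability(n)))
```

With n flips, all tails gives 0 and all heads gives 1 − 2^(−n), which approaches 1 as n grows.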

• Then of course there is quan­tum physics, where it is liter­ally im­pos­si­ble for any phys­i­cal ob­ject, in­clud­ing point ob­jects, to have a definite co­or­di­nate with ar­bi­trary pre­ci­sion.

Con­ven­tion­ally, in the mul­ti­verse, ev­ery­thing is pre­cisely some­where. What is difficult is find­ing out ex­actly where things are.

• How so? We can regard each point within a cloud of amplitude as a ‘separate world’ in one sense, but I understood that points less than a certain ‘distance’ away from each other will affect each other’s futures in a meaningful way. I thought there exists no fact of the matter one second later as to which of those ‘worlds’ I came from.

• Given that the sum (0+0+0...) = 0, wouldn’t that im­ply that P(any value at all) = 0, and that you ac­tu­ally can­not pro­duce a re­sult in this sys­tem?

Which, ad­mit­tedly, strikes me as a perfectly rea­son­able re­sult, given you can’t ac­tu­ally have a con­tin­u­ous dis­tri­bu­tion in re­al­ity, and I’m not aware of any ran­dom­iza­tion method that could ac­tu­ally meet these re­quire­ments.

• you can’t ac­tu­ally have a con­tin­u­ous dis­tri­bu­tion in reality

The crush­ing ma­jor­ity of ev­i­dence sug­gests that con­tin­u­ous dis­tri­bu­tions are what re­al­ity is built on.

The prob­lem with in­te­grat­ing 0 to get P(any­thing) = 0 is that you can’t switch the or­der in which you take limits—the limit that gives you P(X=x) = 0 is out­side the in­te­gral, and the in­te­gral it­self be­haves like a limit (re­mem­ber Rie­mann sums?). So if you switch the or­der of the limits by in­te­grat­ing 0, you have com­mit­ted an ille­gal op­er­a­tion.
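One way to see why the order of limits matters: split the interval into n equal cells and add up their probabilities. The sketch assumes X is uniform on [0, 10] (an assumption made for concreteness), and `cell_probabilities` is a made-up name.

```python
from fractions import Fraction


def cell_probabilities(n_cells, lo=0, hi=10):
    """Probabilities of n equal-width cells for X uniform on [lo, hi] (assumed)."""
    width = Fraction(hi - lo, n_cells)
    return [width / (hi - lo) for _ in range(n_cells)]


for n in (10, 1_000, 100_000):
    cells = cell_probabilities(n)
    # Each cell's probability shrinks toward 0, yet the total stays fixed.
    print(n, float(cells[0]), sum(cells))
```

The sum is taken first and the limit second, so the total remains 1 at every step; sending each cell to zero width first and then "summing the zeros" is the swap of limits described above.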

• Augh, my mis­take. This is why I am cur­rently do­ing a math re­fresher. Thank you :)

• Yeah, it makes all sorts of sense to just set things to val­ues, but once you start us­ing limits that breaks things. Stupid limits.

• The crush­ing ma­jor­ity of ev­i­dence sug­gests that con­tin­u­ous dis­tri­bu­tions are what re­al­ity is built on.

Not re­ally. Lots of dis­crete things look con­tin­u­ous—if you stand far enough back.

• Alright, I’m cu­ri­ous. Are you claiming that the prob­a­bil­ity dis­tri­bu­tions that come out of quan­tum me­chan­ics are dis­crete?

• If you are not familiar with the idea, perhaps see: http://en.wikipedia.org/wiki/Digital_physics

• I am fa­mil­iar with the idea. I just don’t see where the ev­i­dence is. Sure, quan­tiz­ing space fits well with there be­ing a max­i­mum en­tropy of space, but this seems like a clas­si­cal solu­tion to a very non-clas­si­cal prob­lem, and it elimi­nates rel­a­tivity in the pro­cess.

• You are the one claiming that “the crush­ing ma­jor­ity of ev­i­dence” op­poses dis­crete the­o­ries.

My po­si­tion is more that we can barely see any­thing down that far, and so we have very lit­tle ex­per­i­men­tal ev­i­dence about whether the uni­verse is con­tin­u­ous or dis­crete.

In the ab­sence of ev­i­dence, as­sum­ing un­com­putable physics seems to be counter-in­tu­itive to me. We don’t know of any­thing else that is un­com­putable.

• We don’t know of any­thing else that is un­com­putable.

We’re talk­ing about the en­tire uni­verse here, so it would be just as valid to say we don’t know of any­thing else that is (dis­cretely) com­putable.

And yeah, there is always some level of dis­crete­ness that would have no im­pact on our ob­ser­va­tions, just like there is some level of teapots in the as­ter­oid belt that would have no im­pact on our ob­ser­va­tions. You’re right that that sort of thing isn’t ruled out by the ev­i­dence, so my state­ment was wrong.

• Teapots in the as­ter­oid belt are con­trary to Oc­cam’s ra­zor. The situ­a­tion with dis­crete physics is very differ­ent. Science has a long his­tory of show­ing that ap­par­ently-con­tin­u­ous phe­nom­ena ac­tu­ally turn out to be grainy on a smaller scale.

• P(X = ex­act value) = 0: Is it re­ally coun­ter­in­tu­itive?

Not coun­ter­in­tu­itive, just an­noy­ing. Also used as an ex­cuse to con­clude silly things by play­ing with in­fini­ties (as you al­lude to).