A summary of Savage's foundations for probability and utility.

The idea that rational agents act in a manner isomorphic to expected-utility maximizers is often used here, typically justified with the Von Neumann-Morgenstern theorem. (The last of Von Neumann and Morgenstern's axioms, the independence axiom, can be grounded in a Dutch book argument.) But the Von Neumann-Morgenstern theorem assumes that the agent already measures its beliefs with (finitely additive) probabilities. This in turn is often justified with Cox's theorem (valid so long as we assume a "large world", which is implied by e.g. the existence of a fair coin). But Cox's theorem assumes as an axiom that the plausibility of a statement is a real number, which is a very large assumption! I have also seen this justified here with Dutch book arguments, but these all seem to assume that we are already using some notion of expected utility maximization, which is not only somewhat circular, but also a considerably stronger assumption than that plausibilities are measured with real numbers.

There is a way of grounding both (finitely additive) probability and utility simultaneously, however, as detailed by Leonard Savage in his Foundations of Statistics (1954). In this article I will state the axioms and definitions he gives, summarize their logical structure, and suggest a slight modification (which is mathematically equivalent but slightly more philosophically satisfying). I would also like to ask: To what extent can these axioms be grounded in Dutch book arguments or other more basic principles? I warn the reader that I have not worked through all the proofs myself, and I suggest simply finding a copy of the book if you want more detail.

Peter Fishburn later showed in Utility Theory for Decision Making (1970) that the axioms set forth here actually imply that utility is bounded.

(Note: The versions of the axioms and definitions in the endpapers are formulated slightly differently from the ones in the text of the book, and in the 1954 edition contain an error. I'll be using the ones from the text, though in some cases I'll reformulate them slightly.)

Primitive notions; preference given a set of states

We will use the following primitive notions. First, there is a set S of "states of the world"; the exact current state of the world is unknown to the agent. Second, there is a set F of "consequences": things that can happen as a result of the agent's actions. Actions or acts will be interpreted as functions f:S→F, since two actions which have the same consequences regardless of the state of the world are indistinguishable and hence considered equal. While the agent may be uncertain as to the exact results of its actions, this can be folded into its uncertainty about the state of the world. Finally, we introduce as primitive a relation ≤ on the set of actions, interpreted as "is not preferred to". I.e., f≤g means that given a choice between actions f and g, the agent will either prefer g or be indifferent. As usual, sets of states will be referred to as "events", and for the usual reasons we may want to restrict the set of admissible events to a boolean σ-subalgebra of ℘(S). (Even though we won't get countable additivity, a σ-algebra rather than a mere algebra is needed for some of the proofs in the "partition conditions" section, which I don't go into here; Savage doesn't explicitly make this restriction, though he does discuss it some.)
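To fix intuitions, here is a toy model of these primitives. Note that it runs in the opposite direction from Savage's program: Savage takes only ≤ as primitive and derives probability and utility, whereas here we assume a probability and a utility function (both entirely made up for illustration) merely to induce an example relation ≤ on acts.

```python
# Toy model of Savage's primitives. CAUTION: this goes in the opposite
# direction from Savage's construction; we assume beliefs and utilities
# (both hypothetical) merely to induce an example preference relation.

S = ["rain", "sun"]                           # states of the world
prob = {"rain": 0.3, "sun": 0.7}              # assumed beliefs (illustrative)
util = {"wet": 0.0, "dry": 1.0, "tan": 2.0}   # assumed utilities (illustrative)

# Acts are functions from states to consequences, encoded as dicts.
umbrella = {"rain": "dry", "sun": "dry"}
no_umbrella = {"rain": "wet", "sun": "tan"}

def expected_utility(f):
    return sum(prob[s] * util[f[s]] for s in S)

def leq(f, g):
    # f <= g: "f is not preferred to g"
    return expected_utility(f) <= expected_utility(g)

print(leq(umbrella, no_umbrella))   # True: EU 1.0 vs 1.4
```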

In any case, we then have the following axiom:

P1. The relation ≤ is a total preorder.

The intuition for transitivity is pretty clear. As for totality: if the agent is presented with a choice between two acts, it must choose one of them, or be indifferent! Perhaps we could instead use a partial preorder (or partial order?), though this would give us two different indistinguishable flavors of indifference, which seems problematic; it could be useful, though, if we wanted intransitive indifference. So long as indifference is transitive, we can collapse this into a total preorder.

As usual we can then define f≥g, f<g (meaning "it is false that g≤f"), and g>f. I will use f≡g to mean "f≤g and g≤f", i.e., the agent is indifferent between f and g. (Savage uses an equals sign with a dot over it.)

Note that though ≤ is defined in terms of how the agent chooses when presented with two options, Savage later notes that there is a construction of W. Allen Wallis that allows one to adduce the agent's preference ordering among a finite set of more than two options (modulo indifference): simply tell the agent to rank the options given, and that afterward, two of them will be chosen uniformly at random, and it will get whichever one it ranked higher.
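The Wallis mechanism is incentive-compatible for such an agent. Under the toy assumption of known numerical utilities (invented below), a brute-force check over all possible submitted rankings confirms that ranking honestly maximizes the agent's expected prize:

```python
from itertools import combinations, permutations

# Hypothetical utilities for four options (for illustration only).
utils = {"a": 3.0, "b": 2.0, "c": 1.0, "d": 0.5}
options = list(utils)

def mechanism_value(ranking):
    # Two options are drawn uniformly at random; the agent receives
    # whichever of the two it ranked higher (lower index = ranked higher).
    pos = {x: i for i, x in enumerate(ranking)}
    pairs = list(combinations(options, 2))
    return sum(utils[x] if pos[x] < pos[y] else utils[y]
               for x, y in pairs) / len(pairs)

truthful = tuple(sorted(options, key=lambda x: -utils[x]))
best_value = max(mechanism_value(r) for r in permutations(options))
print(mechanism_value(truthful) == best_value)   # True: honesty is optimal
```

This works because the truthful ranking wins the better option of every pair simultaneously, so no other ranking can do better.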

The second axiom states that if two actions have the same consequences in some set of states, just what that shared consequence is does not affect their relative ordering:

P2. Suppose f≤g, and B is a set of states such that f and g agree on B. If f' and g' are another pair of acts which, outside of B, agree with f and g respectively, and on B, agree with each other, then f'≤g'.

In other words, to decide between two actions, only the cases where they actually have different consequences matter.

With this axiom, we can now define:

D1. We say "f≤g given B" to mean that if f' and g' are actions such that f' agrees with f on B, g' agrees with g on B, and f' and g' agree with each other outside of B, then f'≤g'.

Due to axiom P2, this is well-defined.
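In the toy expected-utility model from before, well-definedness can be verified by brute force: for every pair of acts on a small state space, every admissible choice of f' and g' yields the same verdict. (The state space, beliefs, and utilities below are again invented.)

```python
from itertools import product

S = [0, 1, 2]
F = ["lose", "win"]
prob = {0: 0.2, 1: 0.3, 2: 0.5}      # hypothetical beliefs
util = {"lose": 0.0, "win": 1.0}     # hypothetical utilities

# All acts: functions from S to F.
acts = [dict(zip(S, cs)) for cs in product(F, repeat=len(S))]

def eu(f):
    return sum(prob[s] * util[f[s]] for s in S)

def leq_given(f, g, B):
    # D1: compare any f', g' agreeing with f, g on B and with each other off B.
    verdicts = {eu(fp) <= eu(gp)
                for fp, gp in product(acts, acts)
                if all(fp[s] == f[s] and gp[s] == g[s] for s in B)
                and all(fp[s] == gp[s] for s in S if s not in B)}
    assert len(verdicts) == 1   # the verdict never depends on the choice
    return verdicts.pop()

B = {0, 2}
results = [leq_given(f, g, B) for f in acts for g in acts]
print("D1 is well-defined here;", sum(results), "of", len(results),
      "pairs satisfy f<=g given B")
```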

Here is where I would like to suggest a small modification to this setup. The notion of "f≤g given B" is implicitly taken to be how the agent makes decisions if it knows that B obtains. However, it seems to me that we should actually take "f≤g given B", rather than f≤g, to be the primitive notion, explicitly interpreted as "the agent does not prefer f to g if it knows that B obtains". The agent always has some state of prior knowledge, and this way we have explicitly specified decisions under a given state of knowledge, the acts we are concerned with, as the basis of our theory. Rather than defining "f≤g given B" in terms of ≤, we can define f≤g to mean "f≤g given S" and then add additional axioms governing the relation between "≤ given B" for varying B, which in Savage's setup are theorems or part of the definition D1.

(Specifically, I would modify P1 and P2 to talk about "≤ given B" rather than ≤, and add the following theorems as axioms:

P2a. If f and g agree on B, then f≡g given B.

P2b. If B⊆C, f≤g given C, and f and g agree outside B, then f≤g given B.

P2c. If B and C are disjoint, and f≤g given B and given C, then f≤g given B∪C.

This is a little unwieldy and perhaps there is an easier way; these might not be minimal. But they do seem to be sufficient.)

In any case, regardless of which way we do it, we've now established the notion of preference given that a set of states obtains, as well as preference without additional knowledge. Henceforth I'll freely use both, as Savage does, without worrying about which makes a better foundation, since they are equivalent.

Ordering on consequences

The next definition simply notes that we can sensibly talk about f≤b, b≤f, and b≤c, where b and c are consequences rather than actions, by interpreting consequences as constant functions. (So the agent does have a preference ordering on consequences; it's just induced from its ordering on actions. We do it this way since it's the agent's choices between actions that we can actually observe.)

However, the third axiom reifies this induced ordering somewhat, by demanding that it be invariant under gaining new information.

P3′. If b and c are consequences and b≤c, then b≤c given any B.

Thus the fact that the agent may change preferences given new information just reflects its uncertainty about the results of its actions, rather than an actual preference for different consequences in different states (any such preferences can be done away with by simply expanding the set of consequences).

Really this is not strong enough, but to state the actual P3 we will first need a definition:

D3. An event B is said to be null if f≤g given B for any actions f and g.

Null sets will correspond to sets of probability 0, once numerical probability is introduced. Probability here is to be adduced from the agent's preferences, so we cannot distinguish between "the agent is certain that B will not happen" and "if B obtains, the agent doesn't care what happens".

Now we can state the actual P3:

P3. If b and c are consequences and B is not null, then b≤c given B if and only if b≤c.

P3′, by contrast, allowed some collapsing of preferences on gaining new information; here we have disallowed that except in the case where the new information is enough to collapse all preferences entirely (a sort of "end of the world" or "fatal error" scenario).

Qualitative probability

We've introduced above the idea of "probability 0" (and hence implicitly probability 1; observe that "¬B is null" is equivalent to "for any f and g, f≤g given B if and only if f≤g"). Now we want to expand this to probability more generally. But we will not initially get numbers out of it; rather, we will first just get another total preordering, A≤B, "A is at most as probable as B".

How can we determine which of two events the agent thinks is more probable? Have it bet on them, of course! First, we need a nontriviality axiom so it has something to bet on.

P5. There exist consequences b and c such that b>c.

(I don't know what the results would be if instead we used the weaker nontriviality axiom "there exist actions f and g such that f<g", i.e., "S is not null". That we eventually get expected utility for comparing all acts suggests that this should work, but I haven't checked.)

So let us now consider a class of actions which I will call "wagers". (Savage doesn't have any special term for these.) Define "the wager on A for b over c" to mean the action that, on A, returns b, and otherwise returns c. Denote this by wA,b,c. Then we postulate:

P4. Let b>b' be a pair of consequences, and c>c' another such pair. Then for any events A and B, wA,b,b'≤wB,b,b' if and only if wA,c,c'≤wB,c,c'.

That is to say, if the agent is given the choice between betting on event A and betting on event B, and the prize and booby prize are the same regardless of which it bets on, then it shouldn't matter just what the prize and booby prize are: it should simply bet on whichever event it thinks is more probable. Hence we can define:

D4. For events A and B, we say "A is at most as probable as B", denoted A≤B, if wA,b,b'≤wB,b,b', where b>b' is a pair of consequences.

By P4, this is well-defined. We can then show that the relation ≤ on events is a total preorder, so we can use the usual notation when talking about it (again, ≡ will denote equivalence).
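In code, D4 reads naturally if we again (purely for illustration) equip the agent with numerical beliefs: a wager on A is the act paying the prize exactly on A, and comparing events reduces to comparing their wagers.

```python
# Illustrative only: beliefs are assumed so the wagers can be evaluated.
prob = {"s1": 0.25, "s2": 0.35, "s3": 0.40}   # hypothetical beliefs
util = {"prize": 1.0, "booby": 0.0}

def wager(A):
    # w_{A,b,b'}: return the prize b on A, the booby prize b' elsewhere
    return {s: "prize" if s in A else "booby" for s in prob}

def eu(f):
    return sum(prob[s] * util[f[s]] for s in prob)

def at_most_as_probable(A, B):
    # D4: "A is at most as probable as B" = the wager on A is not preferred
    return eu(wager(A)) <= eu(wager(B))

print(at_most_as_probable({"s3"}, {"s1", "s2"}))   # True: 0.40 vs 0.60
```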

In fact, ≤ is not only a total preorder, but a qualitative probability:

  1. ≤ is a total preorder.

  2. ∅≤A for any event A.

  3. ∅<S.

  4. Given events B, C, and D with D disjoint from B and C, then B≤C if and only if B∪D≤C∪D.

(There is no condition corresponding to countable additivity; as mentioned above, we simply won't get countable additivity out of this.) Note also that under this relation, A≡∅ if and only if A is null in the earlier sense. Also, we can define "A≤B given C" by comparing the wagers given C; this is equivalent to the condition that A∩C≤B∩C. This relation, too, is a qualitative probability.
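These conditions are easy to machine-check for the relation induced by any finitely additive measure on a small finite state space (the measure below is made up; exact rational arithmetic avoids floating-point ties):

```python
from fractions import Fraction as Fr
from itertools import chain, combinations

S = frozenset({"a", "b", "c"})
P = {"a": Fr(1, 5), "b": Fr(3, 10), "c": Fr(1, 2)}   # hypothetical beliefs

def measure(A):
    return sum(P[s] for s in A)

def qleq(A, B):
    # "A is at most as probable as B"
    return measure(A) <= measure(B)

events = [frozenset(c) for c in
          chain.from_iterable(combinations(S, r) for r in range(len(S) + 1))]

empty = frozenset()
assert all(qleq(empty, A) for A in events)        # 2. empty set is minimal
assert qleq(empty, S) and not qleq(S, empty)      # 3. empty < S
for B in events:                                  # 4. additivity condition
    for C in events:
        for D in events:
            if not (D & (B | C)):                 # D disjoint from B and C
                assert qleq(B, C) == qleq(B | D, C | D)
print("qualitative probability axioms verified")
```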

Partition conditions and numerical probability

In order to get real numbers to appear, we are of course going to have to make some sort of Archimedean assumption. In this section I discuss what some of these look like, and then ultimately state P6, the one Savage goes with.

First, definitions. We will be considering finitely additive probability measures on the set of states, i.e., a function P from the set of events to the interval [0,1] such that P(S)=1 and, for disjoint B and C, P(B∪C)=P(B)+P(C). We will say "P agrees with ≤" if for every A and B, A≤B if and only if P(A)≤P(B); and we will say "P almost agrees with ≤" if for every A and B, A≤B implies P(A)≤P(B). (I.e., in the latter case, numerical probability is allowed to collapse some distinctions between events that the agent might not actually be indifferent between.)

We'll be considering here partitions of the set of states S. We'll say a partition of S is "uniform" if the parts are all equivalent. More generally, we'll say it is "almost uniform" if, for any r, the union of any r parts is at most as probable as the union of any r+1 parts. (This is using ≤, remember; we don't have numerical probabilities yet! Note that any uniform partition is almost uniform.) Then it turns out that the following are equivalent:

  1. There exist almost-uniform partitions of S into arbitrarily large numbers of parts.

  2. For any B>∅, there exists a partition of S with each part less probable than B.

  3. There exists a (necessarily unique) finitely additive probability measure P that almost agrees with ≤, which has the property that for any B and any 0≤λ≤1, there is a C⊆B such that P(C)=λP(B).

(I am definitely not going into the proof of this here. However, the actual definition of the numerical probability P(A) is not so complicated: let k(A,n) denote the largest r such that there exists an almost-uniform partition of S into n parts, for which there is some union of r parts, C, such that C≤A. Then the sequence k(A,n)/n always converges, and we can define P(A) to be its limit.)
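We can illustrate the limit defining P(A) in the simplest possible setting: take S=[0,1) with qualitative probability given by length, so the n equal intervals form a uniform (hence almost-uniform) partition, and for an event of length p the largest union of parts at most as probable as it has ⌊np⌋ parts. (This presumes the numerical structure rather than deriving it; it only shows the sequence k(A,n)/n converging.)

```python
import math
from fractions import Fraction

# With S = [0,1), length as qualitative probability, and the n equal
# intervals as the almost-uniform partition, k(A, n) for an event of
# length p is floor(n * p): the largest r with r/n <= p.
def k(p, n):
    return math.floor(n * p)

p = Fraction(1, 3)   # illustrative event probability
approx = [Fraction(k(p, n), n) for n in (10, 100, 1000, 10000)]
print([float(x) for x in approx])   # [0.3, 0.33, 0.333, 0.3333] -> 1/3
```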

So we could use this as our sixth axiom:

P6‴. For any B>∅, there exists a partition of S with each part less probable than B.

Savage notes that other authors have assumed the stronger

P6″. There exist uniform partitions of S into arbitrarily large numbers of parts.

since there's an obvious justification for this: the existence of a fair coin! If a fair coin exists, then we can generate a uniform partition of S into 2^n parts simply by flipping it n times and considering the result. We'll actually end up assuming something even stronger than this.

So P6‴ does get us numerical probabilities, but they don't necessarily reflect all of the qualitative probability; P6‴ is only strong enough to force almost agreement. It is stronger than that when ∅ is involved, though: it does turn out that P(B)=0 if and only if B≡∅ (and hence also P(B)=1 if and only if B≡S). But more generally, it turns out that P(B)=P(C) if and only if B and C are "almost equivalent", which I will denote B≈C (Savage uses a symbol I haven't seen elsewhere), and which is defined to mean that for any E>∅ disjoint from B, B∪E≥C, and for any E>∅ disjoint from C, C∪E≥B.

(It's not obvious to me that ≈ is in general an equivalence relation, but it certainly is in the presence of P6‴; Savage seems to use this implicitly. Note also that another consequence of P6‴ is that for any n there exists a partition of S into n almost-equivalent parts; such a partition is necessarily almost uniform.)

However, the following stronger version of P6‴ gets rid of this distinction:

P6′. For any B>C, there exists a partition of S, each part D of which satisfies C∪D<B.

(Observe that P6‴ is just P6′ for C=∅.) Under P6′, almost equivalence is equivalence, and so numerical probability agrees with qualitative probability, and we finally have what we wanted. (So by the above, P6′ implies P6″, not just P6‴. Indeed, by the above it implies the existence of uniform partitions into n parts for any n, not just arbitrarily large n.)

In actuality, Savage assumes an even stronger axiom, which is needed to get utility and not just probability:

P6. For any acts g<h, and any consequence b, there is a partition of S such that if g is modified on any one part to be constantly b there, we would still have g<h; and if h is modified on any one part to be constantly b there, we would also still have g<h.

Applying P6 to wagers yields the weaker P6′.

We can now also get conditional probability: if P6′ holds, it also holds for the preorderings "≤ given C" for non-null C, and hence we can define P(B|C) to be the probability of B under the numerical probability corresponding to the qualitative probability "≤ given C". Using the uniqueness of agreeing probability measures, it's easy to check that indeed, P(B|C)=P(B∩C)/P(C).
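As a sanity check, the formula is trivial to verify in a finite model with assumed numerical probabilities (exact arithmetic via fractions; the beliefs are invented):

```python
from fractions import Fraction as Fr

P = {"a": Fr(1, 5), "b": Fr(3, 10), "c": Fr(1, 2)}   # hypothetical beliefs

def measure(A):
    return sum(P[s] for s in A)

def cond(B, C):
    # P(B|C) = P(B∩C)/P(C); C must be non-null, i.e. measure(C) > 0
    return measure(B & C) / measure(C)

B, C = {"a", "b"}, {"b", "c"}
print(cond(B, C))   # (3/10) / (4/5) = 3/8
```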

Utility for finite gambles

Now that we have numerical probability, we can talk about finite gambles. If we have consequences b1, …, bn, and probabilities λ1, …, λn summing to 1, we can consider the gamble ∑λibi, represented by any action which yields b1 with probability λ1, b2 with probability λ2, etc. (And with probability 0 does anything; we don't care about events of probability 0.) Note that by the above, such an action necessarily exists. It can be proven that any two actions representing the same gamble are equivalent, and hence we can talk about comparing gambles. We can also sensibly talk about mixing gambles, i.e., taking ∑λifi, where the fi are finite gambles and the λi are probabilities summing to 1, in the obvious fashion.
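A finite gamble can be represented directly as a map from consequences to probabilities, and mixing is then just the weighted combination of such maps (a toy encoding, with exact arithmetic):

```python
from fractions import Fraction as Fr

# A finite gamble: map from consequences to probabilities summing to 1.
coin_flip = {"win": Fr(1, 2), "lose": Fr(1, 2)}
sure_win = {"win": Fr(1)}

def mix(pairs):
    # pairs: list of (weight, gamble), with the weights summing to 1
    out = {}
    for lam, g in pairs:
        for b, p in g.items():
            out[b] = out.get(b, Fr(0)) + lam * p
    return out

m = mix([(Fr(1, 4), coin_flip), (Fr(3, 4), sure_win)])
print(m["win"], m["lose"])   # 7/8 1/8
```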

With these definitions, it turns out that Von Neumann and Morgenstern's independence condition holds, and, using axiom P6, Savage shows that the continuity (i.e., Archimedean) condition holds as well. Hence there is indeed a utility function, a function U:F→R such that for any two finite gambles represented by f and g respectively, f≤g if and only if the expected utility of the first gamble is less than or equal to that of the second. Furthermore, any two such utility functions are related via an increasing affine transformation.
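Continuing the toy encoding of gambles, the expected-utility comparison and its invariance under increasing affine transformations of U look like this (the utilities are again invented):

```python
from fractions import Fraction as Fr

util = {"lose": Fr(0), "draw": Fr(1), "win": Fr(3)}   # hypothetical U

def eu(gamble, u):
    return sum(p * u[b] for b, p in gamble.items())

f = {"lose": Fr(1, 2), "win": Fr(1, 2)}   # fair coin between best and worst
g = {"draw": Fr(1)}                       # the sure draw

print(eu(f, util), eu(g, util))           # 3/2 1: the agent prefers f

# An increasing affine transform a*U + c (a > 0) induces the same ordering.
util2 = {b: 5 * v + 7 for b, v in util.items()}
print((eu(f, util) <= eu(g, util)) == (eu(f, util2) <= eu(g, util2)))   # True
```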

We can also take expected value knowing that a given event C obtains, since we have numerical probability; and indeed this agrees with the preference ordering on gambles given C.

Expected utility in general and boundedness of utility

Finally, Savage shows that if we assume one more axiom, P7, then for any essentially bounded actions f and g, we have f≤g if and only if the expected utility of f is at most that of g. (It is possible to define integration with respect to a finitely additive measure similarly to how one does with respect to a countably additive measure; the result is linear and monotonic but doesn't satisfy the usual convergence properties.) Similarly with respect to a given event C.

The axiom P7 is:

P7. If f and g are acts and B is an event such that f≤g(s) given B for every s∈B, then f≤g given B. Similarly, if f(s)≤g given B for every s∈B, then f≤g given B.

So this is just another variant on the "sure-thing principle" that I earlier labeled P2c.

Now, in fact, it turns out as mentioned above that P7, taken together with the rest, implies that utility is bounded, and hence that we do indeed have, for any f and g, that f≤g if and only if the expected utility of f is at most that of g! This result is due to Peter Fishburn and postdates the first edition of Foundations of Statistics; in it, Savage simply notes that it would be nice if this worked for f and g not necessarily essentially bounded (so long as their expected values exist, allowing them to be ±∞), but that he can't prove it, and then adds a footnote giving a reference for bounded utility. (He does prove, using P7, that if you have two acts f and g such that f,g≤b for all consequences b, then f≡g; similarly if f,g≥b for all b. Actually, this is a key lemma in proving that utility is bounded; Fishburn's proof works by showing that if utility were unbounded, you could construct two actions that contradict this.)

Of course, if you really don't like the conclusion that utility is bounded, you could throw out axiom P7! It's pretty intuitive, but it's not clear that ignoring it could actually get you Dutch-booked. After all, the first six axioms are enough to handle finite gambles; P7 is only needed for more general situations. So long as your Dutch bookie is limited to finite gambles, you don't need it.

Questions on further justification

So now that I've laid all this out, here's the question I originally meant to ask: To what extent can these axioms be grounded in more basic principles, e.g. Dutch book arguments? It seems to me that most of them are too basic for that to apply; Dutch book arguments need more machinery already working in the background. Still, it seems to me that axioms P2, P3, and P4 might plausibly be grounded this way, though I have not yet attempted to figure out how. P7 presumably can't, for the reasons noted in the previous section. P1, I assume, is too basic. P5 obviously can't (if the agent doesn't care about anything, that's its own problem).

P6 is an Archimedean condition. Typically I've seen those (specifically Von Neumann and Morgenstern's continuity condition) justified on this site with the idea that infinitesimals will never be relevant in any practical situation: if c has only infinitesimally more utility than b, the only case where the distinction would be relevant is if the probabilities of accomplishing them were exactly equal, which is not realistic. I'm guessing infinitesimal probabilities can probably be done away with in a similar manner?

Or are these not good axioms in the first place? You all are more familiar with these sorts of things than I am. Ideas?