P: 0 ≤ P ≤ 1

Part of The Contrarian Sequences.

Reply to “Infinite Certainty” and “0 and 1 Are Not Probabilities”.

Introduction

In “Infinite Certainty”, Eliezer makes the argument that you can never be absolutely sure of a proposition. I have disagreed with that argument for a long time, but due to akrasia and acedia I never got around to writing up my objection. I think I have a more coherent counterargument now, and I present it below. Because the post I am replying to and “Infinite Certainty” are linked, I address both of them in this post.

This doesn’t mean, though, that I have absolute confidence that 2 + 2 = 4. See the previous discussion on how to convince me that 2 + 2 = 3, which could be done using much the same sort of evidence that convinced me that 2 + 2 = 4 in the first place. I could have hallucinated all that previous evidence, or I could be misremembering it. In the annals of neurology there are stranger brain dysfunctions than this.

This is true. That a statement is true does not mean that you have absolute confidence in the veracity of the statement. It is possible that you hallucinated everything.

Suppose you say that you’re 99.99% confident that 2 + 2 = 4. Then you have just asserted that you could make 10,000 independent statements, in which you repose equal confidence, and be wrong, on average, around once.

I am not so sure of this. If I have X% confidence in a belief, and I am well calibrated, then if there were K statements in which I said I have X% confidence, you would expect ((100-X)/100)*K of those statements to be wrong and the remainder to be right. It does not follow that, because I have X% confidence in a belief, I can actually produce K statements in which I repose equal confidence and be wrong only ((100-X)/100)*K times.

The implication runs one way: X% confidence implies that if you made K such statements, then ((100-X)/100)*K of those statements would be wrong.

A well calibrated agent does not have to be able to actually produce K such statements, with only ((100-X)/100)*K of them wrong, in order to possess X% confidence in the proposition. X% confidence only indicates that, in a hypothetical world in which the agent did make K such statements, and was well calibrated, only ((100-X)/100)*K of those statements would be wrong. To assert that a well calibrated agent must be able to make those statements before they can have X% confidence is to treat the hypothetical as a given fact: either an honest mistake, or deliberate malice.
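As an illustration of the hypothetical nature of the claim (this simulation is my own sketch, not something from either post; the function name and the numbers are purely illustrative), consider a well calibrated agent whose statements, asserted at a given confidence, are each true with exactly that probability:

import random

def average_wrong(confidence, k, trials=1000):
    # Simulate `trials` hypothetical batches of k statements, each true with
    # probability `confidence`, and return the average number of wrong
    # statements per batch.
    total_wrong = 0
    for _ in range(trials):
        total_wrong += sum(1 for _ in range(k) if random.random() > confidence)
    return total_wrong / trials

# Well calibrated 99.99% confidence means a hypothetical batch of 10,000 such
# statements would contain about one error on average; it does not require the
# agent to actually be able to produce such a batch.
print(average_wrong(0.9999, 10_000))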

As for the notion that you could get up to 100% confidence in a mathematical proposition—well, really now! If you say 99.9999% confidence, you’re implying that you could make one million equally fraught statements, one after the other, and be wrong, on average, about once. That’s around a solid year’s worth of talking, if you can make one assertion every 20 seconds and you talk for 16 hours a day.

Assert 99.9999999999% confidence, and you’re taking it up to a trillion. Now you’re going to talk for a hundred human lifetimes, and not be wrong even once?

Assert a confidence of (1—1/​googolplex) and your ego far exceeds that of mental patients who think they’re God.

And a googolplex is a lot smaller than even relatively small inconceivably huge numbers like 3^^^3.


All based on the same flawed premise, and equally flawed.

I am Infinitely Certain

There is one proposition that I would start with and assign a probability of 1. Not 1 − 1/googolplex, not 1 − 1/3^^^^3, not 1 − epsilon (where epsilon is an arbitrarily small number), but a probability of exactly 1.

I exist.

René Descartes presents a wonderful argument for the veracity of this statement:

Accordingly, seeing that our senses sometimes deceive us, I was willing to suppose that there existed nothing really such as they presented to us; And because some men err in reasoning, and fall into Paralogisms, even on the simplest matters of Geometry, I, convinced that I was as open to error as any other, rejected as false all the reasonings I had hitherto taken for Demonstrations; And finally, when I considered that the very same thoughts (presentations) which we experience when awake may also be experienced when we are asleep, while there is at that time not one of them true, I supposed that all the objects (presentations) that had ever entered into my mind when awake, had in them no more truth than the illusions of my dreams. But immediately upon this I observed that, whilst I thus wished to think that all was false, it was absolutely necessary that I, who thus thought, should be something; And as I observed that this truth, I think, therefore I am, was so certain and of such evidence that no ground of doubt, however extravagant, could be alleged by the Sceptics capable of shaking it, I concluded that I might, without scruple, accept it as the first principle of the philosophy of which I was in search.

Eliezer quotes Rafal Smigrodzki:

“I would say you should be able to assign a less than 1 certainty level to the mathematical concepts which are necessary to derive Bayes’ rule itself, and still practically use it. I am not totally sure I have to be always unsure. Maybe I could be legitimately sure about something. But once I assign a probability of 1 to a proposition, I can never undo it. No matter what I see or learn, I have to reject everything that disagrees with the axiom. I don’t like the idea of not being able to change my mind, ever.”

I am alright with accepting as an axiom that I exist. I see no reason why I should be cautious of assigning a probability of 1 to this statement. I am infinitely certain that I exist.


If you accept Descartes’ argument, then this is very important. You are accepting not just that we can be infinitely certain about a proposition, but that it is sensible to be infinitely certain about one. Usually, only one counterexample is necessary, but there are several other statements to which you may assign a probability of 1.

I believe that I exist.

I believe that I believe that I exist.

I believe that I believe that I believe that I exist.

And so on and so forth, ad infinitum. An infinite chain of statements, all of which are exactly true. I have satisfied Eliezer’s (fatuous) requirements for assigning a certain level of confidence to a proposition. If you feel that it is not sensible to assign a probability of 1 to the first statement, then consider this argument. I assign a probability of 1 to the proposition “I exist”. This means that the proposition “I exist” exists (pun intended) in my mental map of the world, and is therefore a belief of mine. By deduction, if I assign a probability of 1 to the statement “I exist”, then I must assign a probability of 1 to the proposition “I believe that I exist”. By induction, I must assign a probability of 1 to every statement in the infinite chain, and all of them are true.

(I assign a probability of 1 to deduction being true).

Generally, using recursion, we can pick any statement to which we assign a probability of 1 and generate infinitely many more statements to which we (by deduction) also assign a probability of 1.

Let X be a proposition to which we assign a probability of 1.

def f(x):
    # Given a proposition x to which we assign a probability of 1, generate the
    # infinite chain of nested belief statements; by the deduction argument
    # above, each of them is also assigned a probability of 1.
    statement = x
    while True:
        statement = "I believe that " + statement
        yield statement

Iterating over f(X), for any X to which we assign a probability of 1, yields an infinite number of statements to which we also assign a probability of 1.
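For instance, taking only the first three elements of the infinite sequence (a usage sketch; itertools.islice is used purely to truncate the output for display):

from itertools import islice

for statement in islice(f("I exist"), 3):
    print(statement)

# I believe that I exist
# I believe that I believe that I exist
# I believe that I believe that I believe that I exist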

While I’m at it, I can show that there are uncountably many such statements with a probability of 1.

Let S be the list of all propositions produced by f(X) (for some X to which we assigned a probability of 1).

import random

def g(S):
    # Conjoin a randomly chosen, randomly sized batch of propositions from S
    # into a single new proposition, removing them from S as the original
    # sketch's delete() did. Distinct subsets of S yield distinct conjunctions,
    # and (by deduction) a conjunction of probability-1 propositions is also
    # assigned a probability of 1.
    k = random.randrange(1, len(S) + 1)
    chosen = [S.pop(random.randrange(len(S))) for _ in range(k)]
    statement = "I believe " + " and ".join(chosen)
    print(statement)
    # The result can itself be fed back into f to spawn another infinite chain.
    return statement

Assuming #S = aleph-null, there are 2^#S possible conjunctions, and each of them can be fed back into f to generate an infinite sequence of true propositions. By Cantor’s diagonal argument, the number of propositions to which we assign a probability of 1 is uncountable. For each of those propositions, we assign a probability of 0 to its negation. That is, if you accept Descartes’ argument, or accept any single proposition as having a probability of 1 (or 0), then you accept uncountably many propositions as having a probability of 1 (or 0). Either we can never be certain of any proposition, ever, or we can be certain of uncountably many propositions (you can also use the outlined method to construct K statements with arbitrary accuracy).
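Spelled out as the standard cardinality step (my own gloss; each subset of S corresponds to one possible conjunction):

\[
|S| = \aleph_0
\quad\Longrightarrow\quad
\left| \mathcal{P}(S) \right| = 2^{\aleph_0} > \aleph_0
\]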

Personally, I see no problem with accepting “I exist” (and deduction) as having a probability of 1.

When you work in log odds, the distance between any two degrees of uncertainty equals the amount of evidence you would need to go from one to the other. That is, the log odds gives us a natural measure of spacing among degrees of confidence.

Using the log odds exposes the fact that reaching infinite certainty requires infinitely strong evidence, just as infinite absurdity requires infinitely strong counterevidence.

This ignores the fact that you can assign priors of 0 and 1. In fact, it is for this very reason that I argue that 0 and 1 are probabilities: Eliezer is right that we can never update up to 1 (or down to 0) from a prior strictly between them, but we can (and, I argue, sometimes should) start with priors of 0 and 1.
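A minimal sketch of both halves of that claim, using the odds form of Bayes’ rule (the function and numbers are my own illustration, not from either post):

def update(prior, likelihood_ratio):
    # Odds-form Bayesian update: posterior odds = prior odds * likelihood ratio.
    # Priors of exactly 0 or 1 are fixed points: no finite amount of evidence
    # can move them, which is why they can only ever be priors.
    if prior == 0.0:
        return 0.0
    if prior == 1.0:
        return 1.0
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)

print(update(0.5, 1e12))   # overwhelming evidence: very close to 1, never exactly 1
print(update(0.0, 1e12))   # a prior of 0 stays at 0
print(update(1.0, 1e-12))  # a prior of 1 stays at 1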

0 and 1 as priors

Consider Pascal’s Mugging. Pascal’s Mugging is a breaker (“breaker” is a name I coined for decision problems which break decision theories). Let us reconceive the problem so that the person doing the mugging is me.

I walk up to Eliezer and tell him to pay me $10,000 or I will inflict infinite negative utility on him.

Now, I cannot (as a matter of fundamental physical law) inflict infinite negative utility on Eliezer. However, if Eliezer is rational (maximising his expected utility), then Eliezer must pay me the money. No matter how much money I demand from Eliezer, Eliezer must pay me, because Eliezer does not assign a probability of 0 to me carrying out my threat, and no matter how small the probability is, as long as it’s not 0, paying me the ransom I demanded is the choice which maximises expected utility.

(If you claim that it is impossible for me to grant you infinite negative utility, that infinite negative utility is incoherent, or that “infinite negative utility” is a category error, then you are assigning a probability of 0 to the existence of infinite negative utility. Implicitly, because P(A) >= P(A and B), where A is “infinite negative utility exists” and B is “I grant you infinite negative utility”, you are also assigning a probability of 0 to me granting you infinite negative utility.)

I have no problem with decision problems which break decision theories, but when a problem breaks the very formulation of rationality itself, then I’m pissed. There is a trivial solution to Pascal’s Mugging within classical decision theory: accept the objective definition of probability; once you do so, the probability of me carrying out my threat becomes zero and the problem disappears. Only the insistence on clinging to an (unfounded) subjective probability that forbids 0 and 1 as probabilities leads to this mess.

If anything, Pascal’s Mugging should be definitive proof that 0 and 1 are perfectly legitimate priors (if you accept a prior of 0 that I will grant you infinite negative utility, then trivially, you accept a prior of 1 that I will not). Pascal’s Mugging only “breaks” expected utility theory if you forbid priors of 0 and 1, an inane commandment.
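To make this concrete, here is a sketch of the expected utility comparison (the $10,000 ransom is from the scenario above; the probabilities are made up, and I adopt the convention that 0 times infinity is 0, which Python’s float arithmetic does not follow):

def eu_of_refusing(p_threat):
    # Refusing risks the threatened infinite negative utility.
    if p_threat == 0.0:
        return 0.0  # convention: 0 * infinity = 0
    return p_threat * float("-inf")

eu_of_paying = -10_000  # the ransom is a finite loss

for p in (0.1, 1e-100, 1e-300, 0.0):
    decision = "pay" if eu_of_refusing(p) < eu_of_paying else "refuse"
    print(f"P(threat is real) = {p}: {decision}")

# Any prior strictly greater than 0 makes paying the expected-utility-maximising
# choice; only a prior of exactly 0 dissolves the mugging.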

I’ll expand more on breakers, rationality, etc. in an upcoming paper (several tens of pages long).

Conclusion

So I propose that it makes sense to say that 1 and 0 are not in the probabilities; just as negative and positive infinity, which do not obey the field axioms, are not in the real numbers.

The main reason this would upset probability theorists is that we would need to rederive theorems previously obtained by assuming that we can marginalize over a joint probability by adding up all the pieces and having them sum to 1.

However, in the real world, when you roll a die, it doesn’t literally have infinite certainty of coming up some number between 1 and 6. The die might land on its edge; or get struck by a meteor; or the Dark Lords of the Matrix might reach in and write “37” on one side.

If you made a magical symbol to stand for “all possibilities I haven’t considered”, then you could marginalize over the events including this magical symbol, and arrive at a magical symbol “T” that stands for infinite certainty.

But I would rather ask whether there’s some way to derive a theorem without using magic symbols with special behaviors. That would be more elegant. Just as there are mathematicians who refuse to believe in double negation or infinite sets, I would like to be a probability theorist who doesn’t believe in absolute certainty.

Eliezer presents a shaky basis for rejecting 0 and 1 as probabilities. His model leads to absurd conclusions (which amounts to a proof by contradiction that 0 and 1 are indeed probabilities), he offers no benefit for rejecting the standard model in favour of his (only multiple demerits), and he does not formalise an alternative model of probability that is free of these absurdities and offers more benefits than the standard model.

“0 and 1 Are Not Probabilities” is a solution in search of a problem.

Epistemic Hygiene

This article may have come across as overly vicious and confrontational; I adopted that attitude to counteract the halo-effect bias in my perception of the original article.