Raven paradox settled to my satisfaction

The raven paradox, originated by Carl Gustav Hempel, is an apparent absurdity of inductive reasoning. Consider the hypothesis:

H1: All ravens are black.

Inductively, one might expect that seeing many black ravens and no non-black ones is evidence for this hypothesis. As you see more black ravens, you may even find it more and more likely.

Logically, a statement is equivalent to its contrapositive (where you negate both things and flip the order). Thus if “if it is a raven, it is black” is true, so is:

H1′: If it is not black, it is not a raven.

Take a moment to double-check this.
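One way to double-check it is to brute-force the truth table. The sketch below assumes "if P then Q" is read as material implication; the helper name `implies` is made up for illustration.

```python
from itertools import product


def implies(p, q):
    # Material implication: false only when p is true and q is false.
    return (not p) or q


# Compare H1 ("raven implies black") with H1' ("not black implies not raven")
# for every combination of the two properties.
contrapositive_equivalent = all(
    implies(is_raven, is_black) == implies(not is_black, not is_raven)
    for is_raven, is_black in product([True, False], repeat=2)
)

print(contrapositive_equivalent)
```

The two statements agree in all four cases, so any object that is consistent with one is consistent with the other.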

Inductively, just like with H1, one would expect that seeing many non-black non-ravens is evidence for this hypothesis. As you see more and more examples, you may even find it more and more likely. Thus a yellow banana is evidence for the hypothesis “all ravens are black.”

Since this is silly, there is an apparent problem with induction.


Consider the following two possible states of the world:

Suppose that these are your two hypotheses, and you observe a yellow banana (drawing from some fixed distribution over things). Q: What does this tell you about one hypothesis versus another? A: It tells you bananas-all about the number of black ravens.

One might contrast this with a third hypothesis in which there is one fewer banana and one more yellow raven, by some sort of spontaneous generation.

Now observations of both black ravens and yellow bananas cause us to prefer hypothesis 1 over hypothesis 3!

The moral of the story is that the amount of evidence an observation provides is not just about whether it is consistent with the “active” hypothesis. It is about the difference in likelihood between when the hypothesis is true and when it is false.
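To make the moral concrete, here is a minimal sketch that compares likelihoods across three toy worlds. The counts are illustrative assumptions chosen to match the descriptions in the text, not figures from the post. The names `WORLDS`, `likelihood`, and `likelihood_ratio` are likewise made up, and observations are assumed to be drawn uniformly from the objects in a world.

```python
from fractions import Fraction

# Hypothetical worlds (illustrative counts, assumed for this sketch).
# World 1: every raven is black.
# World 2: one black raven is replaced by a yellow raven.
# World 3: relative to world 2, one banana is replaced by a yellow raven.
WORLDS = {
    1: {"black raven": 100, "yellow raven": 0, "yellow banana": 100},
    2: {"black raven": 99, "yellow raven": 1, "yellow banana": 100},
    3: {"black raven": 99, "yellow raven": 2, "yellow banana": 99},
}


def likelihood(world, observation):
    # Probability of drawing this kind of object uniformly at random.
    counts = WORLDS[world]
    return Fraction(counts[observation], sum(counts.values()))


def likelihood_ratio(world_a, world_b, observation):
    # Values above 1 favor world_a; exactly 1 means no evidence either way.
    # Assumes the observation has nonzero probability in world_b.
    return likelihood(world_a, observation) / likelihood(world_b, observation)


for obs in ("black raven", "yellow banana"):
    print(obs, likelihood_ratio(1, 2, obs), likelihood_ratio(1, 3, obs))
```

Under these assumed counts, a yellow banana has a likelihood ratio of exactly 1 between worlds 1 and 2, so it carries no evidence. Between worlds 1 and 3, both a black raven and a yellow banana have a ratio slightly above 1, so both favor world 1.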

This is a pretty straightforward moral; it’s a widely known pillar of statistical reasoning. But its absence in the raven paradox takes a bit of effort to see. This is because we’re using an implicit model of the problem (driven by some combination of outside knowledge and framing effects) where nonblack ravens replace black ravens, but don’t replace bananas. The logical statements H1 and H1′ are not enough on their own to tell you how you should update upon seeing new evidence. Or to put it another way, the version of induction that drives the raven paradox is in fact wrong, but probability theory implies a bigger version.

(Technical note: In the hypotheses above, the exact number of yellow bananas does not have to be the same for observing a yellow banana to provide no evidence—what has to be the same is the measure of yellow bananas in the probability distribution we’re drawing from. Talking about “99 ravens” is more understandable, but what differentiates our hypotheses are really the likelihoods of observing different events [there’s our moral again]. This becomes particularly important when extending the argument to infinite numbers of ravens—infinities or no infinities, when you make an observation you’re still drawing from some distribution.)