An example demonstrating how to deduce Bayes’ Theorem

(This post is not an at­tempt to con­vey any­thing new, but is in­stead just an at­tempt to provide back­ground con­text on how Bayes’ the­o­rem works by de­scribing how it can be de­duced. This is not meant to be a for­mal proof. There have been other el­e­men­tary posts that have cov­ered how to use Bayes’ the­o­rem: here, here, here and here)

Consider the following example:

Imag­ine that your friend has a bowl that con­tains cook­ies in two va­ri­eties: choco­late chip and white chip macadamia nut. You think to your­self: “Yum. I would re­ally like a choco­late chip cookie”. So you reach for one, but be­fore you can pull one out your friend lets you know that you can only pick one, that you can­not look into the bowl and that all the cook­ies are ei­ther fresh or stale. Your friend also tells you that there are 80 fresh cook­ies, 40 choco­late chip cook­ies, 15 stale white chip macadamia nut cook­ies and 100 cook­ies in to­tal. What is the prob­a­bil­ity that you will pull out a fresh choco­late chip cookie?

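To figure this out we will create a table of counts, with freshness as the rows and cookie variety as the columns. If we fill in the values that we do know, then we end up with the table below. The cell that we want to find the value of is marked with a question mark.

|       | Chocolate Chip | White Chip Macadamia Nut | Total |
|-------|----------------|--------------------------|-------|
| Fresh | ?              |                          | 80    |
| Stale |                | 15                       |       |
| Total | 40             |                          | 100   |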

If we look at the above table we can notice that, like in Sudoku, there are some values that we can fill in based on the information that we already know. These values are:

  • The num­ber of stale cook­ies. We know that 80 cook­ies are fresh and that there are 100 cook­ies in to­tal, so this means that there must be 20 stale cook­ies.

  • The number of white chip macadamia nut cookies. We know that there are 40 chocolate chip cookies and 100 cookies in total, so this means that there must be 60 white chip macadamia nut cookies.

If we fill in both these val­ues we end up with the be­low table:

|       | Chocolate Chip | White Chip Macadamia Nut | Total |
|-------|----------------|--------------------------|-------|
| Fresh |                |                          | 80    |
| Stale |                | 15                       | 20    |
| Total | 40             | 60                       | 100   |

If we look at the table now, we can see that there are two more values that can be filled in. These values are:

  • The num­ber of fresh white chip macadamia nut cook­ies. We know that there are 60 white chip macadamia nut cook­ies and that 15 of these are stale, so this means that there must be 45 fresh white chip macadamia nut cook­ies.

  • The num­ber of stale choco­late chip cook­ies. We know that there are 20 stale cook­ies and that 15 of these are white chip macadamia nut, so this means that there must be 5 stale choco­late chip cook­ies.

If we fill in both these val­ues we end up with the be­low table:

|       | Chocolate Chip | White Chip Macadamia Nut | Total |
|-------|----------------|--------------------------|-------|
| Fresh |                | 45                       | 80    |
| Stale | 5              | 15                       | 20    |
| Total | 40             | 60                       | 100   |

We can now find out the num­ber of fresh choco­late chip cook­ies. It is im­por­tant to note that there are two ways in which we can do this. Th­ese two ways are called the in­verse of each other (this will be used later):

  • Us­ing the filled in row val­ues. We know that there are 80 fresh cook­ies and that 45 of these are white chip macadamia nut, so this means that there must be 35 fresh choco­late chip cook­ies.

  • Using the filled in column values. We know that there are 40 chocolate chip cookies and that 5 of these are stale, so this means that there must be 35 fresh chocolate chip cookies.

If we fill in the last value in the table we end up with the be­low table:

|       | Chocolate Chip | White Chip Macadamia Nut | Total |
|-------|----------------|--------------------------|-------|
| Fresh | 35             | 45                       | 80    |
| Stale | 5              | 15                       | 20    |
| Total | 40             | 60                       | 100   |

We can now find out the probability of choosing a fresh chocolate chip cookie by dividing the number of fresh chocolate chip cookies (35) by the total number of cookies (100). This is 35 / 100, which is 35%.
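The table-filling steps above can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the variable names are made up for this example.

```python
# Known values from the friend's description (counts of cookies).
total = 100
fresh = 80
chocolate = 40
stale_white = 15

# Fill in the remaining cells in the same order as in the text.
stale = total - fresh                  # stale cookies
white = total - chocolate              # white chip macadamia nut cookies
fresh_white = white - stale_white      # fresh white chip macadamia nut
stale_chocolate = stale - stale_white  # stale chocolate chip

# Two routes to the last cell: along the Fresh row, or down the
# Chocolate Chip column. Both must agree.
fresh_chocolate_by_row = fresh - fresh_white
fresh_chocolate_by_column = chocolate - stale_chocolate

p_fresh_chocolate = fresh_chocolate_by_row / total
print(fresh_chocolate_by_row, fresh_chocolate_by_column, p_fresh_chocolate)
```

Both routes give 35 cookies, so the probability is 0.35.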

To get to Bayes' Theorem we first need some shorthand notation for the probabilities involved:

  • P(A) = the probability of finding some observation A. You can think of this as the probability of the picked cookie being chocolate chip.

  • P(B) = the probability of finding some observation B. You can think of this as the probability of the picked cookie being fresh. Please note that A is what we want to find given B. If desired, A could instead be fresh and B chocolate chip.

  • P(~A) = the probability of not finding observation A. You can think of this as the probability of the picked cookie not being chocolate chip, i.e. being white chip macadamia nut instead.

  • P(~B) = the probability of not finding observation B. You can think of this as the probability of the picked cookie not being fresh, i.e. being stale instead.

  • P(A∩B) = the probability of finding both A and B. You can think of this as the probability of the picked cookie being fresh and chocolate chip.
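To make the notation concrete, here is a small sketch that computes each of these quantities from the filled-in table. The `counts` dictionary and the `prob` helper are names invented for this illustration; exact fractions are used so that no rounding gets in the way.

```python
from fractions import Fraction

# Cell counts from the filled-in table, keyed by (freshness, variety).
counts = {
    ("fresh", "chocolate"): 35,
    ("fresh", "white"): 45,
    ("stale", "chocolate"): 5,
    ("stale", "white"): 15,
}
total = sum(counts.values())


def prob(predicate):
    """Probability that a randomly picked cookie satisfies the predicate."""
    matching = sum(n for cell, n in counts.items() if predicate(*cell))
    return Fraction(matching, total)


p_a = prob(lambda freshness, variety: variety == "chocolate")      # P(A)
p_b = prob(lambda freshness, variety: freshness == "fresh")        # P(B)
p_not_a = prob(lambda freshness, variety: variety != "chocolate")  # P(~A)
p_not_b = prob(lambda freshness, variety: freshness != "fresh")    # P(~B)
p_a_and_b = prob(
    lambda freshness, variety: freshness == "fresh" and variety == "chocolate"
)                                                                  # P(A∩B)

print(p_a, p_b, p_not_a, p_not_b, p_a_and_b)
```

Note that P(A) + P(~A) and P(B) + P(~B) both come to 1, as they should.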

Now things get a bit more complicated as we move into the basis of Bayes' Theorem. Let's go through another example based on the original.

Let’s as­sume that be­fore you pull out a cookie you no­tice that it is fresh. Can you then figure out the like­li­hood of it be­ing choco­late chip be­fore you pull it out? The an­swer is yes.

We will find this out using the table that we filled in previously. The important row is the Fresh row.

|       | Chocolate Chip | White Chip Macadamia Nut | Total |
|-------|----------------|--------------------------|-------|
| Fresh | 35             | 45                       | 80    |
| Stale | 5              | 15                       | 20    |
| Total | 40             | 60                       | 100   |

Since we already know that the cookie is fresh, we can say that the like­li­hood of it be­ing a choco­late chip cookie is equal to the num­ber of fresh choco­late chip cook­ies (35) di­vided by the to­tal num­ber of fresh cook­ies (80). This is 35 /​ 80 which is 43.75%.

In a sim­pler form this is:

  • P(A|B) - The prob­a­bil­ity of A given B. You can think of this as the prob­a­bil­ity of the picked cookie be­ing choco­late chip if you already know that it is fresh.

If we look at the table again we can see some extra important information about P(A|B): it is equal to P(A∩B) / P(B). You can think of this as saying that the probability of the picked cookie being chocolate chip if you know that it is fresh (35 / 80) is equal to the probability of the picked cookie being fresh and chocolate chip (35 / 100) divided by the probability of it being fresh (80 / 100). This is P(A|B) = (35 / 100) / (80 / 100), which becomes 0.35 / 0.8, which is the same as the answer we found above (43.75%). Take note of the fact that P(A|B) = P(A∩B) / P(B) as we will use this later.
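As a quick check, the two routes to P(A|B) can be compared directly. This is a minimal sketch using the counts from the table; the variable names are illustrative.

```python
from fractions import Fraction

fresh_chocolate, fresh, total = 35, 80, 100

# Direct route: restrict attention to the Fresh row of the table.
p_a_given_b_direct = Fraction(fresh_chocolate, fresh)

# Formula route: P(A|B) = P(A∩B) / P(B).
p_a_and_b = Fraction(fresh_chocolate, total)
p_b = Fraction(fresh, total)
p_a_given_b_formula = p_a_and_b / p_b

print(p_a_given_b_direct, p_a_given_b_formula, float(p_a_given_b_formula))
```

Both routes give 7/16, which is 0.4375 or 43.75%.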

Let's now return to the inverse idea that was raised previously. If we want to know the probability of the picked cookie being fresh and chocolate chip, i.e. P(A∩B), then we can use either the Fresh row or the Chocolate Chip column of the filled-in table.

|       | Chocolate Chip | White Chip Macadamia Nut | Total |
|-------|----------------|--------------------------|-------|
| Fresh | 35             | 45                       | 80    |
| Stale | 5              | 15                       | 20    |
| Total | 40             | 60                       | 100   |

If we know that the cookie is fresh, like in the top row above, then we can find out that P(A∩B) = P(A|B) * P(B). This means that the probability of the picked cookie being fresh and chocolate chip (35 / 100, remembering that there were 100 cookies in total) is equal to the probability of it being chocolate chip given that you know that it is fresh (35 / 80) times the probability of it being fresh (80 / 100). So, we end up with P(A∩B) = (35 / 80) * (80 / 100), which becomes 35 / 100 or 35%, which we know is the right answer.

Alternatively, since we know that P(A|B) = P(A∩B) / P(B) (we found this out previously), we can check that P(A∩B) = P(A|B) * P(B) without looking at the table. We can do this by using the following method:

  1. Start with P(A∩B) = P(A|B) * P(B)

  2. Convert P(A|B) to P(A∩B) / P(B) so we get P(A∩B) = (P(A∩B) * P(B)) / P(B).

  3. Notice that P(B) is on both the top and bottom of the fraction, which means that it can be crossed out.

  4. Cross out P(B) to give you P(A∩B) = P(A∩B), which is always true, so the formula we started with holds.

The inverse situation is when you know that the cookie is chocolate chip, like in the left column in the table above. Using the left column we can find out that P(A∩B) = P(B|A) * P(A). This means that the probability of the picked cookie being fresh and chocolate chip (35 / 100) is equal to the probability of it being fresh given that you know it is chocolate chip (35 / 40) times the probability of it being chocolate chip (40 / 100). This is P(A∩B) = (35 / 40) * (40 / 100), which becomes 35%, which we know is the right answer.
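The row route and the column route can be checked side by side. The sketch below just multiplies the fractions quoted in the text; `via_row` and `via_column` are names chosen for this example.

```python
from fractions import Fraction

p_a_and_b = Fraction(35, 100)  # fresh and chocolate chip, read off the table

# Row route: P(A|B) * P(B), starting from "the cookie is fresh".
via_row = Fraction(35, 80) * Fraction(80, 100)

# Column route: P(B|A) * P(A), starting from "the cookie is chocolate chip".
via_column = Fraction(35, 40) * Fraction(40, 100)

print(via_row == via_column == p_a_and_b)
```

Both products reduce to the same value, which is what allows the two expressions to be set equal in the next step.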

Now, we have enough in­for­ma­tion to de­duce the sim­ple form of Bayes’ The­o­rem.

Let’s first re­count what we know:

  1. P(A|B) = P(A∩B) /​ P(B)

  2. P(A∩B) = P(B|A) * P(A)

By tak­ing the first fact: P(A|B) = P(A∩B) /​ P(B) and us­ing the sec­ond fact to con­vert P(A∩B) to P(B|A) * P(A) you end up with P(A|B) = (P(B|A) * P(A)) /​ P(B) which is Bayes’ The­o­rem in its sim­ple form.
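Plugging the cookie numbers into the simple form gives the same 43.75% that we read off the Fresh row earlier. A minimal sketch, with illustrative variable names:

```python
from fractions import Fraction

p_a = Fraction(40, 100)         # P(A): cookie is chocolate chip
p_b = Fraction(80, 100)         # P(B): cookie is fresh
p_b_given_a = Fraction(35, 40)  # P(B|A): fresh, given chocolate chip

# Simple form of Bayes' Theorem: P(A|B) = P(B|A) * P(A) / P(B)
p_a_given_b = p_b_given_a * p_a / p_b

print(p_a_given_b, float(p_a_given_b))
```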

From the sim­ple form of Bayes’ The­o­rem, there is one more con­ver­sion that we need to make to de­rive the ex­plicit form of Bayes’ The­o­rem which is the one we are try­ing to de­duce.

To get to the ex­plicit form ver­sion we need to first find out that P(B) = P(A) * P(B|A) + P(~A) * P(B|~A).

To do this let’s re­fer to the table again:

|       | Chocolate Chip | White Chip Macadamia Nut | Total |
|-------|----------------|--------------------------|-------|
| Fresh | 35             | 45                       | 80    |
| Stale | 5              | 15                       | 20    |
| Total | 40             | 60                       | 100   |

We can see that the probability that the picked cookie is fresh (80 / 100) is equal to the probability that it is fresh and chocolate chip (35 / 100) plus the probability that it is fresh and white chip macadamia nut (45 / 100). So, we can find out that P(B) (the cookie is fresh) is equal to 35 / 100 + 45 / 100, which is 0.8 or 80%, which we know is the answer. This gives the formula: P(B) = P(A∩B) + P(~A∩B)

We know that P(A∩B) = P(B|A) * P(A) as we found this out earlier. Similarly we can find out that P(~A∩B) = P(~A) * P(B|~A). This means that the probability of the picked cookie being fresh and white chip macadamia nut (45 / 100) is equal to the probability of it being white chip macadamia nut (60 / 100) times the probability of it being fresh given that you know that it is white chip macadamia nut (45 / 60). This is (60 / 100) * (45 / 60), which is 45%, which we know is the answer.
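The decomposition of P(B) can be verified numerically. The sketch below builds P(B) from the two columns of the table; the variable names are illustrative.

```python
from fractions import Fraction

p_a = Fraction(40, 100)             # P(A): chocolate chip
p_not_a = Fraction(60, 100)         # P(~A): white chip macadamia nut
p_b_given_a = Fraction(35, 40)      # P(B|A): fresh, given chocolate chip
p_b_given_not_a = Fraction(45, 60)  # P(B|~A): fresh, given white chip

# P(B) = P(A) * P(B|A) + P(~A) * P(B|~A)
p_a_and_b = p_a * p_b_given_a
p_not_a_and_b = p_not_a * p_b_given_not_a
p_b = p_a_and_b + p_not_a_and_b

print(p_a_and_b, p_not_a_and_b, p_b)
```

The two terms are 7/20 (35%) and 9/20 (45%), and they add up to 4/5 (80%), matching the Fresh row total.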

Us­ing this in­for­ma­tion, we can now get to the ex­plicit form of Bayes’ The­o­rem:

  1. We know the sim­ple form of Bayes’ The­o­rem: P(A|B) = (P(B|A) * P(A)) /​ P(B)

  2. We can con­vert P(B) to P(A∩B) + P(~A∩B) to get P(A|B) = (P(B|A) * P(A)) /​ (P(A∩B) + P(~A∩B))

  3. We can con­vert P(A∩B) to P(A) * P(B|A) to get P(A|B) = (P(B|A) * P(A)) /​ (P(A) * P(B|A) + P(~A∩B))

  4. We can con­vert P(~A∩B) to P(~A) * P(B|~A) to get P(A|B) = (P(B|A) * P(A)) /​ (P(A) * P(B|A) + P(~A) * P(B|~A))

Congratulations, we have now reached the explicit form of Bayes' Theorem:

P(A|B) = (P(B|A) * P(A)) / (P(A) * P(B|A) + P(~A) * P(B|~A))
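For readers who like to see it run, here is one way the explicit form could be written as a small function. The function name `bayes_explicit` and its parameter names are made up for this sketch; it also assumes that P(~A) = 1 - P(A), which holds in the cookie example because every cookie is one of the two varieties.

```python
from fractions import Fraction


def bayes_explicit(p_a, p_b_given_a, p_b_given_not_a):
    """Return P(A|B) from P(A), P(B|A) and P(B|~A) using the explicit form."""
    p_not_a = 1 - p_a
    numerator = p_b_given_a * p_a
    denominator = p_a * p_b_given_a + p_not_a * p_b_given_not_a
    if denominator == 0:
        # P(B) is zero, so conditioning on B makes no sense.
        raise ValueError("P(B) is zero, so P(A|B) is undefined")
    return numerator / denominator


# Cookie example: A = chocolate chip, B = fresh.
result = bayes_explicit(
    p_a=Fraction(40, 100),
    p_b_given_a=Fraction(35, 40),
    p_b_given_not_a=Fraction(45, 60),
)
print(result, float(result))
```

With the cookie numbers this gives 7/16, i.e. 43.75%, the same answer as reading 35 / 80 straight off the Fresh row.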