Let’s test this on my favorite example: in a port city, most people who wear glasses are not librarians but sailors.
The share of librarians in town multiplied by the share of librarians who wear glasses equals the share of people in glasses in town multiplied by the share of people in glasses who are librarians.
Below is the text from Opus 4.5, which I asked to explain your post using the librarian example:
Here’s the translation:
Step 1: Multiplication rule for joint events
$$P(L \text{ and } G) = P(L) \cdot P(G|L)$$
This is the probability of meeting a person who is both a librarian and wears glasses. We take the proportion of librarians in the city and multiply by the proportion of them who wear glasses.
$$P(G \text{ and } L) = P(G) \cdot P(L|G)$$
The same event, but “from the other side”: we take the proportion of all people wearing glasses and multiply by the proportion of librarians among them.
Step 2: This is the same event, therefore
$$P(L) \cdot P(G|L) = P(G) \cdot P(L|G)$$
Step 3: Divide by P(G)
$$P(L|G) = \frac{P(G|L) \cdot P(L)}{P(G)}$$
Substituting our numbers:
$$P(L|G) = \frac{0.50 \cdot 0.001}{P(G)}$$
Where $P(G)$ is the overall proportion of people wearing glasses in the city (summed across all population groups).
This is why even a high probability $P(G|L) = 0.50$ doesn’t help: it gets multiplied by the tiny $P(L) = 0.001$, and as a result $P(L|G)$ ends up being small.
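To put a number on "small", here is a minimal sketch that fills in $P(G)$ with the law of total probability. Only $P(L) = 0.001$ and $P(G|L) = 0.50$ come from the example; the 10% glasses rate among non-librarians and the function name `posterior_librarian` are assumptions made purely for illustration.

```python
def posterior_librarian(p_l, p_g_given_l, p_g_given_not_l):
    """Return (P(G), P(L|G)) for a city split into librarians and everyone else."""
    # Law of total probability: glasses wearers among librarians plus among the rest.
    p_g = p_l * p_g_given_l + (1 - p_l) * p_g_given_not_l
    # Bayes' theorem: P(L|G) = P(G|L) * P(L) / P(G).
    return p_g, p_l * p_g_given_l / p_g


# 0.001 and 0.50 are from the example; 0.10 is a hypothetical rate for non-librarians.
p_g, p_l_given_g = posterior_librarian(0.001, 0.50, 0.10)
print(f"P(G) = {p_g:.4f}, P(L|G) = {p_l_given_g:.4f}")
```

Under these assumed numbers, roughly one glasses wearer in two hundred is a librarian, even though half of all librarians wear glasses.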
I think the example has more to do with applying the theorem, while the post is about deriving it, so I’m not sure the post helps with the example directly.
The example can be used to illustrate (and prove) the theorem geometrically. The whole city population can be represented as a square. The X axis shows how many people in the city wear glasses; the Y axis shows how many people belong to each profession. Below is an AI-generated image based on my prompt:
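Separately from the image, the same square can be sketched in numbers. Each profession is a horizontal strip whose height is its share of the city, and the glasses wearers fill a part of that strip whose width is the glasses rate in that profession. The librarian figures come from the example above; the sailor and "other" figures are hypothetical, chosen only so that sailors dominate among glasses wearers.

```python
# Unit square: strip height = share of the city, shaded width = share wearing glasses.
strips = {
    "librarian": (0.001, 0.50),  # from the example
    "sailor": (0.300, 0.20),     # hypothetical
    "other": (0.699, 0.05),      # hypothetical
}

# Area of each shaded rectangle is P(profession) * P(G | profession) = P(profession and G).
areas = {name: height * width for name, (height, width) in strips.items()}

# Total shaded area is P(G); each rectangle's fraction of it is P(profession | G).
glasses_area = sum(areas.values())
shares = {name: area / glasses_area for name, area in areas.items()}

for name, share in shares.items():
    print(f"{name}: {share:.3f}")
```

Reading the shaded region by strips gives $P(L) \cdot P(G|L)$, and reading it as a fraction of all glasses wearers gives $P(G) \cdot P(L|G)$. It is the same rectangle either way, which is the whole content of the theorem.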