Your proof of Bayes’ Theorem assumes P(A and B)=P(A)⋅P(B∣A), but it’s not clear why someone who doubts Bayes would accept that.
That seems philosophically deeper than the goal of this post, which is to just help people who already accept the standard joint probability statement to remember Bayes’ Theorem. If I want to remember or write Bayes’ Theorem, I roughly just think “joint probability, divide.”
The intuition behind the definition there (it’s a definition of conditional probability; the only assumption is that it correctly captures the informal idea of conditional probability) is this: P(B) tells you how many worlds, or how much probability mass, sit in the blob where B is true, normalized so that a blob containing everything has size 1. P(A|B) means you take the weight of A but restrict yourself to worlds where B is true. So you look at the part of A that intersects B and measure it relative to B, so that if that part filled all of B it would have size 1. This is why you take P(A&B)/P(B).
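
To make the "blob of worlds" picture concrete, here is a minimal sketch in Python. The two-dice setup, the events A and B, and the helper names `prob` and `cond` are all made up for illustration; nothing here comes from the post itself. It counts worlds directly, computes P(A|B) as P(A&B)/P(B), and checks that "joint probability, divide" gives the same number.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy example: each "world" is one outcome of two fair
# six-sided dice, and every world carries equal probability mass.
worlds = list(product(range(1, 7), repeat=2))
mass = Fraction(1, len(worlds))

def prob(event):
    """Total probability mass of the worlds where `event` holds."""
    return sum(mass for w in worlds if event(w))

def cond(event, given):
    """P(event | given): the part of `event` inside `given`, rescaled by P(given)."""
    return prob(lambda w: event(w) and given(w)) / prob(given)

A = lambda w: w[0] + w[1] == 8   # assumed event: the dice sum to 8
B = lambda w: w[0] % 2 == 0      # assumed event: the first die is even

p_a_given_b = cond(A, B)

# "Joint probability, divide": P(A&B) = P(B|A) * P(A), then divide by P(B).
bayes_rhs = cond(B, A) * prob(A) / prob(B)

print(p_a_given_b, bayes_rhs)
```

Both printed values should agree, since the second is just the first with the joint probability P(A&B) rewritten as P(B|A)·P(A). Exact fractions are used so the comparison is not blurred by floating-point rounding.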