When the assumptions of Bayes’ Theorem hold, and when Bayesian updating can be performed computationally efficiently, then it is indeed tautological that Bayes is the optimal approach. Even when some of these assumptions fail, Bayes can still be a fruitful approach. However, by working under weaker (sometimes even adversarial) assumptions, frequentist approaches can perform well in very complicated domains even with fairly simple models; this is because, with fewer assumptions being made at the outset, less work has to be done to ensure that those assumptions are met.
I’ve only skimmed this for now (will read soon), but I wanted to point out that I completely agree with this conclusion (without reading the arguments in detail). However, I might frame it differently: both Bayesian statistics and frequentist statistics are useful only insofar as they approximate the true Bayesian epistemology. In other words, if you know the prior and know the likelihood, then performing the Bayesian update will give P(A|data), where A is any question you’re interested in. However, since we usually don’t know our prior or likelihood, there’s no guarantee that Bayesian statistics, which amounts to doing the Bayesian update on the wrong model of the actual structure of our uncertainty (i.e. our actual prior + likelihood), will closely approximate Bayesian epistemology. So, of course we should consider other methods that, while they superficially don’t look like the true Bayesian update, may do a better job of approximating the answer we want. Computational difficulty is a separate reason why we might have to approximate Bayesian epistemology even if we can write down the prior + likelihood, and that, once again, might entail using methods that don’t look “Bayesian” in any way.
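To make the "wrong model" point concrete, here is a minimal sketch in Python. Everything in it is an assumption chosen for illustration: the three candidate coin biases, the two priors, the data, and the helper names `posterior`, `coin_likelihood`, and `prob_heads_biased`. It only shows that the same update rule applied to a misspecified prior can give a noticeably different P(A|data); it is not meant as a general method.

```python
from fractions import Fraction

def posterior(prior, likelihood, data):
    """Bayes' rule over a finite hypothesis set: P(h | data) for each h."""
    unnormalized = {h: p * likelihood(h, data) for h, p in prior.items()}
    total = sum(unnormalized.values())
    return {h: w / total for h, w in unnormalized.items()}

def coin_likelihood(bias, data):
    """Probability of a specific sequence with the given head/tail counts."""
    heads, tails = data
    return bias ** heads * (1 - bias) ** tails

def prob_heads_biased(post):
    """The question A: 'the coin favors heads', i.e. bias > 1/2."""
    return sum(p for h, p in post.items() if h > Fraction(1, 2))

quarter, half, three_quarters = Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)

# Hypothetical "actual" prior: most coins are fair.
true_prior = {quarter: Fraction(1, 10), half: Fraction(8, 10), three_quarters: Fraction(1, 10)}
# Hypothetical misspecified prior: the statistician assumes all biases equally likely.
wrong_prior = {quarter: Fraction(1, 3), half: Fraction(1, 3), three_quarters: Fraction(1, 3)}

data = (3, 1)  # 3 heads, 1 tail (assumed observations)

answer_true = prob_heads_biased(posterior(true_prior, coin_likelihood, data))
answer_wrong = prob_heads_biased(posterior(wrong_prior, coin_likelihood, data))

print(float(answer_true), float(answer_wrong))
```

With these made-up numbers, the update on the actual prior gives P(A|data) = 27/158 (about 0.17), while the same update on the uniform prior gives 27/46 (about 0.59). Both computations are "Bayesian" in form, but only the first approximates the answer we actually want.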
If you recall, I briefly made this argument to you at the July minicamp, but you didn’t seem to find it persuasive. I’ll note now that I’m simply not talking about decision theory. So, e.g., when you say
It follows, then, that Bayes is the superior method whenever we can obtain a good prior and when good average-case performance is sufficient.
I’m not taking a position on whether average-case performance needs to be sufficient in order for Bayesian statistics to be the best, or even a good, option (I have intuitions going in both directions, but nothing fleshed out).
I predict that you’ll probably answer my question in the later essay since my position hinges, crucially, on whether Bayesian epistemology is correct, but do you see anything that you disagree with here?
I predict that you’ll probably answer my question in the later essay since my position hinges, crucially, on whether Bayesian epistemology is correct, but do you see anything that you disagree with here?
Nope, everything you said looks good! I actually like the interpretation you gave:
However, I might frame it differently: both Bayesian statistics and frequentist statistics are useful only insofar as they approximate the true Bayesian epistemology.
I don’t actually intend to take a position on whether Bayesian epistemology is correct; I merely plan to talk about implications and relationships between different interpretations of probability and let people decide for themselves which to prefer, if any. Although if I had to take a position, it would be something like, “Bayes is more correct than frequentist but frequentist ideas can provide insight into patching some of the holes in Bayesian epistemology”. For instance, I think UDT is a very frequentist thing to do.