I’ve read up to the introduction; I’ll comment as I continue. I’ve found three problems so far:
It’s not true that for objective Bayesians (the subjectivists are those of the de Finetti school) any model and any prior is equally valid. The logical analysis of the problem and of the background information is the defining feature of the discipline, precisely because the inference step is reduced to the application of the product and negation rules.

For example, in the problem you pose, we can analyze the background information and notice that: 1. we suppose that each outcome is independent; 2. we know that the coin does indeed have a head and a tail; 3. we know nothing else about the coin. These three observations alone are sufficient to settle on a single model and a single prior.

Choosing a different model or a different prior means starting from different background information, and that amounts to answering questions about a problem that was not posed in the first place.
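A minimal worked version of that analysis, under the (assumed) reading that “we know nothing else” means the only constraint on a single flip is that heads and tails exhaust the possibilities: maximizing the entropy of the assignment forces the symmetric answer.

```latex
\max_{p \in [0,1]} \; H(p) = -p\log p - (1-p)\log(1-p),
\qquad
\frac{dH}{dp} = \log\frac{1-p}{p} = 0
\;\Longrightarrow\;
p_{\mathrm{heads}} = p_{\mathrm{tails}} = \tfrac{1}{2}.
```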
Objective Bayesianism is just the logically correct way (as per Cox’s theorem and its later amendments) to assign probabilities to logical formulae. There’s nothing in the discipline that forces anyone to find a universal model, and since one can do model comparison just as ‘easily’, any Bayesian can live happily in a many-models environment. What would be cool to have is a universal logical analysis tool, that is, something that takes a verbal description of the problem as input and outputs the most general model warranted by that description. The MaxEnt principle is right now our best attempt at such a tool.
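As a sketch of what such a tool looks like in practice, here is MaxEnt turning a stated constraint into a concrete distribution. The example (a six-sided die about which we only know that the mean is 4.5) is a hypothetical stand-in for a “verbal description”, not anything from the post.

```python
import numpy as np
from scipy.optimize import brentq

# Hypothetical example: a six-sided die, and the only background information
# is that its mean is 4.5. The MaxEnt distribution under a mean constraint is
# p_k proportional to exp(lam * k); solve for the multiplier lam that
# reproduces the stated mean.
faces = np.arange(1, 7)

def mean_under(lam):
    w = np.exp(lam * faces)
    return (w / w.sum()) @ faces

lam = brentq(lambda l: mean_under(l) - 4.5, -5.0, 5.0)  # root-find the multiplier
w = np.exp(lam * faces)
p = w / w.sum()
print(np.round(p, 4))       # probabilities skewed toward the high faces
print(round(p @ faces, 3))  # reproduces the constrained mean, 4.5
```

With no constraint beyond normalization the same machinery returns the uniform distribution, which is the coin case above.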
Universal models do already exist: they are called universal semi-measures, and the most famous of these is the Solomonoff prior. So it’s true, as you say, that there is no single universal model, but you can also show that any two such models differ only by a constant, a finite initial ‘segment’ corresponding to the different initial information encoded in the universal Turing machine used to measure the Kolmogorov complexity.
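For reference, a sketch in the standard notation, where U(p) = x* means that program p makes the universal machine U print a string beginning with x:

```latex
M_U(x) \;=\; \sum_{p \,:\, U(p) = x*} 2^{-|p|},
\qquad
M_{U_1}(x) \;\ge\; 2^{-c}\, M_{U_2}(x) \quad \text{for all } x.
```

The second relation is the dominance form of the invariance theorem: switching universal machines costs at most a constant factor, roughly the length of an interpreter for one machine written for the other.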
I will go ahead and answer your first three questions.
Objective Bayesians might have “standard operating procedures” for common problems, but I bet you that I can construct realistic problems where two Objective Bayesians will disagree on how to proceed. At the very least the Objective Bayesians need an “Objective Bayesian manifesto” spelling out what the canonical procedures are.
For the “coin-flipping” example, see my response to RichardKennaway where I ask whether you would still be content to treat the problem as coin-flipping if you had strong prior information on g(x).
MaxEnt is not invariant to parameterization, and I’m betting that there are examples where it works poorly. Far from being a “universal principle”, it ends up being yet another heuristic joining the ranks of asymptotic optimality, minimax, minimax relative to an oracle, etc. Not to say these are bad principles: each of them is very useful, but when and where to use them is still subjective.
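A quick numerical illustration of the parameterization point (a hypothetical toy, not one of the promised examples): being “maximally noncommittal” uniformly in φ = θ² is not the same as being uniform in θ itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw phi uniformly on [0, 1] (the "ignorance" distribution in the phi
# parameterization) and look at the implied distribution of theta = sqrt(phi).
phi = rng.uniform(0.0, 1.0, size=1_000_000)
theta = np.sqrt(phi)

hist, edges = np.histogram(theta, bins=10, range=(0.0, 1.0), density=True)
centers = (edges[:-1] + edges[1:]) / 2
print(np.round(hist, 2))         # ramps from ~0.1 up to ~1.9, clearly not flat
print(np.round(2 * centers, 2))  # the analytic density 2*theta at the bin centers
```

Two analysts who both “assume nothing” but parameterize the same quantity differently therefore end up with different priors.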
It would be great if you could implement a Solomonoff prior. It is hard to say whether implementing an approximate algorithmic prior that doesn’t produce garbage is easier or harder than encoding the sum total of human scientific knowledge and heuristics into a Bayesian model, but I’m willing to bet that it is. (This third bet is not a serious bet; the first two are.)
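To make the mixture idea concrete, here is a deliberately tiny, hedged sketch of the kind of thing an approximate algorithmic prior does: weight each surviving “program” by 2^(−description length) and mix their predictions. The four generators and their description lengths are invented for illustration; a real approximation would enumerate genuine programs on a universal machine.

```python
from fractions import Fraction

# Toy "program" families: each returns the first n bits of the sequence it generates.
def constant(bit):
    return lambda n: [bit] * n

def alternating(start):
    return lambda n: [(start + i) % 2 for i in range(n)]

# (generator, assumed description length in bits); the lengths are invented
# for illustration, a real prior would weight genuine programs by 2**-length.
hypotheses = [
    (constant(0), 3),
    (constant(1), 3),
    (alternating(0), 5),
    (alternating(1), 5),
]

def predictive(observed):
    """P(next bit = 1 | observed) under the toy 2**-length mixture."""
    num, den = Fraction(0), Fraction(0)
    for gen, length in hypotheses:
        if gen(len(observed)) != list(observed):
            continue                          # hypothesis contradicted by the data
        w = Fraction(1, 2 ** length)
        den += w
        if gen(len(observed) + 1)[-1] == 1:
            num += w
    return num / den if den else None

print(predictive([1]))           # constant(1) outweighs alternating(1): 4/5
print(predictive([1, 0, 1, 0]))  # only alternating(1) survives: 1
```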