Reflective Bayesianism

I’ve argued in several places that traditional Bayesian reasoning is unable to properly handle embeddedness, logical uncertainty, and related issues.

However, with the benefit of hindsight, it's possible to imagine a Bayesian pondering these things, and "escaping the trap". This is because the trap was largely made of assumptions we didn't usually explicitly recognize we were making.

Therefore, in order to aid moving beyond the traditional view, I think it’s instructive to paint a detailed picture of what a traditional Bayesian might believe. This can be seen as a partner to my post Radical Probabilism, which explained in considerable detail how to move beyond the traditional Bayesian view.

Simple Belief vs Reflective Belief

There is an important distinction between the explicit dogmas of a view (ie, what adherents explicitly endorse) vs what one would need to believe in order to agree with them. One can become familiar with this distinction by studying logic, especially Gödel’s incompleteness theorems and Tarski’s undefinability theorem. In particular:

  • Axiomatic systems such as set theory cannot proclaim themselves self-consistent (unless they are, indeed, inconsistent). However, adherents believe the system to be consistent. If they seriously doubted this, they would not use the system.

  • Such axiomatic systems cannot discuss their own semantics (their map-territory correspondence). However, adherents regularly discuss this.

This becomes especially interesting when we’re studying rationality (rather than, say, mathematics), because a theory of rationality is supposed to characterize normatively correct thinking. Yet, due to the above facts, philosophers will typically end up in a strange position where they endorse principles very different from the ones they themselves are using. For example, a philosopher arguing that set theory is the ultimate foundation of rational thought would find themselves in the strange position of utilizing “irrational” tools (tools which go beyond set theory) to come to its defense, EG in discussing the semantics of set theory or arguing for its consistency.

I’ll use the term simple belief to indicate accepting the explicit dogmas, and reflective belief to indicate accepting the meta-dogmas which justify the dogmas. (These are not terms intended to stand the test of time; I’m not sure about the best terminology, and probably won’t reliably use these particular terms outside of this blog post. Leverage Research, iiuc, uses the term “endorsed belief” for what I’m calling “reflective belief”, and “belief” for what I’m calling “simple belief”.)

Reflective belief and simple belief need not go together. There’s a joke from Eliezer (h/​t to Ben Pace for pointing me to the source):

The rules say we must use consequentialism, but good people are deontologists, and virtue ethics is what actually works.

—Eliezer Yudkowsky, Twitter

If this were true, then simple belief in consequentialism would imply reflective belief in virtue ethics (because you evaluate moral frameworks on their effects, not whether they’re morally correct!). Similarly, simple belief in virtue ethics would imply reflective belief in deontology, and simple belief in deontology would imply reflective belief in consequentialism.

So, not only does simple belief in X not imply reflective belief in X; furthermore, reflective belief in X need not imply simple belief in X! (This is, indeed, belief-in-belief.)

Hence, by “reflective belief” I do not necessarily mean “reflectively consistent belief”. Reflective consistency occurs only when simple belief and reflective belief are one and the same: the reasoning system you use is also the one you endorse.

Reflective Bayesianism

Applying the simple/reflective distinction to our case-in-point, I'll define two different types of Bayesian:

Simple Bayesian: simple belief in Bayesianism. This describes an agent who reasons according to the laws of probability theory, and updates beliefs via Bayes’ Law. This is the type of reasoner which Bayesians study.

Reflective Bayesian: reflective belief in Bayesianism. For simplicity, I’ll assume this also involves simple belief in Bayesianism. Realistically, Bayesian philosophers can’t reason in a perfectly Bayesian way; so, this is a simplified, idealized Bayesian philosopher.

(A problem with the terminology in this section is that by “Bayesian” I specifically mean “dogmatic probabilism” in the terminology from my Radical Probabilism post. I don’t want to construe “Bayesianism” to necessarily include Bayesian updates. The central belief of Bayesianism is subjective probability theory. However, repeating “simple dogmatic probabilism vs reflective dogmatic probabilism” over and over again in this essay was not very appealing.)

Actually, there’s not one canonical “reflective belief in Bayes”—one can justify Bayesianism in many ways, so there can correspondingly be many reflective-Bayesian positions. I’m going to discuss a number of these positions.

My prior is best.

The easiest way to reflectively endorse Bayesianism is to simply believe my prior is best. No other distribution can have more information about the world, unless it actually observes something about the world.

I think of this as multiverse frequentism. You think your prior literally gives the frequency of different possible universes. Now, I’m not accusing anyone of really believing this, but I have heard a (particularly reflective and self-critical) Bayesian articulate the idea. And I think a lot of people have an assumption like this in mind when they think about priors like the Solomonoff prior, which are designed to be particularly “objective”. This is essentially a map/​territory error.

Some group of readers may think: “Now wait a minute. Shouldn’t a Bayesian necessarily believe this? The expected Bayes loss of any other prior is going to be worse, when an agent considers them! Similarly, no other prior is going to look like a better tool for making decisions. So, yeah… I expect my prior to be best!”

On the other hand, those who endorse some level of modest epistemology might be giving that first group of readers some serious side-eye. Surely it’s crazy to think your own beliefs are optimal only because they’re yours?

To the first group, I would agree that if you've fully articulated a probability distribution, then you shouldn't be in a position where you think a different one is better than yours: in that case, you should update to the other one! (Or possibly to some third, even better distribution.) The multiverse-frequentist fallaciously extends this result to apply to all priors.

But this doesn’t mean you should think your distribution is best in general. For example, you can believe that someone else knows more than you, without knowing exactly what they believe.

In particular, it’s easy to believe that some computation knows more than you. If a task somehow involves factoring numbers, you might not know the relevant prime factorizations. However, you can justifiably trust a probability distribution whose description includes running an accurate prime factorization algorithm. You can prefer to replace your own beliefs with such a probability distribution. This lays the groundwork for justified non-Bayesian updates.
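To make this concrete, here's a minimal sketch (my own illustration; the prime-prediction task, the trial-division test, and the 0.99 confidence level are all invented for the example). Your own prior, unable to factor, assigns 50% to "n is prime"; a distribution whose description is "run a primality test" scores far better:

```python
import math
import random

def is_prime(n):
    """Trial division: the extra computation our own beliefs lack."""
    if n < 2:
        return False
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return False
    return True

def my_prior(n):
    """Without the ability to factor, assign 50% to 'n is prime'."""
    return 0.5

def computed_distribution(n):
    """A distribution whose description includes running a primality test.
    (Slightly smoothed, an arbitrary choice, so log loss stays finite.)"""
    return 0.99 if is_prime(n) else 0.01

def avg_log_loss(predict, numbers):
    """Average negative log probability assigned to the true answers."""
    total = 0.0
    for n in numbers:
        p = predict(n)                   # probability assigned to 'prime'
        q = p if is_prime(n) else 1 - p  # probability assigned to the truth
        total -= math.log(q)
    return total / len(numbers)

random.seed(0)
numbers = [random.randrange(10**6, 10**7) for _ in range(1000)]
print("my prior's log loss:    ", avg_log_loss(my_prior, numbers))
print("computed distribution's:", avg_log_loss(computed_distribution, numbers))
# You can prefer the computed distribution just from its description --
# you needn't know its outputs in advance to justifiably trust it.
```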

I can’t gain information without observing things.

Maybe our reflective Bayesian doesn’t literally think theirs is the best prior possible. However, they might be a staunch empiricist: they believe knowledge is entanglement with reality, and you can only get entanglement with reality by looking.

Unlike the multiverse-frequentist described in the previous section, the empiricist can think other people have better probability distributions. What the empiricist doesn’t believe is that we can emulate any of their expertise merely by thinking about it. Thinking is useless (except, of course, for the computational requirements of Bayes’ Law itself). Therefore, although helpful non-Bayesian updates might technically be possible (eg, if you could morph into a more knowledgeable friend of yours), it’s not possible to come up with any which you can implement just by thinking.

I can’t think of any way to “justify” this assumption except if you really do have unbounded computational resources.

The best prior is already one of my hypotheses.

This is, of course, just the usual assumption of realizability: we assume that the world is in our hypothesis-space. We just have to find which of our hypotheses is the true one.

This doesn’t imply as strong a rejection of non-Bayesian updates. It could be that we can gain some useful information by computation alone. However, the need for this must be limited, because we already have enough computational power to simulate the whole world.

What the assumption does gain you is a guarantee that you will make decisions well, eventually. If the correct hypothesis is in your hypothesis space, then once you learn it with sufficiently high confidence (which can usually happen pretty fast), you’ll be making optimal decisions. This is a much stronger guarantee than the simple Bayesian has. So, the assumption does buy our Reflective Bayesian a lot of power in terms of justifying Bayesian reasoning.

My steel-man of this perspective is the belief that the universe is intelligible. I’m not sure what to call this belief. Here are a few versions of it:

  • Computationalism. Everything is computable. It’s absurd to imagine that anything physically realized would not be computable. So, it’s sufficient to assume that the universe might be any computer program.

  • Set-theory-ism. Anything real must have a description in ZFC. Physics might include some uncomputable aspects, but they’d be things like halting oracles, which can be captured within ZFC.

  • Mathematicalism. Anything real must be mathematically describable. Not necessarily in any one axiom system such as ZFC—we know (from Tarski’s undefinability theorem) that there are mathematically describable things which fall outside any fixed axiom set. However, the universe must be mathematically describable in some sense.

I used to believe the third theory here. After all, what could it possibly mean to suppose the universe is not mathematically describable? Failing to assume this just seems like giving up.

But there is no necessary law saying the universe must be mathematical, any more than there’s a necessary law saying the universe has to be computational. It does seem like we have strong evidence that the universe is mathematical in nature; mathematics has been surprisingly helpful for describing the universe we observe. However, philosophically, it makes more sense for this to be a contingent fact, not a necessary one.

There is a best hypothesis, out of those I can articulate.

The weak realizability assumption doesn’t say that the universe is one of my hypotheses; instead, it postulates that out of my hypotheses, one of them is best. This is much more plausible, and gets us most of the theoretical implications.

For example, if you use the Solomonoff prior, strong realizability says that the universe is computable. Weak realizability just says that there’s one computer program that’s best for predicting the universe.

It makes a lot more sense to think of Solomonoff induction as searching for the best computational way to predict. The universe isn’t necessarily computable, but computers are. If we’re building AGI on computers, they can only use computable methods of predicting the world around them.

However, the assumption that one of your hypotheses is best is more questionable than you might realize. It's easy to set up circumstances in which no hypothesis is best. My favorite example is a Bayesian who is observing coin-flips, and who has two hypotheses: that the coin is biased with 1/3rd probability of heads, and, symmetrically, that it's biased with 1/3rd probability of tails. In truth, the coin is fair. We can show that the Bayesian will alternate between the two hypotheses forever: sometimes favoring one, sometimes the other. (The log odds between the two hypotheses perform an unbiased random walk, which recrosses zero infinitely often.)
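To make the non-convergence concrete, here's a quick simulation (a sketch of my own; the seed and flip count are arbitrary):

```python
import math
import random

# Two hypotheses: H1 says P(heads)=1/3, H2 says P(heads)=2/3.
# The true coin is fair -- outside the hypothesis space.
random.seed(0)
log_odds = 0.0   # log of P(H1 | data) / P(H2 | data); prior is 50/50
favorite = None
swaps = 0

for flip in range(100_000):
    heads = random.random() < 0.5
    # Bayes: heads favors H2 (likelihood ratio 1/2), tails favors H1 (ratio 2).
    log_odds += math.log(1/2) if heads else math.log(2)
    leader = "H1" if log_odds > 0 else "H2"
    if favorite is not None and leader != favorite:
        swaps += 1
    favorite = leader

print(f"final log odds: {log_odds:+.1f}, lead changes so far: {swaps}")
# The log odds take steps of +/- log 2 with equal probability: an unbiased
# random walk, which keeps recrossing zero, so the favorite keeps flipping.
```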

The simple Bayesian believes that such non-convergence is possible. The reflective Bayesian thinks it is not possible—one of the hypotheses has to be best, so beliefs cannot go back and forth forever.

The simple Bayesian therefore can reflectively prefer non-Bayesian updates—for example, in the case of the fair coin, you'd be better off converging to an even posterior over the two hypotheses, rather than continuing to update via Bayes. (Or, even better, making an update which adds "fair coin" to the hypothesis set.)

I am calibrated, or can easily become calibrated.

Calibration is the property that, of the cases where you estimate, say, an 80% probability, the long-run frequency of those things happening is actually 80%. (Formally: for any ε greater than zero and any probability p, considering the sequence of all cases where you assign probability within ε of p, the actual limiting frequency of those things turning out to be true is within ε of p.)

Calibration is a lesser substitute for saying that a probabilistic hypothesis is “true”, much like “best hypothesis out of the space” is. Or, flipping it around: being uncalibrated is a particularly egregious way for a hypothesis to be false.

To illustrate: if your sequence is 010101010101010101..., a fair coin is a calibrated model, even though there's a much better model. On the other hand, a biased coin is not a calibrated model. If we think the probability of "1" is 1/3, we will keep reporting a probability of 1/3, but the limiting frequency of the events will actually be 50%.
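Here's that illustration as a runnable check (my own sketch; since both models report a constant probability, the ε-bucketing from the formal definition collapses to a single bucket each):

```python
sequence = [i % 2 for i in range(10_000)]   # 0, 1, 0, 1, ...: the "every other" world

def calibration_report(name, reported_p, bits):
    """Compare a constant reported probability of '1' against the empirical
    frequency of 1s among the cases where it was reported (here: all of them)."""
    freq = sum(bits) / len(bits)
    print(f"{name}: reports P(1)={reported_p:.3f}, actual frequency {freq:.3f}")

calibration_report("fair-coin model  ", 1/2, sequence)  # 0.500 vs 0.500: calibrated
calibration_report("biased-coin model", 1/3, sequence)  # 0.333 vs 0.500: miscalibrated
# The fair-coin model is calibrated despite being far from the best model;
# the biased model keeps saying 1/3 while the limiting frequency is 1/2.
```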

So, clearly, believing your probabilities to be calibrated is a way to reflectively endorse them, although not an extremely strong way.

I don’t know that calibration implies any really strong defense of classical Bayesianism. However, it does provide somewhat stronger decision-theoretic guarantees. Namely, a calibrated estimate of the risks means that your strategy can’t be outperformed by really simple adjustments. For example, if you’re using a fair-coin model to make bets on the 010101010101… sequence, you will balance risks and rewards correctly (we can’t make you do better by simply making you more/​less risk-averse). The same cannot be said if you’re using a biased-coin model.
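Here's a toy version of that point (my sketch; the ticket prices are arbitrary). Each round you may pay c for a ticket worth 1 if the next bit is 1, and a model accepts whenever its own expected profit is positive:

```python
sequence = [i % 2 for i in range(10_000)]   # the 010101... world again

def run_bets(p_model, prices, bits):
    """Accept any ticket (pays 1 if bit == 1, costs c) whose model-expected
    profit is positive; compare the model's accounting to realized profit."""
    for c in prices:
        model_ev = p_model - c
        if model_ev <= 0:
            print(f"  price {c:.2f}: declined (model EV {model_ev:+.3f})")
            continue
        realized = sum(bits) / len(bits) - c   # true frequency of 1s is 0.5
        print(f"  price {c:.2f}: model EV {model_ev:+.3f}, realized {realized:+.3f}")

prices = [0.20, 0.40, 0.45]
print("fair-coin model:")
run_bets(1/2, prices, sequence)
print("biased-coin model (P(1)=1/3):")
run_bets(1/3, prices, sequence)
# The calibrated model's own accounting matches what actually happens, so its
# accept/decline choices weigh risk and reward correctly; the 1/3 model
# misprices every ticket and declines the profitable bets at 0.40 and 0.45.
```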

I recently (in private correspondence) dealt with an example where calibration provided a stronger justification for Bayesian approaches over frequentist ones, but I feel the details would be a distraction here. In general I expect a calibration assumption helps justify Bayes in a lot of contexts.

A naive argument in favor of calibration might be “if I thought I weren’t calibrated, I would adjust my beliefs to become more calibrated. Therefore, I must be calibrated.” This makes two mistakes:

  1. It’s perfectly possible to believe “I’m not calibrated” without thinking your miscalibration is in a particular direction.

  2. Even if that weren’t the case, the argument that you’d want to correct your probabilities is by no means decisive. Correct away! This might be a legitimate non-Bayesian update.

My steelman of the calibration assumption is this: in general, it doesn't seem too hard to watch your calibration graph and adjust your reported probabilities in response. If you're an alien intelligence watching the 01010101… sequence, it might be hard to invent the "every other" hypothesis from scratch. However, it's easy to see that your P(1)=1/3 model is too low and should be adjusted upwards.

(OK, it's not hard at all to invent the "every other" pattern. But the general point stands: in more complicated cases, it's difficult to come up with a really new hypothesis, yet relatively easy to improve the calibration of the hypotheses you have.)
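Here's a sketch of that kind of adjustment (a toy recalibration scheme of my own invention, not a standard algorithm): tally how often things came true in each probability bucket, and report the bucket's past frequency instead of the raw model output.

```python
from collections import defaultdict

sequence = [i % 2 for i in range(10_000)]   # the alien watches 0, 1, 0, 1, ...
RAW_P = 1/3                                 # the too-low P(1)=1/3 model

stats = defaultdict(lambda: [0, 0])         # bucket -> [count of 1s, count seen]
reported = RAW_P

for bit in sequence:
    bucket = round(RAW_P, 2)                # everything lands in one bucket here
    ones, seen = stats[bucket]
    # Once the bucket has some history, report its empirical frequency instead
    # of the raw probability: "watch the calibration graph and adjust".
    reported = ones / seen if seen >= 100 else RAW_P
    stats[bucket] = [ones + bit, seen + 1]

print(f"recalibrated P(1): {reported:.3f}")  # climbs from 0.333 to ~0.5
# No new hypothesis was invented -- we only corrected the reported numbers --
# yet the reports become calibrated (while still far from the best model).
```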

Note that the formal definition of "calibration" I gave at the beginning of this subsection doesn't really distinguish between "already calibrated" vs "will become calibrated at some point"; it's all asymptotic. So if we can calibrate by looking at a calibration chart and compensating for our over-/under-confidence, then we "are already calibrated" from a technical standpoint. (Nonetheless, I think the intuitive distinction is meaningful and useful.)

A counterargument to my steelman: it’s actually computationally quite difficult to be calibrated. Sure, it doesn’t seem so hard for humans to improve their calibration in practice, but the computational difficulty should give you pause. It might not make sense to suppose that humans are even approximately calibrated in general.


I think Bayesian philosophy before Radical Probabilism overestimated its self-consistency, underestimating the difference between simple Bayesianism and reflective Bayesianism (effectively making a map-territory error). It did so by implicitly making the mistakes above, as well as others. Sophisticated authors added technical assumptions such as calibration and realizability. These assumptions were then progressively forgotten through iterated summarization/popularization—EG,

  1. Original author includes technical assumptions as the linchpin of their arguments.

  2. Popular summary is careful to mention that the result is proved “under some technical assumptions”, but doesn’t include them.

  3. The readers take away that the result is true, forgetting that technical assumptions were mentioned.

I think this happens all the time, with even the original authors possibly forgetting their own technical assumptions when they’re not thinking hard about it.

Note that I'm not accusing anyone of literally believing the Reflective Bayesian positions I've outlined. (Actually, in particular, I want to avoid accusing you… some of my other readers, perhaps...) What I'm actually saying is that these were beliefs operating in the background, heuristically influencing how people thought about things.

For example:

A naive argument for the reflective consistency of UDT: “UDT just takes the actions which are optimal according to its prior. It can evaluate the expected utility of alternate policies by forward-sampling interactions from its prior. The actions which it indeed selects are going to be optimal, by definition. So, other policies look at best equally good. Therefore, it should never want to self-modify to become anything else.”

I think most of the people who thought about UDT probably believed something like this at some point.

There are several important mistakes in this line of reasoning.

  • It plays fast and loose with the distinction between what UDT does and what UDT expects itself to do. If UDT isn’t logically omniscient, it may not know exactly what its own future actions are. Thus, it might prefer to make precommitments even if it worsens its strategy in doing so, because it might prefer to know with certainty that it will make a halfway decent selection, rather than stay in the dark about what it will do.

  • It confuses the "outer" expected value (the average value of sampling) with the "inner" expected value (the actual subjective value of an action, according to the agent). This is usually not a distinction we need to make, but in the face of logical uncertainty, subjective expectations can certainly be different from the expectations which can be calculated by averaging according to the prior.

  • It implicitly assumes that alternate policies have no additional information about the environment, IE, their actions cannot be correlated with what happens. This reflects the “I can’t get information without observing things” assumption. But in fact, it’s possible to get better information by thinking longer, so UDT can prefer to self-modify into something which thinks longer and ends up with a totally different policy.

My overall point, here, is just that we should be careful about these things. Simple belief and reflective belief are not identical. A Bayesian reasoner does not necessarily prefer to keep being a Bayesian reasoner. And a Bayesian reasoner can prefer a non-Bayesian update to become a different Bayesian reasoner.

  1. It’s very possible to prefer a different probability distribution to your own. In particular, you’d usually like to update to use priors which are better-informed. This can be characterized as “thinking longer” in the cases where the better-informed prior is expressed as a computation which you can run.

  2. It’s similarly possible to prefer a different utility function to your own. There is no law of Bayesian reasoning which says that your utility function is best. The Gandhi murder pill thought experiment does illustrate an important fact, that agents will tend to protect themselves from arbitrary value-shifts. However, viewing some value shifts positively is totally allowed.

The goal of a Radical Probabilist should be to understand these non-Bayesian updates, trimming the notion of “rationality” to include only that which is essential.



Footnotes

Truth and Paradox by Tim Maudlin is an extreme example of this; by the end of the book, Maudlin admits that what he is writing cannot be considered true on his own account. He proceeds to develop a theory of permissible assertions, which may not be true, but are normatively assertible. To top it off, he shows that no theory of permissibility can be satisfactory! He even refers to this as "defeat". Yet, he sees no better alternative, and so continues to justify his work as (mostly) permissible, though untrue.


Note that although the simple/​reflective distinction is inspired by rigorous formal ideas in logic, I’m not in fact taking a super formal approach here. Note the absence of a formal definition of “reflective belief”. I think there are several different formal definitions one could give. I mean any of those. I consider my definition to include any reason why someone might argue for a position, perhaps even dishonestly (although dishonesty isn’t relevant to the current discussion, and should probably be viewed as a borderline case).


Aside: it’s difficult to reliably maintain this distinction! When asserting things, are you asserting them simply or reflectively? Suppose I read Tim Maudlin’s book (see footnote #1). What is “Tim Maudlin’s position”? I can see good reasons to take it as (a) the explicit assertions, (b) the belief system which would endorse those explicit assertions, or (c) the belief system which the explicit assertions would themselves endorse.

In many circumstances, you’d say that what an author reflectively believes is their explicit assertions, and what they simply believe is the implicit belief system which leads them to make those assertions. Note what this implies: if you claim X, then your simple belief is the reflective belief in X, and your reflective position is simple belief in X! Headache-inducing, right?

But this often gets more confusing, not less, if (as in Tim Maudlin’s case) the author starts explicitly dealing with these level distinctions. What should you think if I tell you I simply believe in X? I think it depends on how much you trust my introspective ability. If you don’t trust it, then you’ll conclude that I have belief-in-belief; I endorse simple belief in X (which is probably the same as endorsing X, ie, reflectively believing X). On the other hand, if you do trust my introspective ability, then you might take it to mean “I believe X, but I don’t know why /​ I don’t know whether I endorse my reasons for that belief”. This is like the Leverage Research concept of “belief report”. This means you can take my assertion at face value: I’ve given you one of my simple beliefs.

But what if someone makes a habit of giving you their simple beliefs, rather than their reflective beliefs? This might be an honesty thing, or possibly an unreflective habit. Philosophers, academics, and smart people generally might be stuck in a rut of only giving reflective positions, because they’re expecting to have to defend their assertions (and they like making defensible assertions). This calls into question whether/​when we should assume that someone is giving us their reflective beliefs rather than their simple beliefs.

And what if I tell you I reflectively believe in X? Do you take that at face value? Or do you think I reflectively reflectively believe X (so my simple belief is Z, a position which reflectively endorses the position Y—where Y is a position which reflectively endorses X)?

… You can see where things get difficult.


By “Bayesianism before radical probabilism” I don’t mean a temporal/​historic thing, EG, Bayesianism before the 1950s (when Jeffrey first began inventing Radical Probabilism). Rather, I mean “the version of Bayesianism which strongly weds itself to Bayesian updates.” Most centrally, I’m referring to LessWrong before Logical Induction.


Simply put, early LessWrong reflectively believed in (classical) Bayesianism, and thus simply believed the justifying assumptions associated with Bayesianism. But few, if any, reflectively believed those assumptions—indeed, those assumptions have little justification when examined, and life gets more interesting when assuming their negation.

The only general advice I can think of to avoid this mistake is “don’t lose track of your assumptions”.