Timeless Modesty?

Looking back at my description of policy-level thinking, one might charge it with the same crime as the outside view; namely, being overly likely to lead to modest epistemology:

Policy-level thinking, on the other hand, helps you to not get lost in the details. It provides the rudder which can keep you moving in the right direction. It’s better at cooperating with others, maintaining sanity before you figure out how it all adds up to normality, and optimizing your daily life.

I want to clarify that “cooperating with others” and “maintaining sanity” do not mean modesty here.

Both Eliezer’s initial argument against the modesty argument and his recent detailed explication (and reductio ad absurdum) of that argument call timeless decision theory to mind. In his early post, Eliezer says:

The central argument for Modesty proposes something like a Rawlsian veil of ignorance—how can you know which of you is the honest truthseeker, and which the stubborn self-deceiver?

Does reasoning from behind a veil of ignorance really support the modesty argument? Would a TDT agent use modest epistemology? I think not.

A TDT agent is supposed to think of itself as having logical control over anything implementing a relevantly similar decision procedure, for any particular decision being considered. We can think of such an agent reasoning about whether to re-write the value of a particular belief. From a position of ignorance, wouldn’t it be better in expectation to average your belief with that of other people? Quoting the old post again:

Should George—the customer—have started doubting his arithmetic, because five levels of Verizon customer support, some of whom cited multiple years of experience, told him he was wrong? Should he have adjusted his probability estimate in their direction? [...] Jensen’s inequality proves even more straightforwardly that, if George and the five levels of tech support had averaged together their probability estimates, they would have improved their average log score.
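
To make the quoted Jensen’s-inequality point concrete, here is a minimal sketch with made-up numbers. The specific probabilities are hypothetical, but the comparison comes out the same way for any set of estimates, since log is concave:

```python
import math

# Hypothetical numbers: the disputed claim is in fact true, George assigns it
# a high probability, and the five support reps assign it low probabilities.
estimates = [0.99, 0.10, 0.15, 0.05, 0.20, 0.10]

def log_score(p):
    """Log score for assigning probability p to a claim that turns out true."""
    return math.log(p)

# Average of the individual log scores...
avg_of_scores = sum(log_score(p) for p in estimates) / len(estimates)

# ...versus the log score everyone gets after averaging their estimates.
score_of_avg = log_score(sum(estimates) / len(estimates))

print(avg_of_scores)  # about -1.85
print(score_of_avg)   # about -1.33: averaging improves the group's average score

# Jensen's inequality guarantees score_of_avg >= avg_of_scores; the same holds
# if the claim turns out false, since log(1 - p) is also concave in p.
```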

It’s true: committing to average beliefs whenever there’s a disagreement makes you better off in expectation, if you’re in a situation where you could just as well end up on either side of the error. But there are two things stopping a TDT agent from reasoning in this way.

  1. It often won’t make sense to say you could just as easily be the one in the wrong.

  2. Even when it does make sense, averaging beliefs merely beats doing nothing; there are much better epistemological strategies available.

A TDT agent doesn’t think “What if I’m not a TDT agent?”—it only sees itself as having logical control over sufficiently similar decision procedures. Eliezer said:

Those who dream do not know they dream, but when you are awake, you know you are awake.

The dreamer may lack the awareness needed to know whether it’s a dream. The one who is awake can see the difference. The fact of seeing the difference is, in itself, enough to break the symmetry between the waking one and the dreamer. The TDT machinery isn’t facing a relevantly similar decision in the two cases; it has no logical control over the dreamer.

So, the TDT argument for averaging could only make sense if the other person would consider averaging beliefs with you for the same reason. Even though it is an argument for personal gain (in a TDT sense), the potential gain just isn’t there if the other person isn’t executing the same decision procedure. If Alice and Bob disagree, and Alice moves her belief halfway toward Bob’s out of modesty, but Bob doesn’t move his, TDT can’t be what justifies Alice’s move; Bob isn’t running a sufficiently similar algorithm. If there’s an argument for modest epistemology in that case, it has to be different from the TDT argument. Alice just sees Bob as someone else with different beliefs, perhaps rational or perhaps not. But then, it seems like Alice should just be doing a regular Bayesian update on Bob’s beliefs, with no extra averaging step.
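
To make the contrast concrete, here is a rough sketch of the two moves. All of the numbers, including Alice’s model of how Bob’s reports correlate with the truth, are made up for illustration:

```python
# Alice's prior probability for the disputed claim, and Bob's stated probability.
p_alice = 0.9
p_bob = 0.3

# Alice's (hypothetical) model of Bob as an evidence source: how likely he is
# to report a probability this low if the claim is true vs. if it is false.
p_report_given_true = 0.2
p_report_given_false = 0.6

# Regular Bayesian update: multiply prior odds by the likelihood ratio of Bob's report.
prior_odds = p_alice / (1 - p_alice)
likelihood_ratio = p_report_given_true / p_report_given_false
posterior_odds = prior_odds * likelihood_ratio
p_bayesian = posterior_odds / (1 + posterior_odds)

# The modest alternative: split the difference, regardless of who Bob is.
p_modest = (p_alice + p_bob) / 2

print(p_bayesian)  # 0.75: how far Alice moves depends on her model of Bob
print(p_modest)    # 0.6: the same halfway move no matter what she knows about him
```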

As for the second point, even if you are both TDT agents, it still makes more sense to deal with each other in a more Bayesian way.

Imagine you could put a microchip in everyone’s brain which could influence what they believed. You don’t want to install specific facts—that’s crossing a line. What kind of program would you install, to improve everyone’s beliefs as much as possible? You’re essentially in the place of the TDT agent reasoning from behind the veil of ignorance. The chip really will end up in everyone. However, you can’t just remove people’s biases—you’re stuck with the wetware as-is. All you can do is add some programmed responses to tip things in the right direction. What’s the best policy?

You could install modest epistemology. Objection #1 is gone; you know everyone is following the same policy. Do you nudge people to average their beliefs with each other whenever they disagree?

Ah, but remember that people’s biases are intact. Wouldn’t you be better off estimating some kind of epistemic trust in the other person, first, before averaging? People with very poor native calibration should just copy the beliefs of those with good calibration, and those with good calibration should mostly not average with others. This will increase the expected score much more.
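
Here is a small simulation sketch of that comparison. Everything in it is hypothetical: two chip-bearers with different, known amounts of noise in their probability estimates, and a trust-weighted policy that weights each estimate by the inverse of its noise variance:

```python
import math
import random

random.seed(0)

def clamp(p, lo=0.01, hi=0.99):
    """Keep probabilities away from 0 and 1 so log scores stay finite."""
    return max(lo, min(hi, p))

def log_score(p, happened):
    """Log score of probability p for a binary event."""
    return math.log(p if happened else 1 - p)

# Hypothetical agents: one well calibrated, one badly calibrated.
noise = {"good": 0.05, "bad": 0.35}
weights = {name: 1 / s ** 2 for name, s in noise.items()}  # trust ~ 1 / variance

def run_trial():
    q = random.random()                 # the event's true chance
    happened = random.random() < q
    est = {name: clamp(random.gauss(q, s)) for name, s in noise.items()}

    plain_avg = sum(est.values()) / len(est)
    weighted_avg = sum(weights[n] * est[n] for n in est) / sum(weights.values())

    return {
        "keep own (good)": log_score(est["good"], happened),
        "keep own (bad)": log_score(est["bad"], happened),
        "plain averaging": log_score(plain_avg, happened),
        "trust-weighted": log_score(weighted_avg, happened),
    }

trials = 100_000
totals = {}
for _ in range(trials):
    for policy, score in run_trial().items():
        totals[policy] = totals.get(policy, 0.0) + score

for policy, total in totals.items():
    print(f"{policy}: {total / trials:.3f}")

# In this setup, plain averaging beats the badly calibrated agent's unaided
# estimates, but the trust-weighted policy scores higher still, and the well
# calibrated agent barely moves under it.
```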

But in order to do that, you need to set up good calibration estimation. Before you know it, you’re setting up a whole system of Bayesian mechanics. Given the limitations of the situation, it’s not clear whether the optimal policy would result in agreement at all (be it Aumann-style Bayesian agreement or modesty-style belief averaging).

So, the modest epistemologist isn’t being creative enough in their policy. They have identified a legitimate improvement over the do-nothing strategy, but not a very good policy on the whole.