Reflexive decision theory is an unsolved problem

By “reflexive decision theory”, hereafter RDT, I mean any decision theory that can incorporate information about one’s own future decisions into the process of making those decisions. RDT is not itself a decision theory, but a class of decision theories, or a property of decision theories. Some say it is an empty class. FDT (to the extent that it has been worked out — this is not something I have kept up with) is an RDT.

The use of information of this sort is what distinguishes Newcomb’s problem, Parfit’s Hitchhiker, the Smoking Lesion, and many other problems from “ordinary” decision problems which treat the decision-maker as existing outside the world that he is making decisions about.

There is currently no generally accepted RDT. Unfortunately, that does not stop people from insisting that every decision theory but their favoured one is wrong or crazy. There is even a significant literature (which I have never seen cited on LW, but will cite here) saying that reflexive decision theory itself is an impossibility.

Reflexive issues in logic

We have known that reflexiveness is a problem for logic ever since Russell said to Frege, “What about the set of sets that aren’t members of themselves?” (There is also the liar paradox, going back to the ancient Greeks, but it remained a mere curiosity until people started formalising logic in the 19th century. Calculemus, nam veritas in calculo est: let us calculate, for truth lies in calculation.)

In set theory, this is a solved problem, solved by the limited comprehension axiom. Since Gödel, we also have ways of making theories talk about themselves, and there are all manner of theorems about the limits of how well they can introspect: Gödel’s incompleteness theorems, Löb’s theorem, etc.

Compared with that, reflexive decision theory has hardly even started.

Those who know not, and know not that they know not

Many think they have solutions, but they disagree with each other, and keep on disagreeing. So we have the situation where CDT-ers say “but the boxes already contain what they contain!”, everyone with an RDT replies “then you’ll predictably lose!”, and both point with scorn at EDT and say “you think you can change reality by managing the news!” The words “flagrantly, confidently, egregiously wrong” get bandied about, at least by one person. Everyone thinks everyone else is crazy. There is also a curious process by which an XDT-er, for any value of X, responds to counterexamples to X by modifying XDT and claiming it is still XDT, to the point where people end up saying that CDT and EDT are the same. Now that’s crazy.

Those who know not, and know that they know not

Some people know that they do not have a solution. Andy Egan, in “Some Counterexamples to Causal Decision Theory” (2007, Philosophical Review), shoots down both CDT and EDT, but only calls for a better theory, without suggesting how to find one.

Those who reject the very idea of an RDT

Some deny the possibility of any such theory, such as Marion Ledwig (“The No Probabilities for Acts-Principle”), who formulates the principle thus: “Any adequate quantitative decision model must not explicitly or implicitly contain any subjective probabilities for acts.” This rejects the very idea of reflexive decision theory. It also implies that one-boxing is wrong for Newcomb’s problem, and Ledwig explicitly says that it is.

For the original statement of the principle, Ledwig cites Spohn (1999, “Strategic Rationality”, and 1983, “Eine Theorie der Kausalität”). My German is not good enough to analyse what he says in the latter reference, but in the former he says that the rational strategy in the one-shot Prisoners’ Dilemma is to defect, and in one-shot Newcomb is to two-box. Ledwig and Spohn trace the idea back to Savage’s 1954 “The Foundations of Statistics”. Savage’s whole framework, however, in common with the other “classical” theories such as Jeffrey-Bolker and VNM, has no room in it for any sort of reflexiveness, ruling it out implicitly rather than considering the idea and explicitly rejecting it. There is more in Spohn 1977, “Where Luce and Krantz Do Really Generalize Savage’s Decision Model”, where Spohn says:

“[P]robabilities for acts play no role in decision making. For, what only matters in a decision situation is how much the decision maker likes the various acts available to him, and relevant to this, in turn, is what he believes to result from the various acts and how much he likes these results. At no place does there enter any subjective probability for an act.”

There is also Itzhak Gilboa’s “Can free choice be known?”. He says, “[W]e are generally happier with a model in which one cannot be said to have beliefs (let alone knowledge) of one’s own choice while making this choice”, and looks for a way to resolve reflexive paradoxes by ruling out reflexiveness.

These people all defect in PD and two-box in Newcomb. The project of RDT is to do better.

(ETA: Thanks to Sylvester Kollin (see comments) for drawing my attention to a later paper by Spohn in which he has converted to one-boxing within a causal decision theory.)

Reflexiveness in the real world

People often make decisions that take into account the states of mind of the people they are interacting with, including other people’s assessments of one’s own state of mind. This is an essential part of many games (such as Diplomacy) and of many real-world interactions (such as diplomacy). A theory satisfying “no foreknowledge of oneself” (my formulation of “no probabilities for acts”) cannot handle these. (Of course one can have foreknowledge of oneself; the principle only excludes this information from input into one’s decisions.)

The principle “know thyself” is as old as the Liar paradox.

Just as there have been contests of bots playing the iterated prisoners’ dilemma, so there have been contests where these bots are granted access to each other’s source code. Surely we need a decision theory that can deal with the reasoning processes used by such bots.
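To make the setting concrete, here is a toy sketch of such a tournament, under the simplifying assumption that each bot is a function receiving the opponent’s source as a string. The bot names (`mirror_bot`, `defect_bot`) and the string-comparison strategy are my own illustration, not taken from any actual contest; real entries reason about the opponent’s code rather than merely comparing it.

```python
# Toy "source-sharing" prisoners' dilemma: each bot sees the opponent's
# source code and returns "C" (cooperate) or "D" (defect).
# Hypothetical strategies for illustration only.

def mirror_bot(my_source: str, opponent_source: str) -> str:
    # Cooperate exactly when the opponent is a verbatim copy of me,
    # so two copies of mirror_bot cooperate with each other.
    return "C" if opponent_source == my_source else "D"

def defect_bot(my_source: str, opponent_source: str) -> str:
    # Ignores the opponent's code entirely.
    return "D"

# Standard PD payoffs: (my move, their move) -> my payoff.
PAYOFFS = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def play(bot_a, src_a, bot_b, src_b):
    move_a = bot_a(src_a, src_b)
    move_b = bot_b(src_b, src_a)
    return PAYOFFS[(move_a, move_b)], PAYOFFS[(move_b, move_a)]

if __name__ == "__main__":
    m = "mirror_bot-v1"  # stand-in label for mirror_bot's source text
    print(play(mirror_bot, m, mirror_bot, m))            # mutual cooperation
    print(play(mirror_bot, m, defect_bot, "defect-v1"))  # mutual defection
```

Even this crude string-matching bot achieves mutual cooperation with its own copies while remaining unexploitable by defectors; the open problem is a theory that justifies such behaviour in general, not just for exact clones.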

The elephant looming behind the efforts of Eliezer and his colleagues to formulate FDT is AI. It might be interesting at some point to have a tournament of bots whose aim is to get other bots to “let them out of the box”.

These practical problems need solutions. The no-foreknowledge principle rejects any attempt to think about them; thinking about them therefore requires rejecting the principle. That is the aim of RDT. I do not have an RDT, but I do think that duelling intuition pumps is a technique whose usefulness for this problem has been exhausted. It is no longer enough to construct counterexamples to everyone else, for they can as easily do the same to your own theory. Some general principle is needed that will be as decisive for this problem as limited comprehension is for building a consistent set theory.

A parable

Far away and long ago, each morning the monks at a certain monastery would go out on their alms rounds. To fund the upkeep of the monastery buildings, once every month each monk would buy a single gold coin with the small coins that he had received, and deposit it in a chest through a slot in the lid.

This system worked well for many years.

One month, a monk thought to himself, “What if I drop in a copper coin instead? No-one will know I did it.”

That month, when the chest was opened, it contained nothing but copper coins.