Contra Heighn Contra Me Contra Functional Decision Theory
In my article about why Eliezer was often confidently and egregiously wrong, I gave his decision theory as a solid example of this. Eliezer thinks that the correct theory of rationality—of what it is wise to do in a particular situation—is FDT, which says roughly that you should act in ways such that agents acting in those ways get the most utility. Heighn and I are having a debate about this that will go back and forth for a bit—so here’s my response to Heighn’s first post.
I think basically all of Heighn’s replies rest on the same crucial error, so I won’t go through them line by line. Instead, I’ll briefly explain the core argument against FDT, which has various manifestations, and then why the basic reply doesn’t work. I’ll then address a few remaining points.
The most important argument against FDT is that, while it’s a fine account of what type of agent you want to be, at least in many circumstances, it’s a completely terrible account of rationality—of what it’s actually wise to do in a particular situation. Suppose that there’s an agent who has a very high probability of creating people who, once they exist, will cut off their legs in ways that don’t benefit them. In this case, cutting off one’s legs is clearly irrational—one doesn’t benefit at all and yet is harmed greatly. But despite that, FDT instructs one to cut off one’s legs, because agents of that type get more utility on average.
Heighn’s response to this argument is that this is a perfectly fine prescription. After all, agents who follow FDT’s advice get more utility on average than agents who follow EDT or CDT.
But the CDTists have a perfectly adequate description of this. Sometimes, it pays to be irrational. If there is a demon who will pay only those who are irrational, then it obviously pays to be irrational. But this doesn’t make it rational to be irrational. Cutting off your legs in ways that you know will never benefit you or anyone else is flagrantly irrational—it is bad for everyone—and this is so even if such agents win more.
Heighn would presumably agree with this. After all, if there’s a demon who pays a billion dollars to everyone who follows CDT or EDT, then FDTists will lose out. The fact that you can imagine a scenario where people following one decision theory are worse off is totally irrelevant—the question is whether a decision theory provides a correct account of rationality. But if a theory involves holding that you should cut off your legs for no reason, it clearly does not. It doesn’t matter that the type of agent who cuts off their legs is better off on average—when you’re in the situation, you don’t care about what kinds of agents do what; you only care about the utility that you can get from the act. Thus when Heighn asks:
Another way of looking at this is asking: “Which decision theory do you want to run, keeping in mind that you might run into the Blackmail problem?” If you run FDT, you virtually never get blackmailed in the first place.
they are asking the wrong question. Decision theories are not about what kind of agent you want to be. There is no one on god’s green earth who disputes that the types of agents who one-box are better off on average. Decision theory is about providing a theory of what is rational.
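The gap between the two questions can be made concrete with a minimal sketch of the leg-cutting case. All of the probabilities and utilities below are purely illustrative assumptions, not anything from the original discussion; the point is only that the two criteria come apart:

```python
# Illustrative model of the leg-cutting case. All numbers are assumptions.
# A creator is very likely to create agents whose policy is "cut".

P_CREATED = {"cut": 0.99, "keep": 0.01}   # chance of being created, per policy
U_IF_EXIST = {"cut": 1.0, "keep": 10.0}   # utility once you exist
U_NONEXIST = 0.0                          # utility of never existing

def policy_score(policy):
    # FDT-style question: how well do agents running this policy fare on
    # average, counting the worlds in which they were never created?
    p = P_CREATED[policy]
    return p * U_IF_EXIST[policy] + (1 - p) * U_NONEXIST

def act_score(action):
    # CDT-style question: given that you already exist, which act is better?
    return U_IF_EXIST[action]

best_policy = max(P_CREATED, key=policy_score)  # "cut":  0.99 beats 0.10
best_act = max(U_IF_EXIST, key=act_score)       # "keep": 10 beats 1
print(best_policy, best_act)  # cut keep
```

On these made-up numbers, the policy that does best on average across worlds is "cut", but once you exist, "keep" strictly dominates. That is exactly the sense in which the average-utility criterion answers "which agent would you rather be" rather than "what is it rational to do."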
The next case I give, which comes from Wolfgang Schwarz, is the following:
Procreation. I wonder whether to procreate. I know for sure that doing so would make my life miserable. But I also have reason to believe that my father faced the exact same choice, and that he followed FDT. If FDT were to recommend not procreating, there’s a significant probability that I wouldn’t exist. I highly value existing (even miserably existing). So it would be better if FDT were to recommend procreating. So FDT says I should procreate. (Note that this (incrementally) confirms the hypothesis that my father used FDT in the same choice situation, for I know that he reached the decision to procreate.)
Heighn says I don’t “explain why [I believe] FDT gives the wrong recommendation here.” I think the quote from Schwarz that I provide is pretty clear, but just to make it clearer: FDT instructs one to act in some way if FDT’s prescribing that one act that way produces the highest average utility. Thus, in Procreation, a person gets more utility if FDT prescribes procreating, because the person is more likely to exist in possible worlds where FDT prescribes that. Heighn’s comments here strike me as pretty confused:
This problem doesn’t fairly compare FDT to CDT though. By specifying that the father follows FDT, FDT’ers can’t possibly do better than procreating. Procreation directly punishes FDT’ers—not because of the decisions FDT makes, but for following FDT in the first place.
They could do better. They could follow CDT and never pass up the free value of remaining child-free. This is not a scenario where a person is punished directly for following FDT—it is a scenario where FDT prescribes acting in a certain way because doing so makes your father more likely to have created you, even though he already has, and even though procreating is clearly bad for you. This becomes easier to see with Heighn’s attempted parody:
I can easily make an analogous problem that punishes CDT’ers for following CDT:
ProcreationCDT. I wonder whether to procreate. I know for sure that doing so would make my life miserable. But I also have reason to believe that my father faced the exact same choice, and that he followed CDT. I highly value existing (even miserably existing). Should I procreate?
FDT’ers don’t procreate here and live happily. CDT’ers wouldn’t procreate either and don’t exist. So in this variant, FDT’ers fare much better than CDT’ers.
It is true that we can easily gerrymander a scenario where following any given decision theory ends up being bad for you. But that’s not a problem—decision theories are not theories of what’s good for you. Indeed, there is no across-the-board answer to which decision theory will make your life go best. They are intended as theories of what is rational to do in particular scenarios. Procreating seems clearly irrational here, even if such agents end up getting punished. Again, it’s important to disambiguate the question of “which agent would you rather be” from “which agent is rational.” Rationality doesn’t always pay (e.g., when demons artificially rig things to be bad for the rational).
Once the person already exists, it doesn’t matter what percentage of agents of a certain type exist. They exist—and as such, they have no reason to lose out on free value. Once you already exist, you don’t care about the other agents in the reference class. And these agents are all making decisions after they already exist, so they have no reason to take into account causally irrelevant subjunctive dependence.
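The Procreation case has the same structure, and can be sketched the same way. Again, every number here is an illustrative assumption (the 0.05 chance of existing if the shared policy is to abstain, and the utilities, are made up for the sketch, not taken from Schwarz):

```python
# Toy model of Procreation. All numbers are illustrative assumptions.
# My father ran the same decision procedure I do, so the procedure's
# output affects the probability that I was ever born.

P_EXIST = {"procreate": 1.0, "abstain": 0.05}     # if father ran same policy
U_IF_EXIST = {"procreate": 1.0, "abstain": 10.0}  # miserable vs child-free

def fdt_value(policy):
    # FDT scores the policy across worlds, including those where I was
    # never born (which get utility 0 here)
    return P_EXIST[policy] * U_IF_EXIST[policy]

def cdt_value(action):
    # CDT conditions on the settled fact that I exist; my choice can no
    # longer affect whether I was born
    return U_IF_EXIST[action]

fdt_choice = max(P_EXIST, key=fdt_value)      # "procreate": 1.0 beats 0.5
cdt_choice = max(U_IF_EXIST, key=cdt_value)   # "abstain":   10 beats 1
print(fdt_choice, cdt_choice)  # procreate abstain
```

On these numbers, FDT procreates and lives miserably, while CDT takes the free value of abstaining, since the choice is made only after existence is already settled.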
In short, FDT often prescribes harming yourself in ways that are guaranteed never to benefit you. This would be totally fine if it were a theory of what kind of agent to be, but once you exist, there’s no good reason to harm yourself in ways that are guaranteed to give you no rewards. Its appeal comes entirely from conflating the decision-making procedure you want to have with the one that is rational. Those clearly come apart in lots of situations. Finally, Heighn quotes me saying:
The basic point is that Yudkowsky’s decision theory is totally bankrupt and implausible, in ways that are evident to those who know about decision theory.
In reply, Heighn says:
Are you actually going to argue from authority here?! I’ve spoken to Nate Soares, one of the authors of the FDT paper, many times, and I assure you he “knows about decision theory”.
Yes, yes I am. Not from the authority of me in particular—I’m a random undergraduate who no one should defer to. I do not know of a single academic decision theorist who accepts FDT. When I bring it up with people who know about decision theory, they treat it with derision and laughter. There have been maybe one or two published papers ever defending something in the vicinity. Thus, it is both opposed by nearly all academics and something that I think rests on basic errors.
I don’t know much about Soares, so I won’t comment on how much he knows about decision theory. But I am somewhat dubious that he knows a lot about it, and even if he does, it’s not hard to find one or two informed people defending crazy, fringe positions.
Finally, Heighn accuses MacAskill of misrepresenting FDT. MacAskill says:
First, take some physical processes S (like the lesion from the Smoking Lesion) that causes a ‘mere statistical regularity’ (it’s not a Predictor). And suppose that the existence of S tends to cause both (i) one-boxing tendencies and (ii) whether there’s money in the opaque box or not when decision-makers face Newcomb problems. If it’s S alone that results in the Newcomb set-up, then FDT will recommend two-boxing.
But now suppose that the pathway by which S causes there to be money in the opaque box or not is that another agent looks at S and, if the agent sees that S will cause decision-maker X to be a one-boxer, then the agent puts money in X’s opaque box. Now, because there’s an agent making predictions, the FDT adherent will presumably want to say that the right action is one-boxing.
In response, Heighn says:
This is just wrong: the critical factor is not whether “there’s an agent making predictions”. The critical factor is subjunctive dependence, and there is no subjunctive dependence between S and the decision maker here.
But in this case there is subjunctive dependence. The agent’s report depends on whether the person will actually one-box on account of the lesion. Thus, there is an implausible discontinuity: whether one should one-box ends up depending on the precise causal mechanism behind the contents of the box.
To recap: I think that once we disambiguate the question of what you should do when you’re already in the scenario, and your actions are guaranteed not to affect your odds of existence, from the question of which kind of agent gets more utility on average, FDT seems crazy. It results in the consequence that you should burn money in ways that are guaranteed never to benefit you. My description of it as crazy was somewhat harsh but, I think, accurate.