gjm comments on What is Wei Dai’s Updateless Decision Theory?

gjm 7 Jul 2015 14:23 UTC
4 points
UDT says: Instead of choosing actions piecemeal, make a once-for-all choice of how you will behave in all situations. (This choice may affect what happens not only via your actions but also because there may be things in the universe that are able to predict your behaviour and act accordingly.)

That’s pretty much it.

More tersely: “Choose strategies, not actions.”

What’s nice about it: (1) It’s simple. (2) There are scenarios (mostly rather implausible ones, though things that resemble them may arise in real life) where other decision theories give “bad” answers, i.e., ones where it seems a different choice would have left you better off. In those scenarios UDT lets you do better than they do.

Example 1: Newcomb’s paradox. You are confronted by a superpowered being (conventionally called Omega) who, you have good reason to believe, is fantastically successful in predicting people’s actions. It places in front of you two boxes. One is transparent and contains $1. The other is opaque. Omega explains that you may take neither box, either box, or both boxes, that it has already predicted your choice, and that the opaque box contains $1M if Omega predicted you would not take the transparent box and nothing otherwise.

Some decision theories would have you reason as follows: “Whatever Omega has done, it has already done, and my action now makes no difference to it. Taking the transparent box gets me an extra $1. Taking the opaque box may or may not get me an extra $1M, depending on things now outside my control. So I’ll take both boxes.” If you reason that way, you almost certainly get just $1.

UDT says: When you are deciding on how you will act in every possible situation, work out what happens for each choice of boxes to take here. The result is (of course) that you should take just the opaque box. So, when the situation actually arises, you take just the opaque box. Omega can reliably predict that you will do so, so you almost certainly get $1M.
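To make the “work out what happens for each choice” step concrete, here is a minimal sketch in Python. It is only an illustration, not anything from Wei Dai’s formulation: the names (`payoff`, `STRATEGIES`, `accuracy`) are made up, and it assumes that when Omega mispredicts, it predicts the opposite of what you actually do with the transparent box.

```python
from itertools import product

TRANSPARENT = 1            # dollars in the transparent box (as in the example)
OPAQUE_PRIZE = 1_000_000   # dollars Omega may have put in the opaque box

# A strategy is a pair (take_transparent, take_opaque).
STRATEGIES = list(product([False, True], repeat=2))

def payoff(strategy, accuracy=1.0):
    """Expected dollars for a strategy.

    `accuracy` is the (assumed) probability that Omega predicts correctly
    whether you take the transparent box.
    """
    take_transparent, take_opaque = strategy
    # Omega fills the opaque box iff it predicts you leave the transparent one.
    p_filled = accuracy if not take_transparent else 1.0 - accuracy
    total = TRANSPARENT if take_transparent else 0
    if take_opaque:
        total += p_filled * OPAQUE_PRIZE
    return total

# Choose the strategy, not the action: compare whole strategies up front.
best = max(STRATEGIES, key=payoff)

for s in STRATEGIES:
    print(s, payoff(s))
print("best strategy:", best)
```

With a perfect predictor the best strategy comes out as “leave the transparent box, take the opaque one”; lowering `accuracy` to something like 0.99 doesn’t change that.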

Of course you can’t really decide ahead of time what you will do in every possible situation. But what you can do is make a commitment that you will always attempt to act as if you had done so. Then, when Omega astonishes you by appearing before you and presenting you with this choice, you can go through the above reasoning even if you’ve never considered the matter before.

Example 2: Parfit’s hitchhiker. You are very rich and utterly selfish, and you are stranded in the desert far from civilization; you will die if you don’t get help. A car comes along, you flag it down, and you are surprised to see it being driven by Omega—who, let’s remember, is extremely good at predicting your behaviour. What you would like to say is: “Please, please, please drive me to the nearest city. Then I can get to a bank and give you a big pile of money as a reward for saving my life”. But if you follow some varieties of decision theory, then as soon as you’re back in the city you will run off and pay Omega nothing; and Omega, having the ability to predict your behaviour, will see this and decline to help. (We shall suppose for the sake of argument that despite Omega’s great powers, it is in want of both time and money.)

UDT says: When you are deciding on how you will act in every possible situation, work out what happens for each choice when you get into the city. The result is (of course) that you should pay Omega. So, when the situation actually arises, you pay up. Omega can reliably predict that you will do so, and will almost certainly save your life in order to get the reward.
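The contrast between choosing a policy and choosing an action once you’re already in the city can be sketched like this. Again this is just an illustration under stated assumptions: the dollar figures are invented, dying is scored as 0, Omega is taken to predict perfectly, and the function names are hypothetical.

```python
REWARD = 10_000            # hypothetical payment promised to Omega
VALUE_OF_LIFE = 1_000_000  # hypothetical dollar value you place on surviving

def outcome(pays_in_city):
    """Utility of the whole policy 'pay (or don't) once I reach the city'."""
    # Assumption: Omega predicts perfectly, so it rescues you iff you would pay.
    rescued = pays_in_city
    if not rescued:
        return 0  # left in the desert
    return VALUE_OF_LIFE - REWARD

def policy_choice():
    """UDT-style: pick the policy with the best overall outcome."""
    return max([True, False], key=outcome)

def in_city_choice():
    """Piecemeal: already rescued, so only the cost of paying now is counted."""
    return max([True, False], key=lambda pays: -REWARD if pays else 0)

print("policy chosen in advance pays:", policy_choice())
print("action chosen in the city pays:", in_city_choice())
print("outcome for the piecemeal chooser:", outcome(in_city_choice()))
```

The piecemeal chooser would not pay, so a perfect predictor never picks it up and it ends with the desert outcome; the policy chooser pays and gets rescued.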

There is probably no Omega in the real world. But there are other people who are quite good at predicting people’s actions; they may be quite good at detecting dishonesty and malicious intent and so forth. Dealing with them may resemble dealing with Omega in some respects.