1) A good decision theory should always do what it would have precommitted to doing.
It’s dangerous to phrase it this way, since coordination (which is what really happens) allows using more knowledge than was available at the time of a possible precommitment, as I described here.
4) Writing P is equivalent to supplying only one bit: should P pay up if asked?
Not if the correct decision depends on an abstract fact that you can’t access, but can reference. In that case, P should implement a strategy of acting depending on the value of that fact (computing and observing that value to feed to the strategy). That is, abstract facts that will only be accessible in the future play the same role as observations that will only be accessible in the future, and a strategy can be written conditionally on either.
The difference between abstract facts and observations, however, is that observations may tell you where you are without telling you what exists and what doesn’t (both counterfactuals exist and have equal value; you’re just in one of them), while abstract facts can tell you what exists and what doesn’t (the other logical counterfactual doesn’t exist and has zero value).
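A toy numerical contrast may make this concrete (this is my own illustration, not from the thread; the payoffs, a cost of 100 for paying and a reward of 10,000 on the other branch, are made up):

```python
# Toy illustration with made-up payoffs, contrasting conditioning on an
# ordinary observation with conditioning on an abstract (logical) fact.

def value_given_observation(strategy):
    # An observation (e.g. a fair coin) splits the world into branches
    # that all exist; a strategy's value averages over them.
    payoff = {("heads", "pay"): -100, ("heads", "refuse"): 0,
              ("tails", "pay"): 10000, ("tails", "refuse"): 0}
    return sum(0.5 * payoff[(obs, strategy(obs))] for obs in ("heads", "tails"))

def value_given_logical_fact(strategy, true_value):
    # An abstract fact has only one true value; the "other branch"
    # doesn't exist and contributes nothing.
    payoff = {("odd", "pay"): -100, ("odd", "refuse"): 0,
              ("even", "pay"): 10000, ("even", "refuse"): 0}
    return payoff[(true_value, strategy(true_value))]

always_pay = lambda branch: "pay"
print(value_given_observation(always_pay))          # 0.5*(-100) + 0.5*10000 = 4950.0
print(value_given_logical_fact(always_pay, "odd"))  # only the odd branch exists: -100
```

Under an ordinary observation both branches carry weight, so a committed “always pay” policy nets 4950 in expectation; under a logical fact the losing branch simply doesn’t exist, so the same policy just loses 100.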
4) Writing P is equivalent to supplying only one bit: should P pay up if asked?
Not if the correct decision depends on an abstract fact that you can’t access, but can reference.
In general, the distinction is important. But, for this puzzle, the proposition “asked” is equivalent to the relevant “abstract fact”. The agent is asked iff the millionth digit of pi is odd. So point (4) already provides as much of a conditional strategy as is possible.
It’s assumed that the agent doesn’t know if the digit is odd (and whether it’ll be in the situation described in the post) at this point. The proposal to self-modify is a separate event that precedes the thought experiment.
Yes. Similarly, it doesn’t know whether it will be asked (rather than do the asking) at this point.
I see, so there’s indeed just one bit, and it should be “don’t cooperate”.
This is interesting in that UDT likes to ignore the epistemic significance of observations, but here we have an observation that implies something about the world, rather than just telling the agent where it is. How does one reason about strategies if different branches of those strategies tell you something about the value of the other branches?..
Good point, thanks. I think it kills my argument.
ETA: no, it doesn’t.
As Tyrrell points out, it’s not that simple. When you’re considering the strategy of what to do if you’re on the giving side of the counterfactual (“Should P pay up if asked?”), the fact that you’re in that situation already implies all you wanted to know about the digit of pi, so the strategy is not to play conditionally on the digit of pi, but simply to pay up or not: one bit, as you said. But the value of the decision on that branch of the strategy follows from the logical implications of being on that branch, which is something new for UDT!
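One way to see the collapse: since (in this puzzle) being asked is equivalent to the digit being odd, any strategy conditioned on the digit’s parity prescribes, on the branch where the agent is actually asked, exactly what one of the two unconditional answers prescribes. A rough sketch (my own, tracking actions only, since the thread doesn’t spell out payoffs):

```python
from itertools import product

# A "conditional strategy" maps the digit's parity to an action.
actions = ("pay", "refuse")
conditional_strategies = [
    {"odd": a_odd, "even": a_even}
    for a_odd, a_even in product(actions, repeat=2)
]

# The agent is asked iff the millionth digit of pi is odd, so on the
# branch where it is actually asked, parity is pinned to "odd".
behaviors_when_asked = {s["odd"] for s in conditional_strategies}
print(behaviors_when_asked)  # the four conditional strategies collapse to two behaviors
```

All four parity-conditional strategies induce only two distinct behaviors on the branch where the agent is asked, which is the single bit in question.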