I agree, that’s a serious issue with the setup here. The simple answer is that I didn’t think of that when I was writing the post. I later noticed the problem, but how to react isn’t totally obvious.
Defense #1: An easy response is that I was talking about updateful DTs in my smoking lesion discussion. If a DT learns, it is hard to see why it would have seriously miscalibrated estimates of its own behavior. For UDT, there is no similar argument. Therefore the post as written above stands.
Reply: Perhaps that’s not very satisfying, though—despite UDT’s fixed prior, failure due to lack of calibration about oneself seems like a particularly damning sort of failure. We might construct the prior using something similar to a reflective oracle to block this sort of problem.
Defense #2: Then, the next easy response is that material-conditional-based UDT 1.0 with such a self-knowledgeable prior has two possible fixed points. The probability distribution described in the post isn’t one of them, but one with a more extreme assignment favoring dancing is: if the prior expects the agent to dance with certainty or almost certainly, then dancing looks good, and not dancing looks like a way to guarantee you don’t get the money. Again, the concern raised in the post is a valid one, just requiring a tweak to the probabilities in the example.
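To make the "extreme assignment" concrete, here is a minimal sketch of conditioning on a material conditional in a toy prior. Everything specific in it is an assumption for illustration rather than something taken from the post: the prior gives money with probability 1/2, the agent dances with probability q independently of the money, and utility is just whether the agent gets the money. The function names are hypothetical.

```python
from fractions import Fraction

def p_money_given_policy(q, dance_if_money, p_money=Fraction(1, 2)):
    """P(money | material conditional) in a toy prior (all numbers assumed).

    q is the prior probability that the agent dances, taken to be
    independent of whether it gets the money.
    """
    # Worlds are (money, dance) pairs with independent coordinates.
    worlds = {
        (m, d): (p_money if m else 1 - p_money) * (q if d else 1 - q)
        for m in (True, False)
        for d in (True, False)
    }

    # "money -> dance" (or "money -> not dance") read as a material
    # conditional: true in every no-money world, and in the money
    # worlds where the action matches the policy.
    def holds(m, d):
        return (not m) or (d == dance_if_money)

    evidence = sum(p for (m, d), p in worlds.items() if holds(m, d))
    joint = sum(p for (m, d), p in worlds.items() if m and holds(m, d))
    return joint / evidence

def preferred_policy(q):
    """True if 'dance if money' looks at least as good as 'don't dance if money'."""
    return p_money_given_policy(q, True) >= p_money_given_policy(q, False)

if __name__ == "__main__":
    for q in (Fraction(0), Fraction(1, 2), Fraction(9, 10), Fraction(1)):
        print(q, p_money_given_policy(q, True), p_money_given_policy(q, False))
```

In this toy model, a prior with q = 1 makes "don't dance if money" look like it guarantees no money, so the agent prefers to dance, which agrees with the prior. A prior with q = 0 prefers not dancing in the same self-consistent way. Whether these match the fixed points of the actual setup depends on details the sketch does not model.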
Reply: Sure, but the solution in this case is very clear: you have to select the best fixed point. This seems like an option which is available to the agent, or to the agent designer.
Defense #3: True, but then you’re essentially taking a different counterfactual to decide the consequences of a policy: consideration of what fixed point it puts you in. This implies that you have something richer than just a probability distribution to work with, vindicating the overall point of the post, which is to discuss an issue which arises if you try to “condition on a conditional” when given only a probability distribution on actions and outcomes. Reasoning involving fixed points is going to end up being a (very particular) way to add a more basic counterfactual, as suggested by the post.
Also, even if you do this, I would conjecture there’s going to be some other problem with using the material conditional formulation of conditioning-on-conditionals. I would be interested if this turned out not to be true! Maybe there’s some proof that the material-conditional approach turns out to be equivalent to other possible approaches under some assumptions relating to self-knowledge and fixed points. That would be interesting.
Also also, if we take the fixed-point idea seriously, there are problems we run into there as well. Reflective oracles (and their bounded cousins, for constructing computable priors) don’t offer a wonderful notion of counterfactual. Selecting a fixed point offers some logical control over predictors which themselves call the reflective oracle to predict you, but if a predictor does something else (perhaps even re-computes the reflective oracle in a slightly different way, side-stepping a direct call to it but simulating it anyway), the result of using selection of fixed point as a notion of counterfactual could be intuitively wrong. You could try to define a special type of reflective oracle which lacks this problem. You could also try other options like conditional oracles. But it isn’t clear how everything should fit together. In particular, if the oracle itself is treated as a part of the observation, what is the type of a policy?
So, “select the best fixed point” may not be the straightforward option it sounds like.
Reply: This seems to not take the concern seriously enough. The overall type signature of “conditioning on conditionals” seems wrong here. The idea of having a probability distribution on actions may be wrong, stopping the argument in the post in its tracks. That is, the post may be right in its conclusion that there is a problem, but we should have been reasoning in a way which never went down that wrong path in the first place, and the conclusion of the post makes too small a change to accomplish that.
For example, maybe distributed oracles offer a better picture of decision-making: the real process of deciding occurs in the construction of the fixed point, with nothing left over to decide once a fixed point has been constructed.
Clearly matters are getting too complicated for a simple correction to the argument in the post.
Defense #4: I still stand by the post as a cautionary tale about how not to define UDT, barring any “if you deal with self-reference appropriately, the material conditional option turns out to be equivalent to [some other options]” result, which could make me think the problem is more fundamental as opposed to a problem with a naive material-conditional approach to conditioning. The post might be improved by explicitly dealing with the self-reference issue. However, it’s not totally clear how to do so (i.e., ‘select the best fixed point’ seems to fix things on the surface but has its own more subtle issues when considered as a general approach), which makes such a treatment potentially very complicated. So it seems better to look at the happy dance problem without explicitly worrying about all of that.
The basic point of the post is that formally specifying UDT is complicated even if you assume classical Bayesian probability without worrying about logical uncertainty. Making UDT into a simple well-defined object requires the further assumption that there’s a basic ‘policy’ object (the observation counterfactual, in the language of the post), with known probabilistic relationships to everything else. This essentially just gives you all the counterfactuals you need, raising the question of where such counterfactual information comes from. This point stands, however naive we might think such an approach is.