You know, you’re right.
I was thrown off by the word “precommit”, which implies a reflectively inconsistent strategy, and that is anathema to TDT. On the other hand, rational agents win, so having that strategy does make sense in this case, even though we might incur negative utility relative to playing submissively if we actually had to carry it out.
The solution, I think, is to be “the type of agent who would be ruthlessly vindictive against opponents who have enough predictive capability to see that I’m this type of agent, and enough strategic capability to accept that this means they gain nothing by defecting against me.” That makes it a reflectively consistent part of a decision theory, by keeping the negative-utility behavior in the realm of the pure counterfactual. As long as you know that having that strategy will effectively deter the other player, I think it can work.
And if not, or if I’ve made an error in some detail of my reasoning about how to make it work, I’m fairly confident at this point that an ideal TDT-agent could find a valid way to address the problem case in a reflectively consistent and strategically sound manner.
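
To make the deterrence argument concrete, here is a toy sketch in Python. Everything in it is an assumption made for illustration: the payoff numbers, the policy names "submissive" and "vindictive", and the idea that the opponent predicts my policy perfectly and simply maximizes its own payoff. It is not a model of TDT itself, only of the claim that retaliation can stay purely counterfactual.

```python
# Hypothetical payoffs, chosen only for illustration.
# Keys: (opponent_move, my_response) -> (my_utility, opponent_utility)
PAYOFFS = {
    ("cooperate", "none"): (3, 3),
    ("defect", "submit"): (1, 5),
    ("defect", "retaliate"): (0, 0),  # retaliation hurts me too
}


def my_response(policy, opponent_move):
    """What I do after the opponent moves, given my fixed policy."""
    if opponent_move == "cooperate":
        return "none"
    return "retaliate" if policy == "vindictive" else "submit"


def opponent_best_move(policy):
    """Assumed: the opponent predicts my policy perfectly and
    picks the move that maximizes its own utility."""
    return max(
        ("cooperate", "defect"),
        key=lambda move: PAYOFFS[(move, my_response(policy, move))][1],
    )


def outcome(policy):
    """Return the opponent's move and the resulting (my, their) payoffs."""
    move = opponent_best_move(policy)
    return move, PAYOFFS[(move, my_response(policy, move))]


for policy in ("submissive", "vindictive"):
    print(policy, outcome(policy))
```

Under these assumed numbers, the submissive policy invites defection and the vindictive policy never has to be carried out, because the opponent gains nothing by defecting. If the opponent cannot predict my policy, or does not act on the prediction, the sketch no longer applies, which is exactly the condition stated above.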