My short take: your decision algorithm, which outputs saving or not saving the murderer, is instantiated multiple times. Anyone who tries to predict your output also runs a more or less precise simulation of your algorithm. Suppose a perfect predictor made its prediction in the past. In that case, no matter what your decision is, the prediction was the same.
So, you can reason this way: “although I don’t know my final decision yet, I know that it correlates with the prediction perfectly. Therefore I also have to consider the consequences and resulting utilities of the prediction when making the decision. Shouldn’t I then act as if I were controlling the output of both my current algorithm and that of the predictor, weighing the utilities together? I should output the decision that maximizes utility over present and past, because the past prediction mirrors the current me perfectly.”
And if there are imperfect predictors involved (or algorithms with imperfectly correlated outputs), you reason as if you had imperfect control over their outputs. As far as I managed to understand it, this is TDT. Note that there is some interesting self-referentiality: the TDT algorithm computes the expected utility of its own “possible” outputs, and then emits the output with maximum utility.
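The reasoning above can be sketched in a few lines of code. This is only an illustration of the idea, not TDT's actual formalism: the function name, the payoff table (borrowed from the standard Newcomb-style setup), and the single correlation parameter `p` are all my assumptions. The point is that the algorithm evaluates its own possible outputs as if each one also (probabilistically) determined the past prediction, and the perfect-predictor case is just `p = 1.0`.

```python
# Hypothetical sketch of the TDT-style reasoning described above: treat the
# predictor's past prediction as correlated with the decision you output now,
# and pick the output that maximizes expected utility over both.

def tdt_choose(utilities, p):
    """utilities[(decision, prediction)] -> utility.
    p = probability that the past prediction matches the decision
    (p = 1.0 models a perfect predictor)."""
    decisions = {d for d, _ in utilities}

    def expected_utility(d):
        # With probability p the past prediction mirrors this decision;
        # with probability 1 - p the predictor output the other option.
        other = next(x for x in decisions if x != d)
        return p * utilities[(d, d)] + (1 - p) * utilities[(d, other)]

    # Self-referential step: evaluate each "possible" own output, emit the best.
    return max(decisions, key=expected_utility)

# Illustrative Newcomb-like payoff table (numbers are made up):
utilities = {
    ("one-box", "one-box"): 1_000_000,
    ("one-box", "two-box"): 0,
    ("two-box", "one-box"): 1_001_000,
    ("two-box", "two-box"): 1_000,
}

print(tdt_choose(utilities, p=1.0))  # perfect predictor: one-box wins
print(tdt_choose(utilities, p=0.5))  # uncorrelated predictor: two-box wins
```

Dialing `p` between 1.0 and 0.5 is exactly the "imperfect control" intuition: the weaker the correlation, the less your current choice counts as controlling the predictor's past output.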