My instinct is not to kill the loved one, but on virtue-ethics grounds rather than because of any sort of counterfactual reciprocity argument. My understanding is that UDT is not actually computable. As a result, no possible agent can act as you describe. So this doesn’t seem like a particularly compelling thought experiment.
If I’m deciding what to do with a hostage, it makes no difference what the other party decides. What matters is my judgement of them right before we became causally separated, and I am skeptical that my decision-making after the separation is useful evidence on this point.
More broadly, I can think of lots of reasons to take counterfactual possibilities into account. But none of them require me to say that the counterfactual “really exists”. For instance, I’m worried about people judging me for being reckless, dishonorable, etc. What’s the case where I actually care about non-causal interactions?
> My understanding is that UDT is not actually computable. As a result, no possible agent can act as you describe. So this doesn’t seem like a particularly compelling thought experiment.
Are you confusing UDT with AIXI? It is certainly possible for an agent to act as described, and the tricky part isn’t anything to do with UDT; it is the possible but difficult task of making the predictions.
> What’s the case where I actually care about non-causal interactions?
The case given is sufficient. Anyone who is capable of one-boxing on Newcomb’s problem will, if consistent, also cooperate on utility-maximisation grounds with agents that cross out of the future light cone, given the payoffs described. If they either two-box or defect, then they are implementing a faulty decision algorithm. For an example that doesn’t include any potential exploitation of loved ones, see Belief in the Implied Invisible.
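For concreteness, here is a minimal expected-value sketch of the one-boxing side of that claim. The payoff numbers are the standard Newcomb values and the predictor accuracy p is an assumption for illustration; neither comes from this thread.

```python
# Evidential expected value of one-boxing vs. two-boxing on Newcomb's
# problem, for a predictor with assumed accuracy p (hypothetical parameter).

def expected_value(one_box: bool, p: float = 0.99) -> float:
    big, small = 1_000_000, 1_000  # standard Newcomb payoffs
    if one_box:
        # With probability p the predictor foresaw one-boxing and filled the big box.
        return p * big
    # With probability p the predictor foresaw two-boxing and left the big box empty.
    return small + (1 - p) * big

print(expected_value(True))   # 990000.0
print(expected_value(False))  # 11000.0
```

Under this calculation, any predictor accuracy above about 50.05% already makes one-boxing the higher-expected-value choice.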
My understanding is that UDT requires agent A to have a prediction of what agent B will do. This is, in general, not computable. (The proof follows from Rice’s theorem.)
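To make the computability worry concrete, here is a sketch of the self-reference that drives such undecidability results: any total prediction function can be defeated by an agent that consults it and does the opposite. The names contrarian and naive_predict are hypothetical, chosen for illustration.

```python
# No total function can correctly predict every agent's action:
# an agent with access to the predictor can simply invert its output.

def contrarian(predict):
    """Ask the predictor what we will do, then do the opposite."""
    return "defect" if predict(contrarian) == "cooperate" else "cooperate"

def naive_predict(agent):
    """Stand-in for any candidate total predictor."""
    return "cooperate"

# naive_predict says "cooperate", so contrarian defects, refuting the prediction.
print(contrarian(naive_predict))  # prints "defect"
```

This is the halting-problem-style diagonalization behind the Rice’s theorem point: fully general prediction is out, so any real predictor must work over a restricted class of agents or settle for approximation.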