I’ve been confused by the traditional description of CDT failing at the Newcomb problem. I understand CDT to be something like “pick the choice with the highest expected value”. This is how I imagine such an agent reasoning about the problem:
“If I one-box, then I get $0 with epsilon probability and $1m with one-minus-epsilon probability, for an expected value of ~$1m. If I two-box, then I get $1k + $1m with epsilon probability and $1k with one-minus-epsilon probability, for an expected value of ~$1k. One-boxing has higher expected value, so I should one-box.”
How does the above differ from actual CDT? Most descriptions I’ve heard have the agent considering Omega’s prediction as a single unknown variable with some distribution and then showing that this cancels out of the EV comparison, but what’s wrong with considering the two EV calculations independently of each other and only comparing the final numbers?
I struggled with this for a long time. I forget which of the LW regulars finally explained it simply enough for me. CDT is not “classical decision theory”, as I previously believed, and it does not summarize to “pick the highest expected value”. It’s “causal decision theory”, and it optimizes results according to a (limited) causal model, one which does not allow the box contents to be influenced by the agent’s later-in-time choice.
A “naive, expectation-based decision theory” one-boxes based on probability assignments, regardless of causality: it shuts up and multiplies (sums probability times outcome). But it’s not a formal predictive model (which a causal model is), so it doesn’t help much in designing and exploring artificial agents.
IOW, causal decision theory is only as good as its causal model, which is pretty bad for situations like this.
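If it helps to see the difference in numbers, here is a minimal sketch of the two calculations. The epsilon value and dollar amounts are illustrative assumptions, not part of the problem statement:

```python
# Minimal sketch of the two expected-value calculations for Newcomb's problem.
# Assumed values (hypothetical): predictor error rate EPS, payoffs of
# $1,000,000 (opaque box) and $1,000 (transparent box).

EPS = 0.01          # predictor error rate
MILLION = 1_000_000 # opaque box contents if the predictor expects one-boxing
THOUSAND = 1_000    # transparent box contents

# "Naive"/expectation-based EV: condition the box contents on the choice.
ev_one_box_naive = (1 - EPS) * MILLION + EPS * 0
ev_two_box_naive = EPS * (MILLION + THOUSAND) + (1 - EPS) * THOUSAND
# => one-boxing wins (~$990,000 vs ~$11,000)

# CDT-style EV: the box contents were fixed before the choice, so both
# actions share a single prior probability p that the opaque box is full.
def cdt_ev(p):
    ev_one_box = p * MILLION
    ev_two_box = p * MILLION + THOUSAND  # always exactly $1,000 more
    return ev_one_box, ev_two_box

# For any p, the p * MILLION term cancels out of the comparison and
# two-boxing dominates by $1,000 -- which is why CDT two-boxes.
for p in (0.0, 0.5, 0.99):
    one, two = cdt_ev(p)
    print(f"p={p}: one-box={one:,.0f}, two-box={two:,.0f}")
```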