I would expect something called updateless-CDT to have a causal model of the world, with nodes that it’s picked out (by some magical process) as nodes controlled by the agent, and then it maximizes a utility function over histories of the causal model by following the utility-maximizing strategy, which is a function from states of knowledge at a controlled node (state of some magically-labeled agent nodes that are parents of the controlled node?) to actions (setting the state of the controlled node).
If the magical labeling process has labeled no nodes inside Omega as controlled, then this will probably two-box even on standard Newcomb. On the other hand, if Omega is known to fully simulate the agent, then we might suppose that updateless-CDT plans as if its strategy is controlling Omega’s prediction, and always one-box even with transparent boxes.
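To make the two labelings concrete, here's a toy Python sketch of standard Newcomb (the payoff numbers and the 50/50 prior on Omega's prediction are made up for illustration). If only the action node is marked as controlled, intervening on it leaves the prediction at its prior and two-boxing dominates; if the labeling also marks Omega's simulation as controlled, setting the strategy sets the prediction, and one-boxing wins:

```python
# Toy Newcomb payoffs, keyed by (action, prediction):
# the opaque box holds $1M iff Omega predicted one-boxing; the clear box holds $1k.
PAYOFF = {("one", "one"): 1_000_000, ("one", "two"): 0,
          ("two", "one"): 1_001_000, ("two", "two"): 1_000}

def eu_action_only(action, prior_one=0.5):
    # Only the action node is labeled as controlled: intervening on it
    # leaves Omega's prediction at its prior distribution.
    return (prior_one * PAYOFF[(action, "one")]
            + (1 - prior_one) * PAYOFF[(action, "two")])

def eu_prediction_controlled(action):
    # The labeling also marks Omega's simulation as controlled:
    # setting the strategy sets the prediction, so prediction == action.
    return PAYOFF[(action, action)]

best_cdt = max(["one", "two"], key=eu_action_only)             # "two"
best_ucdt = max(["one", "two"], key=eu_prediction_controlled)  # "one"
```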
I haven’t read Conditioning on Conditionals yet. I am doing so now, but could you explain more about the similarities you were thinking of?
Yeah, I agree that updateless-CDT needs to somehow label which nodes it controls.
You’re glossing over a second magical part, though:
“and then it maximizes a utility function over histories of the causal model by following the utility-maximizing strategy,”
How do you calculate the expected utility of following a strategy? How do you condition on following a strategy? That’s the whole point here. You obviously can’t just condition on taking certain values of the nodes you control, since a strategy takes different actions in different worlds; so, regular causal conditioning is out. You can try conditioning on the material conditionals specifying the strategy, which falls on its face as mentioned.
That’s why I jumped to the idea that UCDT would use the conditioning-on-conditionals approach. It seems like what you want to do, to condition on a strategy, is change the conditional probabilities of actions given their parent nodes.
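Here's a minimal sketch of that idea, with a made-up world and payoff table: a strategy is a mapping from states of the action node's parent to actions, and "conditioning on the strategy" just means replacing the conditional distribution P(action | parent) with the deterministic one the strategy specifies, then taking expectations as usual:

```python
import itertools

# Hypothetical parent node O (the information-state) and its prior.
P_O = {"heads": 0.5, "tails": 0.5}
ACTIONS = ["left", "right"]

def utility(o, a):
    # Made-up payoff table for illustration.
    return {"heads": {"left": 10, "right": 0},
            "tails": {"left": 1, "right": 3}}[o][a]

def eu_of_strategy(strategy):
    # strategy: dict from parent-node state to action.
    # Conditioning on the strategy = setting P(A = strategy[o] | O = o) = 1,
    # then marginalizing over the parent node as usual.
    return sum(P_O[o] * utility(o, strategy[o]) for o in P_O)

# Enumerate every mapping from information-states to actions.
strategies = [dict(zip(P_O, acts))
              for acts in itertools.product(ACTIONS, repeat=len(P_O))]
best = max(strategies, key=eu_of_strategy)
```

Note that the best strategy here takes different actions in different worlds ("left" on heads, "right" on tails), which is exactly what conditioning on a single action value can't represent.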
Also, I agree that conditioning-on-conditionals can work fine if combined with a magical locate-which-nodes-you-control step. Observation-counterfactuals are supposed to be a less magical way of dealing with the problem.
Yeah, I agree that observation-counterfactuals are what you’d like the UCDT agent to be thinking of as a strategy—a mapping between information-states and actions.
The reason I used weird language like “state of magically labeled nodes that are parents of the controlled nodes” is just because of how it’s nontrivial to translate the idea of “information available to the agent” into a naturalized causal model. But if that’s what the agent is using to predict the world, I think that’s what things have to get cashed out into.
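For what it's worth, transparent Newcomb makes a nice test case for that framing, since the information-state (what the agent sees in the box) actually varies. A sketch, assuming the version where Omega fills the box iff the simulated strategy one-boxes upon seeing it full:

```python
import itertools

def eu(strategy):
    # strategy: dict from observation ("full"/"empty") to action ("one"/"two").
    # Omega fills the box iff the strategy one-boxes on seeing it full.
    box = "full" if strategy["full"] == "one" else "empty"
    action = strategy[box]          # the agent acts on what it actually sees
    big = 1_000_000 if box == "full" else 0
    return big if action == "one" else big + 1_000

# Enumerate all four observation-counterfactual strategies.
strategies = [dict(zip(["full", "empty"], acts))
              for acts in itertools.product(["one", "two"], repeat=2)]
best = max(strategies, key=eu)
```

Evaluated updatelessly over whole strategies, the winner one-boxes on seeing the full box, which is the always-one-box behavior mentioned above; conditioning on the action alone can't even express a policy that depends on what's seen.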