AlephNeil comments on What is Wei Dai’s Updateless Decision Theory?

AlephNeil 20 May 2010 7:26 UTC
20 points
Yeah, but the thing is: I don’t think there is very much to it. (And in fact about 25% of my motivation for writing this is to see whether others will ‘correct me’ on that.)

If I say “it’s extremely simple, it’s blurgh” that means among other things “it is blurgh”. Not “it is blurgh and a whole bunch of other stuff which I’m never going to get round to.”

To be fair, one thing I haven’t mentioned at all is the concept of logical uncertainty, which plays a critical role in TDT (which was after all the motivation for UDT) and in a number of past threads where UDT was under discussion. But again, I personally don’t think we need to go into this to explain what UDT is.
- PhilGoetz 20 May 2010 17:51 UTC
  2 points
  The description you gave is not enough for me to have any idea what it is. This seems to defeat the purpose of your post.
  - AlephNeil 20 May 2010 23:56 UTC
    8 points
    So what are you saying?
    
    (a) You disagree that UDT is what I say it is? (b) You don’t understand the sentence where I say what UDT is? (c) You take for granted that what I’ve talked about as UDT is indeed part of UDT but you think all the non-trivial stuff hasn’t been touched on?
    - PhilGoetz 24 May 2010 21:45 UTC
      2 points
      (b). But you don’t need to try to explain it—I need to study Wei Dai’s posts. After which your post might make perfect sense to me. I was just hoping, from its name, that I wouldn’t have to do that.
      - AlephNeil 25 May 2010 3:58 UTC
        6 points
        It’s a good idea to look at Wei’s posts, of course, but in terms of presentation, the original UDT post is a very long way away from mine, and it won’t immediately be evident why I phrased my definition of UDT as I did.
        
        If you want to understand my post purely on its own terms, then the key concept (besides probability and conditional probability) is just that of a game. If we have a one-player game, and we fix the player’s strategy, then we obtain a probability distribution over ‘branches’, and a utility lying at the end of each branch. And these are exactly the ingredients we need to calculate an expected utility. So UDT is simply the instruction ‘choose the strategy that yields the greatest expected utility’. The reason it’s “updateless” is that the probability distribution with respect to which we’re calculating expected utilities is the ‘prior’ rather than the ‘posterior’: we haven’t ‘conditioned on’ the subset of branches that pass through a particular information state.
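
The recipe above can be sketched in a few lines, using Counterfactual Mugging as the one-player game. This is only an illustration under stated assumptions: the fair coin, the payoffs (10,000 and -100) and every name in the code are hypothetical choices made for the example, not taken from Wei Dai's posts. Fixing a strategy yields a list of branches, each with a probability and a utility; the updateless choice maximises expected utility over all branches, while the 'updating' choice first restricts to the branches passing through one information state.

```python
STRATEGIES = ["pay", "refuse"]

def branches(strategy):
    """Return (probability, info_state, utility) for each branch of the game."""
    pays = strategy == "pay"
    return [
        # Heads: you are rewarded only if your strategy is to pay on tails.
        (0.5, "heads", 10_000 if pays else 0),
        # Tails: you are asked for 100 and either hand it over or refuse.
        (0.5, "tails_asked", -100 if pays else 0),
    ]

def expected_utility(strategy, given=None):
    """Expected utility of a strategy, optionally conditioned on an info state."""
    kept = [(p, u) for p, state, u in branches(strategy)
            if given is None or state == given]
    total = sum(p for p, _ in kept)
    return sum(p * u for p, u in kept) / total

# Updateless: maximise with respect to the prior over all branches.
udt_choice = max(STRATEGIES, key=expected_utility)

# Updating: condition on the branches where you have actually been asked.
updating_choice = max(
    STRATEGIES, key=lambda s: expected_utility(s, given="tails_asked")
)

print(udt_choice, updating_choice)
```

With these assumed numbers the two calculations disagree: the prior calculation favours paying, while the calculation conditioned on having been asked favours refusing.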
        
        For each of Newcomb’s Problem, Parfit’s Hitchhiker, Counterfactual Mugging and the Absent-Minded Driver, there is a sense in which when you ‘condition on the blue box’ you choose a different strategy than when you don’t. (This is paradoxical because, intuitively, what you ought to decide to do at a given time shouldn’t depend on whether you’re contemplating the decision from afar, timelessly, or actually there ‘in the moment’.)
        
        (Technical Note: The concept of ‘conditioning on the blue box’ can be a bit more complicated than just ‘conditioning on an event’. For instance, in the case of Newcomb’s problem, you find that one-boxing is optimal if you don’t condition on anything, but two-boxing is optimal if you condition on the sigma-algebra generated by the event ‘predictor predicts that you will one-box’.)
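
One way to read the technical note is sketched below for Newcomb's Problem. The predictor accuracy (99%), the payoffs (1,000,000 in the opaque box, 1,000 in the visible one) and all names are assumptions chosen for illustration. Conditioning on the sigma-algebra generated by the prediction is modelled here as computing the expected utility separately within each cell of the partition {predicts one-box, predicts two-box}.

```python
from fractions import Fraction

ACCURACY = Fraction(99, 100)  # assumed predictor accuracy (hypothetical)
ACTIONS = ["one-box", "two-box"]

def payoff(action, prediction):
    """Opaque box is filled iff one-boxing was predicted; two-boxers also take 1,000."""
    opaque = 1_000_000 if prediction == "one-box" else 0
    visible = 1_000 if action == "two-box" else 0
    return opaque + visible

def newcomb_branches(action):
    """Return (probability, prediction, utility) per branch for a fixed action."""
    return [
        (ACCURACY if prediction == action else 1 - ACCURACY,
         prediction,
         payoff(action, prediction))
        for prediction in ACTIONS
    ]

def expected_utility(action, prediction=None):
    """Expected utility, optionally restricted to one cell of the partition."""
    kept = [(p, u) for p, pred, u in newcomb_branches(action)
            if prediction in (None, pred)]
    return sum(p * u for p, u in kept) / sum(p for p, _ in kept)

# No conditioning: compare actions under the prior.
best_unconditioned = max(ACTIONS, key=expected_utility)

# Conditioning on the prediction: compare actions within each cell.
best_given = {
    pred: max(ACTIONS, key=lambda a: expected_utility(a, pred))
    for pred in ACTIONS
}

print(best_unconditioned, best_given)
```

Under these assumptions one-boxing wins when nothing is conditioned on, while two-boxing wins inside each cell of the partition, since within a cell the contents of the opaque box no longer vary with the action.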