ryan_greenblatt comments on shortplav

ryan_greenblatt 30 Oct 2025 16:52 UTC
6 points
0
Isn’t there already a kinda reasonable solution via something like UDASSA? See e.g. here (and this response to Joe’s objections here).
- niplav 30 Oct 2025 21:14 UTC
  2 points
  0
  My understanding is that UDASSA doesn’t give you unbounded utility, by virtue of directly assigning $U(\mathrm{eval}(p)) \propto 2^{-|p|}$, and the sum of utilities is proportional to $\sum_{i=0}^{\infty} 2^{-i} = 2$. The whole dance I did was in order to be able to have unbounded utilities. (Maybe you don’t care about unbounded utilities, in which case UDASSA seems like a fine choice.)
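  
  As a rough sketch of why the total stays bounded: the partial sums of the geometric series have a closed form, and if one additionally assumes that programs come from a prefix-free set $P$ (standard for universal priors, but an assumption not stated in this thread), Kraft’s inequality caps the total weight directly:
  
  $$\sum_{i=0}^{n} 2^{-i} = 2 - 2^{-n} \xrightarrow{\; n \to \infty \;} 2, \qquad \sum_{p \in P} 2^{-|p|} \le 1.$$
  
  Either way, if $U(\mathrm{eval}(p)) \le c \cdot 2^{-|p|}$ for some constant $c$ (a hypothetical proportionality constant), total utility is at most a constant multiple of a convergent sum.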
  
  (I think that the other horn of de Blanc’s proof is satisfied by UDASSA, unless the proportion of non-halting programs bucketed by simplicity declines faster than any computable function. Do we know this? “Claude!…”)
  
  Edit: Claude made up plausible nonsense, but GPT-5, upon request, was correct: the proportion of halting programs declines more slowly than some computable functions.
  
  Edit 2: Upon some further searching (and soul-searching), I think UDASSA is currently underspecified wrt whether its utility is bounded or unbounded. For example, the canonical explanation doesn’t mention utility at all, and none of the other posts about it mention how exactly utility is defined.
  - interstice 30 Oct 2025 22:21 UTC
    4 points
    0
    The “UDASSA/UDT-like solution” is basically to assign some sort of bounded utility function to the output of various Turing machines weighted by a universal prior, like here. That said, Wei Dai doesn’t specify in that post that the preference function has to be bounded, and he allows preferences over entire trajectories (but I think you should be able to do away with that by having another Turing machine run the first one and evaluate any particular property of its trajectory).
    
    “Bounded utility function over Turing machine outputs weighted by simplicity prior” should recover your thing as a special case, actually, at least in the sense of having identical expected values. You could have a program which outputs 1 utility with probability 2^-[(log output of your utility Turing machine) - (discount factor of your utility Turing machine)]. That this is apparently also the same as Eliezer’s solution suggests there might be convergence on a unique sensible way to do EU maximization in a Turing-machine-theoretic mathematical multiverse.
    - niplav 30 Oct 2025 23:36 UTC
      4 points
      2
      It’s a bit of a travesty there’s no canonical formal write-up of UDASSA, given all the talk about it. Ugh, TODO for working on this I guess.