A utility function that enjoys moving between those places isn’t the same as a utility function with cycles, which would trade unlimited amounts of time and money for tickets to them that it never cashes.
The argument against this is also going to be somewhat instrumental in flavour, but more along the lines of: that’s a known attractor that few who matter want to be in.
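Here’s a minimal sketch of the distinction I mean (all the names here are made up for illustration, not anyone’s canonical formalism): a time-indexed utility function can happily value moving between places, while a cyclic strict preference over bare places is the thing VNM forbids.

```python
# A utility function over (place, time) outcomes. "Paris at t=0" and
# "Paris at t=3" are different outcomes, so liking to keep moving between
# places introduces no cycle over outcomes.
def road_trip_utility(place: str, t: int) -> float:
    itinerary = ["Paris", "Rome", "Berlin"]
    # Reward being wherever the itinerary says to be at time t.
    return 1.0 if place == itinerary[t % len(itinerary)] else 0.0

# By contrast, a cyclic strict preference over bare, un-timed places:
# Paris > Rome, Rome > Berlin, Berlin > Paris. No real-valued utility
# function can represent this relation, which is what VNM rules out.
CYCLIC_PREFERENCE = {("Paris", "Rome"), ("Rome", "Berlin"), ("Berlin", "Paris")}

def prefers(a: str, b: str) -> bool:
    return (a, b) in CYCLIC_PREFERENCE
```

The road-trip preferences over (place, time) pairs are representable by a real-valued utility; the bare-place cycle isn’t.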
Does this one make more sense? https://www.lesswrong.com/posts/HbkNAyAoa4gCnuzwa/wei-dai-s-shortform?commentId=mrF2hxyp2gbeaLZEZ
The VNM axiom isn’t about road trips; a utility function is allowed to value different things at different times, because the time component distinguishes those things. You aren’t addressing VNM utility here; you’re writing about a misunderstanding of it that you had.
You die if you have VNM cycles. A superior trader eats you (People feel like they can simply stop communicating with the sharps and retire to a simple life in the hills, but this is a very costly solution and I’d prefer to find a real one). You stop existing. This is kind of a much more essential category of instrumental vice than like “I don’t equate money to utility” type stuff (which I wouldn’t call a vice).
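To make the “superior trader eats you” point concrete, here’s a toy money pump (the item names and fee are arbitrary assumptions): the agent strictly prefers A to B, B to C, and C to A, so every single trade the trader offers looks like an upgrade, and each full lap costs a fee.

```python
def money_pump(rounds: int, fee: float = 1.0) -> float:
    prefers = {("A", "B"), ("B", "C"), ("C", "A")}   # (x, y): agent strictly prefers x to y
    next_offer = {"A": "C", "C": "B", "B": "A"}      # the trader always offers something the agent prefers
    holding, losses = "A", 0.0
    for _ in range(rounds):
        offer = next_offer[holding]
        assert (offer, holding) in prefers           # agent prefers the offer, so it accepts and pays the fee
        holding = offer
        losses += fee
    return losses

print(money_pump(3))     # 3.0 -- one full lap, back to holding "A", strictly poorer
print(money_pump(3000))  # 3000.0 -- losses grow without bound
```

Each trade is individually acceptable to the agent; it’s only the composition that bleeds it dry, which is the Dutch book point.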
One criticism of decision theory that you could explore is that many practical philosophy enjoyers would find it difficult to write utility functions that compose scripted components (like “I want A, then B, then C, then A”) with nonscripted components (“I will always instantly trade X for Y, and Y for Z”), and that we may need higher-level abstractions on top of the basics to help people stop conflating ABC with XYZ… but… is it really going to be complicated? That one doesn’t seem like it’s going to be complicated to me.
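A minimal sketch of why I don’t think it’s complicated (my framing, everything here is illustrative): score whole histories rather than single items, and the scripted “A, then B, then C, then A” becomes one highly-ranked trajectory, not a pairwise trade loop.

```python
from typing import Sequence

SCRIPT = ("A", "B", "C", "A")

def history_utility(history: Sequence[str]) -> float:
    # Reward how long a prefix of the script the history has followed.
    matched = 0
    for got, wanted in zip(history, SCRIPT):
        if got != wanted:
            break
        matched += 1
    return float(matched)

print(history_utility(("A", "B", "C", "A")))  # 4.0: the full script
print(history_utility(("A", "C", "B", "A")))  # 1.0: went off-script after A
```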
What does seem difficult is expressing constrained indifference about utility function changes. That seems to be common in humans (e.g., I’m indifferent to the change/annihilation of my values if it’s being done by beautiful and cool things like love, literary fiction, or reason, but I hate it if it’s being done by ugly or stupid or hostile things) and is needed for ASI alignment (corrigibility), but it seems tricky to define a utility function that permits it (though, again, I don’t know whether it turns out to be tricky in practice).
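Here’s a toy statement of the desideratum, emphatically not a solution (every name in it is an assumption for illustration): the agent’s attitude to a change in its own values is conditioned on what caused the change. Writing the predicate down is easy; making it sit coherently inside something the agent actually optimizes is the part that seems tricky.

```python
# Toy encoding of "constrained indifference to value changes" (illustrative
# names only): no penalty for value changes from approved causes, a penalty
# otherwise. This states the preference; it doesn't make it stable under
# optimization, which is where corrigibility gets hard.

APPROVED_CAUSES = {"love", "literary fiction", "reason"}

def attitude_to_value_change(old_values, new_values, cause: str) -> float:
    if old_values == new_values:
        return 0.0      # nothing changed, nothing to be indifferent about
    if cause in APPROVED_CAUSES:
        return 0.0      # constrained indifference: approved causes carry no penalty
    return -1.0         # change by ugly/stupid/hostile causes is bad
```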
How are you distinguishing a Dutch book argument from an assumption that the utility function takes some specific form (e.g. doesn’t have a time index)? Cf. https://www.lesswrong.com/posts/HbkNAyAoa4gCnuzwa/wei-dai-s-shortform?commentId=SqrgPRinYbh8JCoaN
(It’s a genuine question, I’m not meaning to assert there isn’t such a meaningful distinction; I think there is one.)
I don’t know what you’re asking. The answer is either trivial or mu depending on what you mean by specific form. I think if you could articulate what you’re asking you wouldn’t have to ask it.
( https://tsvibt.blogspot.com/2025/11/id-probably-need-more-proof-of-work-of.html )