I generally agree that a creature with inconsistent preferences should respect the values of its predecessors and successors in the same kind of way that it respects the values of other agents (and that the similarity somewhat increases the strength of that argument). It's a subtle issue, especially when we are considering possible future versions of ourselves with different preferences (just as it's always subtle how much to respect the preferences of future creatures whose existence depends on our actions). I lean towards being generous about the kinds of value drift that have occurred over previous millennia (based on some kind of "we could have been in their place" reasoning) while remaining cautious about sufficiently novel kinds of changes in values.
In the particular case of the inconsistencies highlighted by transparent Newcomb, I think that it’s unusually clear that you want to avoid your values changing—because your current values are a reasonable compromise amongst the different possible future versions of yourself, and maintaining those values is a way to implement important win-win trades across those versions.
> In the particular case of the inconsistencies highlighted by transparent Newcomb, I think that it's unusually clear that you want to avoid your values changing—because your current values are a reasonable compromise amongst the different possible future versions of yourself, and maintaining those values is a way to implement important win-win trades across those versions.
I slightly disagree with this. In cases where win-win trades are available, different future versions of yourself are probably similar enough that they can capture those trades via correlated decision-making (at least if they follow EDT).
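To make the correlated-decision-making point concrete, here is a minimal sketch (all numbers hypothetical, and assuming two symmetric future selves whose choices are strongly correlated): each self can pay a small cost to confer a larger benefit on the other, and under EDT each self's choice to help is strong evidence that the other helps too, so the win-win is captured without any need to freeze your values.

```python
# Minimal sketch of a win-win trade between two similar future selves
# under EDT-style correlated decision-making. All numbers are hypothetical.

COST = 1.0         # utility you pay when you help the other self
BENEFIT = 3.0      # utility you receive when the other self helps you
CORRELATION = 0.9  # P(the other self makes the same choice you do)

def edt_expected_utility(you_help: bool) -> float:
    """Expected utility of your choice, conditioning on the evidence it
    provides about your (similar) counterpart's choice."""
    p_other_helps = CORRELATION if you_help else (1 - CORRELATION)
    cost = COST if you_help else 0.0
    return p_other_helps * BENEFIT - cost

if __name__ == "__main__":
    eu_help = edt_expected_utility(True)    # 0.9 * 3 - 1 = 1.7
    eu_dont = edt_expected_utility(False)   # 0.1 * 3 - 0 = 0.3
    print(f"EU(help) = {eu_help:.2f}, EU(don't) = {eu_dont:.2f}")
    # EDT recommends helping whenever CORRELATION * BENEFIT - COST
    # exceeds (1 - CORRELATION) * BENEFIT, i.e. whenever the selves are
    # similar enough -- so the win-win doesn't require frozen values.
```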
If you stop your values from changing, I think the main additional benefits are that you (i) change which of your future selves are more or less likely to exist in the first place (though it's not obvious that they themselves will care about this; cf. my other comment), and (ii) impose one-way utility transfers from versions of you with good opportunities to help to versions of you with good opportunities to be helped, according to your own view of how to make interpersonal utility comparisons between your future selves (which will predictably benefit some of them and harm others). [1]
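And a toy illustration of the transfer in (ii), again with hypothetical numbers: the transfer looks positive under your present interpersonal weights, but unlike the win-win case it predictably leaves one future self worse off.

```python
# Toy illustration of point (ii): a one-way transfer between future selves,
# evaluated under your present interpersonal weights. Numbers hypothetical.

WEIGHT_A = 1.0   # your present weight on future self A (the helper)
WEIGHT_B = 1.0   # your present weight on future self B (the helped)

A_COST = 2.0     # utils A gives up by helping
B_GAIN = 3.0     # utils B receives

weighted_total = WEIGHT_B * B_GAIN - WEIGHT_A * A_COST  # +1.0 by your lights
print(f"By your present weights the transfer is worth {weighted_total:+.1f},")
print(f"but self A is predictably down {A_COST:.1f} and self B up {B_GAIN:.1f}.")
# This is a transfer, not a win-win: it looks good under your comparison
# weights, while predictably benefiting one future self and harming another.
```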
Overall this still seems fine and good to me. But I think win-win trades are a small fraction of the benefits.
[1] Or maybe this is also just about changing which future versions of yourself exist, since any difference in your present actions will arguably lead to somewhat different memories in future versions of yourself.
Agreed!