When a preference contains references to self, copying it (without also editing those references) changes its meaning rather than preserving it. So if you expect to be copied, reflective consistency can be ensured by reformulating the preference to avoid explicit references to self, for example by replacing them with references to a particular person, or to a reference class of people, whether or not they are yourself.
Reformulating a preference to replace references to self with the specific people those references already pick out doesn't change its meaning, so semantically such rewriting doesn't affect alignment. It only affects copying, which doesn't respect the semantics of preference. Other procedures that meddle with minds can disrupt the semantics of preference in ways that can't be worked around like this.
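The distinction can be illustrated with a toy sketch (all names here are hypothetical, chosen for illustration): an indexical preference that refers to whoever evaluates it changes meaning when copied to a new agent, while a de-indexicalized preference naming a specific person survives copying unchanged.

```python
def indexical_pref(world, self_id):
    """Prefers worlds where *whoever holds this preference* has the resource."""
    return world["holder"] == self_id

def deindexicalized_pref(world, _self_id, beneficiary="alice"):
    """Rewritten form: refers to a fixed person, regardless of who evaluates it."""
    return world["holder"] == beneficiary

world = {"holder": "alice"}

# For the original agent "alice", the two formulations agree.
assert indexical_pref(world, "alice") == deindexicalized_pref(world, "alice")

# A naive copy "bob" inherits the preference but has a new self_id:
# the indexical preference now evaluates differently, i.e. its meaning changed...
assert indexical_pref(world, "bob") != indexical_pref(world, "alice")

# ...while the de-indexicalized preference is preserved by copying.
assert deindexicalized_pref(world, "bob") == deindexicalized_pref(world, "alice")
```

The rewrite from `indexical_pref` to `deindexicalized_pref` is the reformulation described above: for the original agent it picks out the same outcomes, so the meaning is unchanged, but it no longer breaks under copying.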
(All this only makes sense for toy agent models, ones that furthermore have a clear notion of references to self, not for literal humans. Humans don't have preferences in this sense; human preference is a theoretical construct that needs something like CEV to access, the outcome of a properly set up long reflection.)