This is not the same as CEV. CEV involves the AI extrapolating a user’s idealized future values and acting to implement them, even overriding current preferences if needed, whereas my model forbids that. In my framework, the AI never drives or predicts value change; it simply provides accurate world models and optimal plans based on the user’s current values, which only the user can update.
CEV also assumes convergence; my model protects normative autonomy and allows value diversity to persist.
CEV extrapolates the volition of humanity; that's one reason it has to be "coherent".
In your proposal, people have autonomy, but this principle can be violated in “extremely dangerous” situations. People are free to do what they want (“volition”)… but their AI advisors look ahead (“extrapolated”)… and people are not allowed to exercise their freedom so as to jeopardize the freedom of others (“coherent”).
I think this ends up being the same thing as CEV…