That’s an important question, but it’s also fundamentally hard, since it’s almost certainly true that human values are inconsistent—if not individually, then at an aggregate level. (You can’t reconcile opposite preferences, or maximize each person’s share of a finite resource.)
The best answer I have seen is Eric Drexler’s discussion of Pareto-topia, where he suggests that we can make huge progress and gains in utility according to all value systems held by humans, despite the fact that those systems are mutually inconsistent.
That seems right. Though if you accept that human values are inconsistent and that you won’t be able to optimize them directly, I still think that’s a really good reason to assume the whole framework of finding the true human utility function is doomed.
By “true human utility function” I really do mean a single function that, when perfectly maximized, leads to the optimal outcome.
I think “human values are inconsistent” and “people with different experiences will have different values” and “there are distributional shifts which cause humans to be different than they would otherwise have been” are all different ways of pointing at the same problem.