That seems right. Though if you accept that human values are inconsistent and that you won’t be able to optimize them directly, I still think “that’s a really good reason to assume that the whole framework of getting the true human utility function is doomed.”
By “true human utility function” I really do mean a single function that, when perfectly maximized, leads to the optimal outcome.
I think “human values are inconsistent” and “people with different experiences will have different values” and “there are distributional shifts which cause humans to be different than they would otherwise have been” are all different ways of pointing at the same problem.