As a thought experiment, imagine that human values change cyclically. For 1000 years we value freedom and human well-being, for 1000 years we value slavery and the joy of hurting other people, and again, and again, forever… that is, unless we create a superhuman AI that can enforce a specific set of values.
Would you want the AI to promote the values of freedom and well-being, or the slavery and hurting, or something balanced in the middle, or to keep changing its mind in a cycle that mirrors the natural cycle of human values?
(It is easy to talk about “enforcing values other than our own” in the abstract, but it becomes less pleasant when you actually imagine some specific values other than your own.)