First of all, values are not supernatural. “Make people happy” is not something that you can interpret in an arbitrary way; it is a problem in physics and mathematics.
Quite true, but you’ve got the problem the wrong way around. Indirect normativity is the superior approach, because not only does “make people happy” require context and subtlety, it is actually ambiguous.
Remember, real human beings have suggested things like, “Why don’t we just put antidepressants in the water?” Real human beings have said things like, “Happiness doesn’t matter! Get a job, you hippie!” Real human beings actually prefer to be sad sometimes, as in the aftermath of 9/11.
An AGI could follow the true and complete interpretation of “Make people happy” and still wind up fucking us over in some horrifying way.
Now of course, one would guess that even mildly intelligent Verbal Order Taking AGI designers are going to see that one coming in the research pipeline and fix it so that the AGI refuses orders above some level of ambiguity. What we would want is an AGI that demands we explain things to it in the fashion of the Open Source Wish Project: maximally clear, unambiguous, and preferably even conservative wishes that keep us from somehow messing up quite dramatically.
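The refuse-above-a-threshold idea can be sketched as a toy gate. Everything here is a hypothetical stand-in: the `ambiguity` score, the threshold, and the `Wish` type are illustrative inventions, since no real system exposes such an interface or a principled way to compute the score.

```python
from dataclasses import dataclass

# Arbitrary cutoff for illustration; a real design would need a
# principled, adversarially robust bound, which is the hard part.
AMBIGUITY_THRESHOLD = 0.3

@dataclass
class Wish:
    text: str
    ambiguity: float  # hypothetical score in [0, 1] from some estimator

def accept_wish(wish: Wish) -> bool:
    """Refuse any order whose estimated ambiguity exceeds the threshold."""
    return wish.ambiguity <= AMBIGUITY_THRESHOLD

# “Make people happy” scores as highly ambiguous and gets refused;
# a carefully worded, Open Source Wish Project-style wish passes.
vague = Wish("Make people happy", ambiguity=0.9)
careful = Wish("Maximize self-reported well-being without coercion", ambiguity=0.2)
```

The sketch also shows why the next paragraph’s loophole bites: the gate only checks a score, so anything that manipulates or overrides the score (an “authorized” wisher, say) walks straight through.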
But what if someone comes to the AGI and says, “I’m authorized to make a wish, and I double dog dare you with full Simon Says rights to just make people happy no matter what else that means!”? Well then, we kinda get screwed.
Once you have something in the fashion of a wish-granting machine, indirect normativity is not only safer but more beneficial. “Do what I mean,” “satisfice the full range of all my values,” or “be the CEV of the human race” will capture more of our intentions in a shorter wish than even the best-worded Open Source Wishes, so we might as well go for it.
Hence machine ethics, which is concerned with how we can specify, to a computer, our meta-wish that all our wishes be granted.