Instead, we should just build a simple utility function that captures some value we consider important and sacrifices everything else.
I actually wrote a post on this idea. But I consider it to be a contingency plan for “moral philosophy turns out to be easy” (i.e., we solve ‘morality’ ourselves without having to run CEV and can determine with some precision how much worse turning the universe into orgasmium is, compared to the best possible outcome, and how much better it is compared to just getting wiped out). I don’t think it’s a good backup plan for “seed AI turns out to be easy”, because for one thing you’ll probably have trouble finding enough AI researchers/programmers willing to deliberately kill everyone for the sake of turning the universe into orgasmium, unless it’s really clear that’s the right thing to do.
If we posit a putative friendly AI as one which, e.g., kills no one as a base rule AND is screened by competent AI researchers for any maximizing functions, then any remaining "nice-to-haves" can just be put to a vote.
Maybe you already have the answer, Wei Dai.