There are several grounds for criticism here. Criticizing CEV by saying, “I think CEV will lead to good dogs, because that’s what a lot of people would like,” sounds valid to me, but would merit more argumentation (on both sides).
Another problem I mentioned is a possibly fundamental problem with CEV. When CEV assumes that reasoned extrapolation trumps all existing values, is it legitimate to say that this is not the same as asserting that reason is the primary value? You could argue that reason is just an engine in service of some other value. There’s some evidence that this actually works, as demonstrated by the theologians of the Roman Catholic Church, who have a long history of using reason to defeat reason. But I’m not convinced that makes sense. If it doesn’t, then CEV already assumes from the start the very kind of value that its entire purpose is to prevent being assumed.
Third, most human values, like dog-values, are neutral with respect to rationality or threatened by rationality. The dog itself needs to not be much more rational or intelligent than it is.
The only solution is to say that the rationality and the values are in the FAI sysop, while the conscious locus of the values is in the humans. That is, the sysop gets smarter and smarter, with dog-values as its value system. It knows that to get the experiential value out of dog-values, the conscious experiencer needs limited cognition; but that’s okay, because the humans are the designated experiencers, while the FAI is the designated thinker and keeper-of-the-values.
There are two big problems with this.
By keeping the locus of consciousness out of the sysop, we’re steering dangerously close to one of the worst of all possible worlds: building a singleton that, one way or the other, eventually ends up using most of the universe’s computational energy, yet is not itself conscious. That’s a waste of a universe.
Value systems are deictic, meaning they use the word “I” a lot. To interpret their meaning, you fill in the “I” with the identity of the reasoning agent. The sysop literally can’t have human values if it doesn’t have deictic values; and if it has deictic values, they’re not going to stay doglike under extrapolation. (You could possibly get around this by using a non-deictic representation, and saying that the values have meaning only when seen in light of the combined sysop+humans system. Like the knowledge of Chinese in Searle’s Chinese room.)
The FAI document says it’s important to use non-deictic representations in the AI. Aside from the fact that this is probably impossible—cognition is compression, and deictic representations are much more compact, so any intelligence is going to end up using something equivalent to deictic representations—I don’t know if it’s meaningful to talk about non-deictic values. That would be like saying “I value the taste of chocolate” without saying who is tasting the chocolate. (That’s one entry-point into paperclipping scenarios.)
The final, biggest problem illustrated by dog-values is that it’s just not sensible to preserve “human values” when human values, even those found within the same person at different times of life, are as different as it is possible for values to be. Sure, maybe we would have different values if we could see in the ultraviolet, or had seven sexes; but there is just no bigger difference between values than that between “valuing states of the external world” and “valuing phenomenal perceptions within my head.” And there are already humans committed to each of those two fundamental value systems.