OK… let me see if I’m following.

The idea, trying to rely on as few problematic words as possible, is:
There exists a class of computations M which sort proposed actions/states/strategies into an order, and which among humans underlie the inclination to label certain actions/states “good”, “bad”, “right”, “wrong”, “moral”, etc. (4) For convenience I’ll label the class of all labels like that “M-labels.”
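To make this concrete for myself (this is my own toy formalization, not anything from the original post; the scoring function and the "good"/"bad" threshold are invented for illustration), an M-instance can be modeled as a ranking over actions, with M-labels as predicates derived from that ranking:

```python
# Toy model: an M-instance is a function that sorts candidate actions
# into an order; M-labels ("good", "bad", etc.) are derived from it.

def make_m_instance(score):
    """Wrap a scoring function into an M-instance: a sorter plus a labeler."""
    def sort_actions(actions):
        # Higher score = more preferred; this is the "sorting into an order".
        return sorted(actions, key=score, reverse=True)

    def label(action):
        # A crude M-label: call an action "good" or "bad" by its score's sign.
        return "good" if score(action) > 0 else "bad"

    return sort_actions, label

# Two beings implementing different M-instances can apply opposite
# labels to the very same action -- the situation described above where
# "right" and "moral" stop being meaningful between them.
b1_sort, b1_label = make_m_instance(lambda a: a["kindness"])
b2_sort, b2_label = make_m_instance(lambda a: -a["kindness"])

act = {"name": "share food", "kindness": 1}
print(b1_label(act), b2_label(act))  # prints: good bad
```

The point of the sketch is just that an M-label has no content apart from the M-instance that generates it, which is why two beings with disjoint M-instances can't settle a dispute by exchanging labels.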
If two beings B1 and B2 don’t implement (1) a common M-instance, M-labels may not be meaningful, even in principle, in discussions between B1 and B2. For example, B1 and B2 may fundamentally not mean the same thing by “right” or “moral.”
If B1 and B2 implement (1) one and only one M-instance (Mb), then M-labels are in-principle meaningful in discussions between B1 and B2 (although this is no guarantee that B1 and B2 will actually understand one another, or even that they are capable of discussion in the first place).
There exists (6) an M-instance at-least-partially implemented (1) by all humans (2). We label this the Coherent Extrapolated Volition (CEV).
Two humans might implement other M-instances in addition to CEV, or might not, but either way all human M-instances are (by definition) consistent with CEV. In other words, all within-group moralities among humans can be treated as special cases of CEV, and implementing CEV will satisfy all of them. (3)
Edit: Later, I conclude that this isn’t what you’re claiming. Rather, you’re claiming that CEV is the intersection of all within-group moralities among humans. Implementing CEV won’t necessarily fully satisfy all of them, nor even necessarily fully satisfy any of them; it simply won’t violate any of them. That is, it may not do anything right, but it’s guaranteed not to do anything wrong.
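The "intersection" reading in the Edit above can be sketched as follows (again my own construction, with made-up groups and action sets): treat each within-group morality as a set of permitted actions, and take the intersection.

```python
# Toy reading of "CEV as intersection": each within-group morality
# permits some set of actions; the intersection keeps only actions
# that no group's morality rules out. (Groups and actions invented.)

moralities = {
    "group_a": {"help", "trade", "feast"},
    "group_b": {"help", "trade", "fast"},
    "group_c": {"help", "pray", "trade"},
}

# Only actions permitted by every group survive.
cev_permitted = set.intersection(*moralities.values())
print(sorted(cev_permitted))  # prints: ['help', 'trade']
```

This makes the asymmetry explicit: an agent restricted to `cev_permitted` never acts wrongly by any group's lights, but it may leave most of what each group positively values undone.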
We therefore want to ensure that any system X powerful enough to impose its preferences on us also implements CEV. This will ensure that it is at least meaningful for us to communicate with it using M-labels… e.g., talk about whether a given course of action is right or wrong. (5)
Other optimization processes (like the Pebblesorters, or natural selection) might implement an M-instance that is inconsistent with CEV. There is no guarantee that implementing CEV will satisfy all within-group moralities among all sapient species, let alone among all optimization processes.
Edit: It seems to be important to you that we not call M-instances within nonhuman species “moralities”, though I haven’t quite understood why.
There might exist an M-instance at-least-partially implemented (1) by all sapient species. We could label this the Universal Coherent Extrapolated Volition (U-CEV) and desire to ensure that X also implements U-CEV. (7)
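Extending the same intersection reading across species (still my sketch, with invented action sets) shows why the U-CEV hedge matters: nothing guarantees the intersection is non-empty once the Pebblesorters are included.

```python
# If another optimization process's permitted-action set shares nothing
# with the human one, the candidate U-CEV intersection is simply empty.

human_permitted = {"help", "trade"}
pebblesorter_permitted = {"sort pebbles into prime-sized heaps"}

u_cev = human_permitted & pebblesorter_permitted
print(u_cev)  # prints: set()
```

So even granting that human CEV is non-empty, whether a U-CEV exists at all is a separate empirical question about how much the relevant M-instances overlap.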
===
(1) Note that implementing a shared M doesn’t necessarily mean B1 and B2 can apply M consistently to a specific situation, any more than knowing what a prime number is means I can always recognize or calculate one. It also doesn’t mean B1 and B2 can articulate M. It doesn’t even guarantee that any given M-label will be correctly or consistently used and understood when they converse. All of this means it may be difficult in practice to determine whether B1 and B2 share an M, or what that M might be.
(2) I’m not sure if this is quite what is being asserted… there may be humans excluded from this formulation, such as psychopaths who would refuse treatment.
(3) I haven’t seen this actually being asserted, but it seems implicit. Otherwise, we shouldn’t expect CEV to converge and include everybody’s volition. (2) above seems relevant here.
(4) We’re deliberately ignoring issues of language here and doing everything in English, but we expect that other human languages are isomorphic to English in relevant respects.
(5) There seems to also be an expectation that this is sufficient to avoid X doing bad things to us. I don’t quite follow that leap, but never mind that for now.
(6) I don’t actually see why I should believe that any such thing exists, though it would be nice if it did. Presumably arguments for this are coming.
Edit: Given the “intersection not superset” correction above, this definitely exists, and there’s good reason to believe it’s non-empty. Whether it’s useful once we leave out all the stuff anyone disagrees with is still unclear to me.
(7) Although apparently we don’t, judging from what I’ve seen so far… either we don’t believe U-CEV exists, or we don’t care. Presumably arguments for this are coming, as well.