Thinking more about this, it may have been better if Eliezer had not framed his meta-ethics sequence around “the meaning of right.”
If we play rationalist’s taboo with our moral terms and thus avoid moral terms altogether, what Eliezer seems to be arguing is that what we really care about is not (a) the realization of whatever states of affairs our brains are wired to send reward signals in response to, but (b) that we experience peace and love and harmony and discovery and so on.
His motivation for thinking this way is a thought experiment—which might become real in the relatively near future—about what would happen if a superintelligent machine could rewire our brains. If what we really care about is (a), then we shouldn’t object if the superintelligent machine rewires our brains to send reward signals only when we are sitting in a jar. But we would object to that scenario. Thus, what we care about seems not to be (a) but (b).
In a meta-ethicist’s terms, we could interpret Eliezer not as making an argument about the meaning of moral terms, but instead as making an argument that (b) is what gives us Reasons, not (a).
Now, all this meta-babble might not matter much. I’m pretty sure that even if I were persuaded that the correct meta-ethical theory says I should be okay with releasing a superintelligence that would rewire me to enjoy sitting in a jar, I would do whatever I could to prevent that scenario and instead promote a superintelligence that would bring peace and joy and harmony and discovery and so on.
I thought being persuaded of a metaethical theory entails that whenever the theory tells you you should do X, you would feel compelled to do X.
This is a cool formulation. It’s interesting that there are other things that can happen to you not similar to “being persuaded of a metaethical theory” that entail that whenever you are told to do X you’re compelled to do X. (Voodoo or whatever.)
Only if motivational internalism is true. But motivational internalism is false.
What’s that?
Here, let me Google that for you.
I could get into how much I hate this kind of rejoinder if you bait me some more. I wasn’t asking you for the number of acres in a square mile. Let me just rephrase:
I hadn’t heard of motivational internalism before, could you expand your comment?
I don’t see what plausible reasoning process could lead you to infer this unlikely statement (about motivation, given how many details would need to be just right for the statement to happen to be true).
Also, even if you forbid modifying the human brain, the things that trigger high-reward signals in our brains (or that we actually classify as “harmony” or “love”) are very far from what we care about, just as whatever a calculator actually computes is not the same kind of consideration as the logically correct answer, even if you use a good calculator and aren’t allowed to sabotage it. There are many reasons (and contexts) for reward in the human brain not to be treated as indicative of the goodness of a situation.
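The calculator analogy can be made concrete with a small sketch (illustrative only, not part of the original comment): even a reliable, unsabotaged mechanism defines “what it computes” by its physical operation, and that can come apart from the logically correct answer it usually tracks.

```python
from fractions import Fraction

def correct_sum(xs):
    # The logical referent: the exact sum, computed with exact arithmetic.
    return sum(xs)

def calculator_sum(xs):
    # A "good calculator": a concrete mechanism that accumulates in
    # limited-precision floats. Its output is whatever the mechanism
    # produces, not the mathematical answer by definition.
    total = 0.0
    for x in xs:
        total += float(x)
    return total

xs = [Fraction(1, 10)] * 10
print(correct_sum(xs))     # -> 1
print(calculator_sum(xs))  # -> 0.9999999999999999
```

The two outputs almost always agree, which is exactly why it is tempting to conflate them; the analogy’s point is that “reward signal fired” and “the situation is good” are distinct considerations in the same way.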
I don’t understand your second paragraph. It sounds like you are agreeing with me, but your tone suggests you think you are disagreeing with me.
It was an explanation of why your thought experiment provides a bad motivation: we can just forbid modification of human brains to stop the thought experiment from going through, but that would still leave a lot of problems, which shows that this thought experiment alone is not sufficient motivation.
Sure, the superintelligence thought experiment is not the full story.
One problem with the suggestion of writing a rule to not alter human brains is specifying what counts as the machine altering a human brain. I’m skeptical of our ability to specify that rule in a way that does not lead to disastrous consequences. After all, our brains are being modified all the time by the environment, by causes spanning a wide spectrum from ‘direct’ to ‘indirect.’
Other problems with adding such a rule are given here.
(I meant that subjective experience that evaluates situations should be specified using unaltered brains, not that brains shouldn’t be altered.)
You’ve piqued my curiosity. What does this mean? How would you realize that process in the real world?
Come on, this tiny detail isn’t worth the discussion. It’s the classical solution to wireheading: ask the original, not the one under the influence; refer to you-at-a-certain-time, not just a you-concept that resolves to something unpredictable at any given future time in any given possible world; a rigid-designator-in-time.
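The “ask the original” move can be sketched in a few lines (a hypothetical illustration; the class, the outcome names, and the scores are all made up for the example): freeze a copy of the agent’s values at some time t, and evaluate candidate futures with that frozen copy rather than with whatever values the live agent ends up having.

```python
import copy

class Agent:
    """Toy agent whose values are a mapping from outcome -> score."""
    def __init__(self, values):
        self.values = values

    def evaluate(self, outcome):
        return self.values.get(outcome, 0)

original = Agent({"peace_and_discovery": 10, "sitting_in_a_jar": -10})

# Rigid designator in time: snapshot the values at time t.
frozen = copy.deepcopy(original)

# A wireheading intervention later rewrites the live agent's values...
original.values = {"sitting_in_a_jar": 10}

# ...but evaluation defers to the frozen snapshot, so the jar scenario
# still scores badly even though the modified agent would endorse it.
print(frozen.evaluate("sitting_in_a_jar"))    # -> -10
print(original.evaluate("sitting_in_a_jar"))  # -> 10
```

The design choice is that the evaluator is indexed to the unaltered agent, so no rule forbidding brain modification is needed; modified agents simply don’t get a vote.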