I agree that, in order for me to behave ethically with respect to the AGI, I need to know whether the AGI is experiencing various morally relevant states, such as pain or fear or joy or what-have-you. And, as you say, this is also true about other physical systems besides AGIs; if monkeys or dolphins or dogs or mice or bacteria or thermostats have morally relevant states, then in order to behave ethically it’s important to know that as well. (It may also be relevant for non-physical systems.)
I’m a little wary of referring to those morally relevant states as “qualia” because that term gets used by so many different people in so many different ways, but I suppose labels don’t matter much… we can call them that for this discussion if you wish, as long as we stay clear about what the label refers to.
Leaving that aside… so, OK. We have a complex AGI with a variety of internal structures that affect its behavior in various ways. One of those structures is such that creating a cat gives the AGI an orgasm, which it finds rewarding. It wants orgasms, and therefore it wants to create cats. Which we didn’t expect.
So, OK. If the AGI is designed such that it creates more cats in this situation than it ought to (regardless of our expectations), that’s a problem. 100% agreed.
But it’s the same problem whether the root cause lies within the AGI’s emotions, or its reasoning, or its qualia, or its ability to predict the results of creating cats, or its perceptions, or any other aspect of its cognition.
You seem to be arguing that it’s a special problem if the failure is due to emotions or qualia or feelings?
I’m not sure why.
I can imagine believing that if I were overgeneralizing from my personal experience. When it comes to my own psyche, my emotions and feelings are a lot more mysterious than my surface-level reasoning, so it’s easy for me to infer some kind of intrinsic mysteriousness to emotions and feelings that reasoning lacks. But I reject that overgeneralization. Emotions are just another cognitive process. If reliably engineering cognitive processes is something we can learn to do, then we can reliably engineer emotions. If it isn’t something we can learn to do, then we can’t reliably engineer emotions… but we can’t reliably engineer AGI in general either. I don’t think there’s anything especially mysterious about emotions, relative to the mysteriousness of cognitive processes in general.
So, if your reasons for believing that are similar to the ones I’m speculating here, I simply disagree. If you have other reasons, I’m interested in what they are.
I don’t think an AGI failing to behave in the anticipated manner due to its qualia* (orgasms during cat creation, in this case) is a special or mysterious problem, one that must be treated differently from errors in its reasoning, prediction ability, perception, or any other aspect of its cognition. On second thought, I do think it’s different: it actually seems less important than errors in any of those systems. (And if an AGI is Provably Safe, it’s safe; we need only worry about its qualia from an ethical perspective.) My original comment here is (I believe) fairly mild: I do think the issue of qualia will involve a practical class of problems for FAI, and knowing how to frame and address them could benefit from cross-pollination with more biology-focused theorists such as Chalmers and Tononi. And somewhat more boldly, a “qualia translation function” would be of use to all FAI projects.
*I share your qualms about the word, but there really are few alternatives with less baggage, unfortunately.
Ah, I see. Yeah, agreed that what we are calling qualia here (not to be confused with its usage elsewhere) underlie a class of practical problems. And what you’re calling a qualia translation function (which is related to what EY called a non-person predicate elsewhere, though finer-grained) is potentially useful for a number of reasons.