I don’t think it makes any moral difference whether a paperclip maximizer likes paperclips. What makes moral differences are things like, y’know, life, consciousness, activity, blah blah.
How would a universe shaped by CEV differ from one in which a Paperclip Maximizer equipped everyone with the desire to maximize paperclips? And how does a universe with as many discrete conscious entities as possible differ from one with a single universe-spanning consciousness?
If it doesn’t make any difference, then how can we be sure that the SIAI won’t just implement the first fooming AI with whatever terminal goal it desires?
I don’t see how you can argue that the question “What is right?” is about the state of affairs that will help people to have more fun, and yet claim that you don’t think that “it makes any moral difference whether a paperclip maximizer likes paperclips.”
How would a universe shaped by CEV differ from one in which a Paperclip Maximizer equipped everyone with the desire to maximize paperclips? And how does a universe with as many discrete conscious entities as possible differ from one with a single universe-spanning consciousness?
If a paperclip maximizer modified everyone such that we really only valued paperclips and nothing else, and we then ran CEV, then CEV would produce a powerful paperclip maximizer. This is… I’m not going to say it’s a feature, but it’s not a bug, at least. You can’t expect CEV to generate accurate information about morality if you erase morality from the minds it’s looking at. (You could recover some information about morality by looking at history, or human DNA (if the paperclip maximizer didn’t modify that), etc., but then you’d need a strategy other than CEV.)
I don’t think I understand your second question.
I don’t see how you can argue that the question “What is right?” is about the state of affairs that will help people to have more fun, and yet claim that you don’t think that “it makes any moral difference whether a paperclip maximizer likes paperclips.”
That depends on whether the paperclip maximizer is sentient, whether it just makes paperclips or it actually enjoys making paperclips, etc. If so, then its preferences matter… a little. (So let’s not make one of those.)
That depends on whether the paperclip maximizer is sentient, whether it just makes paperclips or it actually enjoys making paperclips, etc.
All those concepts seem vague: being sentient, enjoying something. Do you need to figure out how to define those concepts mathematically before you’ll be able to implement CEV? Or are you just going to let extrapolated human volition decide about that? If so, how can you possibly make claims about how valuable such things are, or how much the preferences of a paperclip maximizer matter? Maybe it will all turn out to be wireheading in the end...
What is really weird is that Yudkowsky uses the word “right” in reference to actions affecting other agents, yet doesn’t think it would be reasonable to assign moral weight to the preferences of a paperclip maximizer.
CEV will decide. In general, it seems unlikely that the preferences of nonsentient objects will have moral value.
Edit: Looking back, this comment doesn’t really address the parent. Extrapolated human volition will be used to determine which things are morally significant. I think it is relatively probable that wireheading might turn out to be morally necessary. Eliezer does think that the preferences of a paperclip maximizer would have moral value if one existed. (If a nonexistent paperclip maximizer had moral worth, so would a nonexistent paperclip minimizer. This isn’t completely certain, because paperclip maximizers might gain moral significance from a property other than existence that is not shared with paperclip minimizers, but at this point, this is just speculation and we can do little better without CEV.) A nonsentient paperclip maximizer probably has no more moral value than a rock with “make paperclips” written on the side.
The reason that CEV is based only on human preferences is that, as humans, we want to create an algorithm that does what is right, and humans are the only things we have that know what is right. If other species have moral value, then humans, if we knew more, would care about them. If there is nothing in human minds that could motivate us to care about some specific thing, then what reason could we possibly have for designing an AI to care about that thing?
Paperclips aren’t part of fun, on EY’s account as I understand it, and therefore not relevant to morality or right. If paperclip maximizers believe otherwise they are simply wrong (perhaps incorrigibly so, but wrong nonetheless)… right and wrong don’t depend on the beliefs of agents, on this account.
So those claims seem consistent to me.
Similarly, a universe in which a PM equipped everyone with the desire to maximize paperclips would therefore be a universe with less desire for fun in it. (This would presumably in turn cause it to be a universe with less fun in it, and therefore a less valuable universe.)
I should add that I don’t endorse this view, but it does seem to be pretty clearly articulated/presented. If I’m wrong about this, then I am deeply confused.
If paperclip maximizers believe otherwise they are simply wrong (perhaps incorrigibly so, but wrong nonetheless)… right and wrong don’t depend on the beliefs of agents, on this account.
I don’t understand how someone can arrive at “right and wrong don’t depend on the beliefs of agents”.
I conclude that you use “I don’t understand” here to indicate that you don’t find the reasoning compelling. I don’t find it compelling, either—hence, my not endorsing it—so I don’t have anything more to add on that front.
If those people propose that utility functions are timeless (e.g. the Mathematical Universe), or simply an intrinsic part of the quantum amplitudes that make up physical reality (is there a meaningful difference?), then under that assumption I agree. If beauty can be captured as a logical function, then women.beautiful is right independently of any agent that might endorse that function. The problem of differing tastes and differing aesthetic values, which leads to sentences like “beauty is in the eye of the beholder,” results from trying to derive functions from the labeling of relations. Different functions can assign the same label to different relations: “x is R-related to y” can be labeled “beautiful,” but so can xSy. So while some people point to the ambiguity of the label “beautiful” and conclude that what is beautiful is agent-dependent, others talk about the particular functions that get labeled as beauty-functions and conclude that their output is agent-independent.
(nods) Yes, I think EY believes that rightness can be computed as a property of physical reality, without explicit reference to other agents.
That said, I think he also believes that the specifics of that computation cannot be determined without reference to humans. I’m not 100% clear on whether he considers that a mere practical limitation or something more fundamental.
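The labels-versus-functions point a couple of comments up can be sketched in code: two distinct predicates can both carry the label “beautiful,” and the apparent agent-dependence lives entirely in which function the shared label names, while each function’s own output is fixed. (A purely illustrative toy; the two predicates below are arbitrary stand-ins, not anything proposed in the thread.)

```python
# Two distinct predicates that different agents both label "beautiful".
# Each function's output is agent-independent; the disagreement is only
# about which function the shared label "beautiful" picks out.

def r_beautiful(x: str) -> bool:
    # One agent's "beauty" relation: symmetry, modeled as palindromicity.
    return x == x[::-1]

def s_beautiful(x: str) -> bool:
    # Another agent's "beauty" relation: simplicity, modeled as shortness.
    return len(x) <= 3

# The same label, bound to different functions by different agents.
labels = {"agent_a": r_beautiful, "agent_b": s_beautiful}

x = "noon"
verdicts = {name: fn(x) for name, fn in labels.items()}
print(verdicts)  # agent_a: True (a palindrome), agent_b: False (too long)
```

Both verdicts are determined once the function is fixed; only the mapping from the label to a function varies by agent, which is the sense in which the output is agent-independent even though "what is beautiful" sounds agent-dependent.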
Near future: “You are a paperclip maximizer! Kill him!”
What is this supposed to mean?