Because the Sword of Good didn’t kill him; also he seems to be quite an excellent moral philosopher—someone who actually perceives morality. And if not him, then who else on the next try? (Of course there’s going to be a next try eventually, given that it’s possible in the first place.)
Why does Hirou trust the Sword of Good? How does he know that it’s Friendly?
> also he seems to be quite an excellent moral philosopher—someone who actually perceives morality.
I didn’t get that from the story. All those fantasy books he’s read, and he only now ponders whether something is good just because the author labeled it “Good”? He only now considers how immoral the actions of many fantasy heroes would be were they real? I remember being bothered by Aragorn’s divine right to lead when I was eight and my Dad was reading Lord of the Rings to me.
As your acknowledgments show, pondering whether it could really be moral to kill “bad guys” so willy-nilly is common in fantasy circles. One of the Austin Powers movies used this to humorous effect with a little vignette about how one of the henchmen killed by Powers had a loving family and had just celebrated his retirement surrounded by friends.
Maybe these thoughts never occur to many fantasy readers, but I don’t think that we’re talking about some vanishingly rare perspicacity here.
> And if not him, then who else on the next try?
Maybe someone who’s developed a rigorous theory of friendliness :).
I guess I’m just surprised to see an allegory from you in which someone solves Friendliness by applying thirty seconds of his at-best-slightly-above-average moral intuition. I did not get the impression that Hirou was any kind of moral savant. And I had thought that even a moral savant, on your view, couldn’t reliably make such a decision in thirty seconds.
> I didn’t get that from the story. All those fantasy books he’s read, and he only now ponders whether something is good just because the author labeled it “Good”?
I think you’re being a little optimistic here in thinking your skepticism is at all general.
Why was Norman Spinrad’s _The Iron Dream_ so critically well-received and still read? (If you haven’t read it, it’s much like Eliezer’s story except without the sane hero.) Because it demonstrated that most readers weren’t critical, that they’d been reading fantasy stories for literally decades without cottoning onto how well the same stories justified genocide and fascism!
I thought the point of The Iron Dream was that Hitler’s novel (the story is set in an alternate world where Hitler became a pulp writer) was the nastiest sort of inappropriate fantasy.
> also he seems to be quite an excellent moral philosopher—someone who actually perceives morality.

> I didn’t get that from the story. All those fantasy books he’s read
Not Hirou, Vhazhar. For some reason, even as a very young child facing religious indoctrination, I couldn’t quite accept that Abraham had made the right choice in trying to sacrifice Isaac upon God’s command. That was one of my first moral breaks with Judaism. The Lord of Dark is—almost necessarily—actually visualizing situations and reacting to them as if seen, rather than processing words however the people around him expect to process them; there’s no other way he could reject the values of his society to that extent. And even then, the amount of convergence he exhibits with our own civilization is implausible barring extremely optimistic assumptions about (a) the amount of absolute coherence, (b) our own society’s intelligence, and (c) the Lord of Dark’s intelligence; but of course the story wouldn’t have worked otherwise.
> I guess I’m just surprised to see an allegory from you in which someone solves Friendliness by applying thirty seconds of his at-best-slightly-above-average moral intuition.
Vhazhar’s been working on it for some unknown number of years, having successfully realized that sucking the life from worms may be icky but doesn’t actually hurt any sentient beings. (Though I wasn’t assuming Vhazhar was ancient, he very well could be, and that would make a number of things more plausible, really.) Hirou has a whole civilization behind him and just needed to wake up and actually think.
Okay, Hirou has evidence that Vhazhar is a moral savant. But the reader, and Hirou, sees little evidence that Vhazhar has worked out a formal, rigorous theory of Friendliness. I thought that anything less than that, on your view, virtually guaranteed the obliteration of almost everything valuable.
But I draw a weaker inference from Vhazhar’s ability to overcome indoctrination. Yes, it implies that he probably had a high native aptitude for correct moral reasoning. But the very fact that he was subjected to the indoctrination means that he’s probably damaged anyways. If someone survives a disease that’s usually deadly, you should expect that she went into the disease with an uncommonly strong constitution. But, given that she’s had the disease, you should expect that she’s now less healthy than average.
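The selection effect behind the disease analogy can be sketched numerically. This is a toy simulation with entirely made-up numbers (the logistic survival curve and the fixed health toll are assumptions for illustration, not anything from the story): because stronger constitutions survive more often, survivors’ baseline constitution is above the population average, yet their post-disease health is still below it.

```python
import random

random.seed(0)

# Baseline "constitution" for a large population (arbitrary units, mean 0).
population = [random.gauss(0.0, 1.0) for _ in range(100_000)]

DISEASE_TOLL = 1.5  # assumed fixed health cost of having had the disease

def p_survive(constitution):
    # Stronger constitution -> better survival odds (illustrative logistic link).
    return 1.0 / (1.0 + 2.718281828 ** -(constitution - 1.0))

# Everyone gets the disease; only some survive it.
survivor_baseline = [c for c in population if random.random() < p_survive(c)]
survivor_health_now = [c - DISEASE_TOLL for c in survivor_baseline]

def mean(xs):
    return sum(xs) / len(xs)

# Survivors were selected for strong constitutions...
print(mean(survivor_baseline) > mean(population))    # True: above-average baseline
# ...but having had the disease still leaves them below the general average.
print(mean(survivor_health_now) < mean(population))  # True: below-average health now
```

Both inferences in the paragraph above hold at once: conditioning on survival raises the estimate of prior constitution while conditioning on having had the disease lowers the estimate of present health.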
> But the reader, and Hirou, sees little evidence that Vhazhar has worked out a formal, rigorous theory of Friendliness. I thought that anything less than that, on your view, virtually guaranteed the obliteration of almost everything valuable.
Only by AIs. Human uploads would be a whole different story. Not necessarily a good story, but a different story, and one in which—whatever the objective frequency of winning—I’d have to say that, relative to my subjective knowledge, there’s a pretty sizable chunk of chance.
If Vhazhar was literally casting a spell to run the world directly, and he wasn’t able to take advantage of moral magic like that embodied in the Sword of Good itself (which, conceivably, could be a lot less sophisticated than its name implies), then it’s a full-fledged Friendly AI problem.
What are the justifiable expectations one could have about the Sword of Good? In particular, why suppose that it’s a Sword of Good in anything other than name only? Why suppose that it’s any protection against evil?
I also didn’t consider the possibility that Vhazhar was planning to run the world himself directly. A human just doesn’t have the computational capacity to run the world. If a human tried to run the world, there would still be both fortune and misfortune.
For that reason, I assumed that his plan was for some extrapolated version of his volition to run the world. But if he’s created something that will implement his CEV accurately, hasn’t he solved FAI?
> I also didn’t consider the possibility that Vhazhar was planning to run the world himself directly. A human just doesn’t have the computational capacity to run the world. If a human tried to run the world, there would still be both fortune and misfortune.
There could be less misfortune. A cautious human god who wasn’t corrupted by power could plausibly accomplish a lot of good with a few minimal actions. Of course, the shaky part is the “cautious” and “not corrupted” part.
Where does the ability to specify complex wishes become distinct from the ability to implement them, though? What are the capabilities of a god with a human mind? If there is a lot of automation for implementing the wishes, how much of the person’s preference does this automation anticipate? In what sense does limiting the god’s mind to be merely human affect the god’s capacity to control the world? There doesn’t seem to be a natural concept that captures this.
Okay. I had taken the Prophecy of Doom to be saying that there would no longer be both “luck and misfortune”. I can see that it could be read otherwise, though.
Well, there are at least several obvious fixes that we humans would want to make to the world we live in, but are unable to. For example, we would like to wipe out the malaria parasite that infects humans. The dragon is bad, the world is full of really, really horrible things, and I’d rather just make it stop than worry too much about being corrupted by power.
I wrote a comment to the effect of “The Sword of Good didn’t kill him, and the Sword appears to be a judge of good intentions = Friendliness (though not good reasoning)”, then deleted it on consideration that unfriendliness-through-failures-of-reasoning might be worse than the current state of the world. But “there’s going to be a next try” indeed outweighs that. I think.