I’d thought this question would have already been raised, given the nature of this site and the author, but I haven’t found it, so here goes.
Harry has already stated his intention of becoming as a god, and I’m not inclined to bet against him. He has already achieved partial Transfiguration and Dementor-eradication, both considered impossible by other wizards, by virtue of his greater understanding of the Underlying Nature of Reality. It seems likely he’s just getting warmed up, the Rule of Cool is with him, and the author seems sympathetic to that sort of transcendence (unlike most authors, whom I would expect to be setting Harry up for an Icarus flameout and a lecture on hubris).
It would not take much, really. Let him start researching an “Increasium Intelligencium” spell, and all bets are off.
So it seems reasonable to ask the question: is Harry Friendly?
I mean, obviously he avoids the Vast majority of unFriendly design space, simply by virtue of being human. He isn’t going to tile the galaxy with paperclips or anything like that. But as I understand the idea, most human minds aren’t Friendly either (are any?). It doesn’t ordinarily matter, because humans aren’t capable of hard takeoff, but given the precarious power imbalances inherent in the MoR-verse perhaps Harry is an exception.
Then again, perhaps not. Perhaps the kind of “god” Harry is capable of “becoming as” isn’t sufficiently scary to be worth invoking that kind of decision process for.
But how would we (or, more to the point, his peers) know that? At the moment, they seem to be doing the typical human thing of estimating Harry based on their prior experience with teenagers/wizards, and they will probably go on doing that because that’s what humans do. But if they were more rational people who updated their model of Harry based on the evidence of his exceptionality, what would they conclude?
And if he isn’t guaranteed to be either not-that-scary or Friendly, does his very existence pose an existential threat to humanity? Is the rational thing to do to stop him before it’s too late?
My feeling is that the more obvious UFAI-alike in MoR is Quirrellmort. Consider: Inhumanly fast (newspaper reading; duel with Bahry). Inhumanly intelligent (passim). Very powerful in part because of being inhumanly fast and intelligent. Interested—supposing him to be Voldemort, on which topic I shall say no more—in defeating death (just the sort of thing someone might ask a superintelligent AI to find ways to do).
Speaking of which: “Tell them I ate it”, says Quirrell concerning the destroyed Dementor. Dementors in MoR are a symbol of death, even if many wizards don’t quite grasp that. Dumbledore doesn’t, but Harry surely does. Is Quirrell really not concerned that Harry may get the message: “I am a death-eater”?
But does Quirrell know, or suspect, that Dementors are a form of death? If not, he wouldn’t even notice.
Well, like Harry he is unable to cast Patronus v1.0, and he seems to understand just fine when Harry calls the Dementors “life-eaters” in ch58.
Perhaps the purpose of the entire narrative will be to gut-punch the readers with a lesson in Friendly AI, by showing Harry acting determinedly and rationally to ensure his own Friendliness, but failing anyway.
To me, the most likely candidate for the role of an allegory of AI in MoR currently seems to be the Source of [Atlantean] Magic, assuming that Harry’s speculation wasn’t completely off the mark.
(My Wild Mass Guess on the matter would be that Harry will eventually discover that the Source was the Atlanteans’ equivalent of a moderately unFriendly AI, which didn’t destroy the world but did wipe away Atlantis. Eliezer will then be sorely tempted to go on an all-out Author Filibuster (TVTropes) on the subject, but will manage to restrain himself to a couple of paragraphs in the Author’s Notes.)
I’ll be disappointed if Harry doesn’t turn out to have been completely off the mark there.
I mean, the process he went through was roughly “Hey, look at this thing I don’t understand. It doesn’t behave at all the way I would expect it to. Um… maybe X is an explanation!” with no particular justification for privileging X over the uncountable number of hypotheses he could have come up with instead.
Worse, the hypothesis he came up with was pretty much unfalsifiable and doesn’t make any particular predictions. It doesn’t “pay rent” in anticipated experiences, to quote an early Overcoming Bias post.
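For what it’s worth, “paying rent” has a crisp Bayesian cash-out: a belief gains evidence only insofar as it assigns different likelihoods to observations than its rivals do. A toy sketch (my own numbers, nothing from the fic):

```python
# Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio.
def update_odds(prior_odds, p_obs_given_h, p_obs_given_not_h):
    return prior_odds * (p_obs_given_h / p_obs_given_not_h)

# A hypothesis that sticks its neck out: it says the observation is three
# times likelier if it is true, so each confirming observation triples its odds.
odds = 1.0
for _ in range(3):
    odds = update_odds(odds, 0.9, 0.3)
assert round(odds) == 27

# An unfalsifiable hypothesis assigns every possible observation the same
# likelihood as its rivals do; its odds never move, whatever the evidence.
odds = 1.0
for _ in range(3):
    odds = update_odds(odds, 0.5, 0.5)
assert odds == 1.0
```

By that standard, “Atlantis did it” only starts paying rent once it forbids something, i.e. once it predicts some observation the rival hypotheses don’t.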
The moralist in me does not want Harry “rewarded” with a correct answer for reasoning that way.
That having been said, I grant that if I knew the SO[A]M existed, I would conclude that Harry was somehow being influenced by its existence such that the theory of SO[A]M was more available, which doesn’t require any additional assumptions given that it is necessarily responsive to wizards’ thoughts to begin with. (That is, I don’t want to repeat the reasoning error I made elsewhere regarding Quirrell being polyjuiced.)
But right now I don’t know that.
We don’t know much about how stable human values are under recursive self-modification. It’s entirely possible (albeit seemingly unlikely) that humans even tend towards tiling the galaxy with paperclips in particular.
Compared to the Vast space of minds in general, they certainly do. Few minds in that Vast space have heard of the concept of a paperclip, after all.
Indeed, it seems likely. Many humans have the notion that ‘locked’ values are better than ‘wishy-washy’ ones; few have the concept of local maxima, and even fewer an understanding of complex, changing human value systems. So a priori we should expect some bias in that direction, which would presumably have a chance of affecting any one human in particular, and a greater chance than for a mind chosen at random from the space of possible AIs.
Harry is aware of these ideas, but he also frequently catches himself making errors. With self-modification there may be no opportunity to catch your errors afterwards: you are stuck with them, and may never even realise they exist.
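The local-maximum point can be made concrete with a toy self-improver (illustrative code of mine, nothing canonical): a greedy optimizer that only ever accepts strictly better neighbouring states halts at whatever peak happens to be nearest, with no way to notice a taller peak elsewhere.

```python
def hill_climb(score, start, step=1):
    """Greedy ascent: move to a neighbour only if it is strictly better."""
    x = start
    while True:
        best = max((x - step, x + step), key=score)
        if score(best) <= score(x):
            return x  # no better neighbour: a local maximum, not necessarily the global one
        x = best

# A landscape with a low peak at x = 2 (height 10) and a tall one at x = 8 (height 30).
def score(x):
    return max(10 - (x - 2) ** 2, 30 - (x - 8) ** 2)

low_peak = hill_climb(score, start=0)   # climbs the nearby low peak and stops there
tall_peak = hill_climb(score, start=6)  # reaches the tall peak only by starting close to it
```

An agent that commits to each “improvement” irreversibly is in the `start=0` case: it keeps whichever hill it happened to begin on.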
I wonder if Quirrell realises Harry desires to be an actual god, and not just Supreme Emperor of the magical world.
Hopefully Harry is bright enough not to test invasive intelligence improvement on himself.
Ugh, he’s exactly bright enough to do just that, complete with a justification that he can’t trust anyone else to be both safe (Quirrell, Dumbledore, and Draco are all too dangerous) and effective (Hermione wouldn’t exploit it enough). (ch. 58)
He could time-turn himself to allow for self-monitoring of the experiment.