But in practice, the “told”/“meant” distinction does not carve reality at the joints, and mainly matters for plausible deniability.
This is good:
Yep—humans can’t even talk about misunderstandings of human language without postulating an agent that thinks like a human.
(Thinking the opposite, being close enough to end up with a hideous misunderstanding, is another version of thinking like a human, because the intelligence needs to think almost entirely like a human to understand well enough to get it so badly wrong. Edit: By which I mean: an AI wouldn’t be able to understand language close enough to the way a human does to get it that badly wrong, without having a pretty good simulation of a human in the first place—the same way a human understands and construes or misconstrues language. And don’t forget that humans do a lot of this in evolved hardware.)
I don’t think I’m following—what is the opposite of what?
I … don’t actually recall what I meant the opposite of!
What I’m actually saying: The “genie” who you order to “cure cancer!” and it understands you well enough to get it hideously wrong and cure cancer by killing everyone—for an AI to go that wrong, I suspect it’s got to be just about human, and have a good internal model of a human, and understand you like a human would.
I would be amazed if we’re lucky enough to get a first AI that we actually understand or that actually understands us—a copy of the thousand contradictory evolved shards that make up human utility strikes me as a very unlikely first AI, and worrying in particular about the hidden complexity of wishes strikes me as privileging the hypothesis. Surely we’re much more likely to end up with something utterly alien out there in mindspace.
Ah, I understand what you’re saying now. Thanks!
Some “simple” algorithms for “understanding” language—such as “All I need is a dictionary and a grammar parser”—lead to predictable categories of mistakes. For example, there’s the story of the automatic translation software that, when asked to translate “The spirit is willing but the flesh is weak” into Russian and then back into English, returned “The vodka is good but the meat is rotten”.
I’ve heard that story since I was a kid. It sounds made-up. Does anyone know its actual source?
Apparently the details might well be made up, but the problem itself was real: http://www.snopes.com/language/misxlate/machine.asp
Very real indeed.
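The failure mode in that story—pure dictionary lookup with no sense disambiguation—can be sketched in a few lines. The dictionary entries below are invented for illustration (real machine translation is far more involved); the point is only that a word-by-word mapping has no way to notice it picked the wrong sense of “spirit”:

```python
# Illustrative sketch, not the actual software from the story: a word-by-word
# translator picks one gloss per word with no regard to context, so an
# ambiguous word like "spirit" round-trips through its wrong (alcohol) sense.

# Hypothetical mini-dictionaries; glosses chosen to reproduce the anecdote.
EN_TO_RU = {
    "spirit": "спирт",    # the alcohol sense, not the soul sense
    "willing": "крепкий", # "strong" -- a plausible mis-gloss
    "flesh": "мясо",      # "meat"
    "weak": "тухлый",     # "rotten" -- another mis-gloss
}
RU_TO_EN = {
    "спирт": "vodka",
    "крепкий": "good",
    "мясо": "meat",
    "тухлый": "rotten",
}

def naive_translate(sentence, dictionary):
    """Translate word by word, keeping words the dictionary doesn't know."""
    return " ".join(dictionary.get(w, w) for w in sentence.split())

english = "the spirit is willing but the flesh is weak"
back = naive_translate(naive_translate(english, EN_TO_RU), RU_TO_EN)
print(back)  # the vodka is good but the meat is rotten
```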
Some of these problems arise simply because the translation software doesn’t have an ontology and therefore doesn’t recognize category mistakes (like vodka not being the right kind of thing to be willing). The other sort of problem results from its failure to reason about probable utterances, which would require an even larger amount of knowledge. In practice, that’s the real issue: it’s just too much data to handcode. But it’s not a problem in principle, I would think.
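The ontology point can be made concrete with a toy selectional-restriction check. The type hierarchy and predicate restrictions here are invented for illustration; a system with even this crude machinery would flag “the vodka is willing” as a category mistake:

```python
# Minimal sketch: a crude type assignment plus selectional restrictions
# is enough to reject category mistakes like "the vodka is willing".
# All entries are hypothetical, for illustration only.

ISA = {
    "vodka": "substance",
    "spirit": "agent",   # in the "soul" sense
    "meat": "substance",
    "person": "agent",
}

# Which type a predicate's subject must have.
SUBJECT_TYPE = {
    "willing": "agent",  # only agents can be willing
    "rotten": "substance",
}

def category_ok(subject, predicate):
    """True if the subject's type satisfies the predicate's restriction."""
    required = SUBJECT_TYPE.get(predicate)
    return required is None or ISA.get(subject) == required

print(category_ok("spirit", "willing"))  # True
print(category_ok("vodka", "willing"))   # False: a category mistake
```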
The “cure cancer” → “kill everyone” example is perfectly silly, by the way. In order to get that, you have to reinterpret “cure” in a rather non-trivial way, which actually requires malicious intent.