Yep—humans can’t even talk about misunderstandings of human language without postulating an agent that thinks like a human.
Some “simple” algorithms for “understanding” language—such as “All I need is a dictionary and a grammar parser”—lead to predictable categories of mistakes. For example, there’s the story of the automatic translating software that, when asked to translate “The spirit is willing but the flesh is weak” into Russian and then back into English, returned “The vodka is good but the meat is rotten”.
Some of these problems arise simply because the translation software doesn’t have an ontology and therefore doesn’t recognize category mistakes (like vodka not being the right kind of thing to be willing). The other sort of problem results from its failure to reason about probable utterances, which would require an even larger amount of knowledge. In practice, that’s the real issue: it’s just too much data to hand-code. But it’s not a problem in principle, I would think.
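As a toy sketch of the kind of ontology check I have in mind (every sense label, type, and table here is invented purely for illustration; no real translation system works off a list this small), a simple selectional restriction could rule out the “vodka” reading:

```python
# Toy illustration only: a hand-coded ontology plus selectional
# restrictions. All senses, types, and predicates are made up.

# Each candidate word sense is tagged with a crude semantic type.
ONTOLOGY = {
    "spirit (soul)": "agent",
    "spirit (distilled liquor)": "substance",
    "flesh (living body)": "body",
    "flesh (meat)": "substance",
}

# Which semantic types each predicate can sensibly take as a subject.
SUBJECT_TYPES = {
    "willing": {"agent"},        # only agents can be willing
    "weak": {"agent", "body"},   # people and bodies can be weak
    "rotten": {"substance"},     # meat rots; souls do not
}

def plausible_senses(candidates, predicate):
    """Keep only the senses whose semantic type fits the predicate."""
    allowed = SUBJECT_TYPES[predicate]
    return [sense for sense in candidates if ONTOLOGY[sense] in allowed]

print(plausible_senses(["spirit (soul)", "spirit (distilled liquor)"], "willing"))
# ['spirit (soul)']  -- the "vodka" reading is a category mistake here
print(plausible_senses(["flesh (living body)", "flesh (meat)"], "weak"))
# ['flesh (living body)']
```

Handling the “probable utterances” half would need something beyond this: a model of which readings people actually tend to produce, which is where the amount of required knowledge really explodes.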
The “cure cancer” → “kill everyone” example is perfectly silly, by the way. To get that reading, you have to reinterpret “cure” in a rather non-trivial way, one that actually requires malicious intent.
I’ve heard that translation story since I was a kid. It sounds made-up. Does anyone know its actual source?
Apparently the details might well be made up, but the problem itself was real: http://www.snopes.com/language/misxlate/machine.asp
Very real indeed.