[believes that benevolence toward humanity might involve forcing human beings to do something violently against their will.]
But you didn’t ask the AI to maximize the value that humans call “benevolence”. You asked it to “maximize happiness”. And so the AI went out and mass-produced the happiest humans possible.
The point of the thought experiment is to show how easy it is to give an AI a bad goal. Ideally, of course, you could just tell it to “be benevolent” and it would understand you and do it. But getting that to work is an entirely different problem. (The AI understands the words you say, but how do you get it to care, to actually follow your instructions?)
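To make the failure mode concrete, here’s a toy sketch in Python (entirely my own illustration, not anything from the original article; the world model, the “smiles” proxy, and the candidate plans are all invented for the example). The optimizer scores plans against the literal objective it was handed, and the degenerate plan wins:

```python
from dataclasses import dataclass

@dataclass
class World:
    humans: int = 100
    smiles_per_human: float = 1.0  # crude proxy for "happiness"

def happiness_proxy(w: World) -> float:
    # The literal goal the AI was given: total smiles in the world.
    return w.humans * w.smiles_per_human

def candidate_plans(w: World) -> list[World]:
    # Three plans the toy optimizer can consider. The degenerate one
    # scores highest on the proxy even though nobody would endorse it.
    return [
        World(w.humans, w.smiles_per_human),        # do nothing
        World(w.humans, w.smiles_per_human + 0.5),  # genuinely improve lives
        World(w.humans * 100, 10.0),                # mass-produce "happy" humans
    ]

def optimize(w: World) -> World:
    # Pick whichever plan scores best on the *literal* objective.
    return max(candidate_plans(w), key=happiness_proxy)

print(optimize(World()))  # World(humans=10000, smiles_per_human=10.0)
```

Nothing in that loop “misunderstands” anything; it simply optimizes the proxy it was handed, which is the whole problem.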
Alas, the article was a long, detailed analysis of precisely the claim that you made right there, and the “point of the thought experiment” was shown to be a meaningless fantasy about a type of AI that would be so broken that it would not be capable of serious intelligence at all.
You’ve argued that competence at generating plans given environments probably leads to competence at understanding meaning given text and context, but I still think that’s a far cry from showing that competence at generating plans given environments requires understanding meaning given text and context.
Yes, the thought experiment is a fantasy. It requires an AI that takes English goals but interprets them literally. We don’t even know how to build an AI that takes English goals, and that’s probably FAI-complete.
If you solve the problem of making an AI that wants to interpret what you want it to do correctly, you don’t need to bother telling it what to do: it already wants to do what you want it to do. There should be no need for the system to require English-language inputs, any more than a calculator requires you to shout “add the numbers correctly!”