Why not copy the way children learn ethical codes?
Because the AI is not a child, doing the same thing would probably give different results.
I do not know how many dozens of pages of silly stories about AIs misinterpreting human commands I have read.
The essence of the problem is that the difference between “interpreting” and “misinterpreting” only exists in the mind of the human.
If I, as a computer programmer, say to a machine “add 10 to X” (while I really meant “add 100 to X”, but made a mistake) and the machine adds 10 to X, would you call that “misinterpreting” my command? Such things happen every day with existing programming languages, so there is nothing strange about expecting similar things to happen in the future.
From the machine's point of view, it was asked to “add 10 to X”, it added 10 to X, so it works correctly. If the human is frustrated because that’s not what they meant, that’s bad for the human, but the machine worked correctly according to its inputs.
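To make the point concrete, here is a minimal sketch of such a literal-minded machine. The command format “add <n> to <var>” and the function name `run_command` are made up for illustration; they do not come from any real system.

```python
def run_command(state, command):
    """Execute a command of the (hypothetical) form 'add <n> to <var>' exactly as written."""
    verb, amount, _to, var = command.split()
    if verb != "add":
        raise ValueError(f"unknown command: {command!r}")
    new_state = dict(state)  # leave the caller's state untouched
    new_state[var] = new_state.get(var, 0) + int(amount)
    return new_state


# The programmer meant "add 100 to X" but typed 10.
state = run_command({"X": 5}, "add 10 to X")
print(state["X"])  # 15: correct according to the input, wrong according to the intent
```

Nothing in the input carries the intended 100, so there is nothing for the machine to "misinterpret".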
You may be assuming a machine with a magical source of wisdom which could look at the command “add 10 to X” and somehow realize that the human would actually want to add 100, and would fix its own program (unless it is passive-aggressive and decides to follow the letter of the program anyway). But that’s not how machines work.
Let us try to free our minds from associating AGIs with machines. They are totally different from automata. AGIs will be creative, will learn to understand sarcasm, and will understand that people sometimes say one thing and mean another.
On your command to add 10 to X, an AGI would reply: “I love to work for you! At least once a day you try to fool me. I am not asleep, and I know that +100 would be correct. Shall I add 100?”
We have to start somewhere, and “we do not know what to do” is not starting.
Also, about this whole “what I really meant” thing: I think that we can break it down into specific failure modes and address them individually.
-One of the failure modes is poor contextual reasoning. In order to discern what a person really means, you have to reason about the context of their communication.
-Another failure mode involves not checking activities against norms and standards. There are a number of ways to arrive at the conclusion that Mom is to be rescued from the house alive and hopefully uninjured.
-The machines in these examples do not seem to forecast or simulate potential outcomes and judge them against external standards.
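As a rough sketch of the last two points, the snippet below forecasts the outcome of each candidate plan and only accepts plans whose outcome satisfies both the literal goal and an external norm. Everything here is an assumption made for illustration: the plan names, the hard-coded forecasts in `PLANS`, and the two checks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    mom_outside: bool
    mom_alive: bool
    mom_injured: bool


# Hypothetical forecasts; a real system would have to simulate these.
PLANS = {
    "blow up the house": Outcome(mom_outside=True, mom_alive=False, mom_injured=True),
    "lower her from a window": Outcome(mom_outside=True, mom_alive=True, mom_injured=True),
    "carry her out through the door": Outcome(mom_outside=True, mom_alive=True, mom_injured=False),
}


def meets_literal_goal(outcome):
    # The command as stated: get Mom out of the house.
    return outcome.mom_outside


def meets_norms(outcome):
    # The unstated standard: she must be alive afterwards.
    return outcome.mom_alive


def choose_plan(plans):
    acceptable = [
        name for name, outcome in plans.items()
        if meets_literal_goal(outcome) and meets_norms(outcome)
    ]
    # "Hopefully uninjured": prefer plans that forecast no injury.
    acceptable.sort(key=lambda name: plans[name].mom_injured)
    return acceptable[0] if acceptable else None


print(choose_plan(PLANS))  # carry her out through the door
```

A machine that only checked `meets_literal_goal` would consider blowing up the house just as good as the other plans. The hard part, of course, is where the forecasts and the norms come from, which this sketch simply assumes.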
“Magical source of wisdom?” No. What we are talking about is whether it is possible to design a certain kind of AGI: one that is safe and friendly.
We have shown this to be a complicated task. However, we have not fleshed out all the possible ways, and therefore we cannot falsify the claims of people who will insist that it can be done.
Poor contextual reasoning happens many times a day among humans. Our threads are full of it. In many cases the consequences are negligible. If the context is unclear and a phrase can be interpreted one way or the other, no magical wisdom is required:
Clarification is essential: ASK.
Clarification is nice to have: say something that does not reveal that you have no idea what is meant, and try to prompt the other person to reveal contextual information.
Clarification is unnecessary or even unwanted: stay in the dark, or keep the other person in the dark.
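The three cases above can be sketched as a small decision rule. The function name `clarification_strategy`, the `stakes` levels, and the mapping from stakes to the three cases are my own assumptions for illustration, not an established method.

```python
def clarification_strategy(ambiguous: bool, stakes: str) -> str:
    """Pick a response to an unclear phrase.

    stakes is a hypothetical label: "high", "medium", or "low".
    """
    if not ambiguous:
        return "proceed"  # nothing to clarify
    if stakes == "high":
        return "ask"      # clarification is essential
    if stakes == "medium":
        return "probe"    # nice to have: elicit context without asking outright
    return "proceed"      # unnecessary: live with the ambiguity


print(clarification_strategy(ambiguous=True, stakes="high"))
print(clarification_strategy(ambiguous=True, stakes="low"))
```

The rule itself is trivial; the difficulty is in estimating `ambiguous` and `stakes` in the first place.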
Making correct associations from few contextual hints is what AGI is about. Even today, narrow AI translation software is quite good at figuring out context by brute-force statistical similarity analysis.
Very good!
But be honest! Aren’t we (sometimes?) more like machines serving our genes and instincts than spiritual beings with free will?