Biological values might be greed, selfishness, and competition, while social values might be trust, altruism, and cooperation.
I believe that this “biological = selfish, social = cooperative” dichotomy is wrong. It is a popular mistake to make because it lends legitimacy to all kinds of political regimes, allowing them to take credit for everything good that the humans living under them do. It also allows one to express “edgy” opinions about human nature.
But if Homo sapiens actually had no biological foundations for trust, altruism, and cooperation, then… it would be extremely difficult for our societies to instill such values in humans; and most likely we wouldn’t even try, because we simply wouldn’t think of such things as desirable. The very idea that fundamentally uncooperative humans somehow decided to cooperate to create a society that brainwashes humans into being capable of cooperation is… somewhat self-contradictory.
(The usual argument is that it would make sense, even for a perfectly selfish asshole, to brainwash other people into becoming cooperative altruists. The problem with this argument is that the hypothetical perfectly selfish wannabe social engineer couldn’t accomplish such a project alone. And once many people start cooperating on brainwashing the next generation, what makes even more sense for a perfectly selfish asshole is to… shirk their share of the work of creating the utopia, or even try to somehow profit from undermining this communal effort.)
Instead, I think we have dozens of instincts which, under certain circumstances, nudge us towards more or less cooperation. Perhaps in the ancestral environment they were balanced in a way that maximized survival (sometimes by cooperating with others, sometimes at their expense), but currently the environment changes too fast for humans to adapt...
I agree with your main point: it is not obvious how training an AI on human preferences (which are sometimes “good” and sometimes “evil”) would help us achieve the goal of separating the “good” from the “evil” from the “neutral”.
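To make that concrete, here is a toy sketch of the worry (the setup, option names, and numbers are all invented for illustration; this is not anyone’s actual proposal). A crude preference model fit to human choices simply reproduces whatever mix of motives generated those choices, because nothing in the training signal marks which preferences were “good”:

```python
# Toy preference learner: score each option by how often humans chose it.
# The data deliberately mixes cooperative and selfish choices; the learner
# has no signal telling it which is which.
from collections import Counter

# Each record: (option chosen, option rejected) in a pairwise comparison.
preferences = [
    ("share_food", "hoard_food"),      # cooperative choice
    ("share_food", "hoard_food"),      # cooperative choice
    ("hoard_food", "share_food"),      # selfish choice
    ("help_rival", "sabotage_rival"),  # cooperative choice
    ("sabotage_rival", "help_rival"),  # selfish choice
    ("sabotage_rival", "help_rival"),  # selfish choice
]

wins = Counter(chosen for chosen, _ in preferences)
losses = Counter(rejected for _, rejected in preferences)
options = set(wins) | set(losses)

# "Learned reward" = empirical win rate of each option.
reward = {o: wins[o] / (wins[o] + losses[o]) for o in options}

for option, score in sorted(reward.items(), key=lambda kv: -kv[1]):
    print(f"{option}: learned reward {score:.2f}")
```

The model ends up scoring “sabotage_rival” exactly as highly as “share_food”, purely because the (mixed) humans chose each equally often; more data of the same kind would not tell it which preferences to keep and which to filter out.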
Thanks for responding, Viliam. Totally agree with you that “if Homo sapiens actually had no biological foundations for trust, altruism, and cooperation, then… it would be extremely difficult for our societies to instill such values”.
As you say, we have a blend of values that shift as required by our environment. I appreciate your agreement that it’s not really clear how training an AI on human preferences solves the issue raised here.
Of all the things I have ever discussed, in person or online, values are the most challenging. I was interested in human values for decades before AI came along, and historically there is very little hard science to be found on the subject. I’m delighted that AI is causing values to be studied widely for the first time; however, in my view we are only about where the ancient Greeks were with regard to the structure of matter, or where Gregor Mendel’s study of pea plants falls with regard to genetics. Both fields turned out to be unimaginably complex. Like them, I expect the study of values to go on indefinitely as we discover how complicated values really are.
I can see how the math involved likely precludes us from writing the necessary code by hand, and that “self-teaching” (I believe the technical term is inverse reinforcement learning) is the only way an AI could learn human values; but again, it seems as if Stuart’s approach is missing a critical component. I’ve finished his book now, and although he goes on at length about different scenarios, he never definitively addresses the issue I raise here. I think the analogy that children learn many things from their parents, not all of them “good”, applies here, and Stuart’s response to this problem, as it bears on his approach, still seems to gloss over the issue.