To what I take as the bottom line: we should learn more about ethics.
More thoughts on how the above scenario relates to existing alignment debates will go in a separate comment, since it’s a separate topic.
I’ve spent a fair amount of time reading and debating theories of ethics. Here’s my conclusion thus far:
I think what we mean by ethics is “how to win friends and influence people”. I think it’s a term for trying to codify our instincts about how to behave. And our instincts regarding social behavior are mostly directed at inclusive fitness. This is served by winning friends (allies) and getting people to do what you want. This sounds cynical, and very much not like ethics in its optimistic sense, but I think it actually converges with some of our more optimistic thinking.
Dale Carnegie’s “How to Win Friends and Influence People” is much less cynical than the title suggests. He’s actually focused on being an empathetic conversationalist. Most people aren’t good at this, so doing it makes people like you, and more inclined to do what you want because they like you. But he’s not suggesting you merely pretend; it’s easiest and most fun to be sincerely interested in people’s topics and their wellbeing.
So I think the full answer to “how to win friends and influence people” includes actually being a good person in most situations. It’s certainly easiest to be a good person consistently, so that you don’t need to worry about keeping lies straight, or hiding instances when you weren’t a good person. That protects your reputation as a good person and friend, thereby helping you win friends and influence people. But those pushes toward a pro-social meaning of ethics may evaporate with smarter agents, and in some situations where your gain is more important than your reputation (for instance, if you could just take over the world instead of needing allies).
If we broaden the definition of ethics even further to “how to get what you want”, it sounds even more cynical, but might not be. Getting what you want in a larger society may include creating and supporting a system that rewards cooperators and punishes defectors. That seems to produce win-win scenarios, where more people are likely to get more of what they want (including not constantly struggling for power, and fearing violence).
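The win-win claim here is the standard game-theoretic one: in repeated interactions, strategies that reciprocate cooperation and punish defection outperform pure exploitation. A toy iterated prisoner’s dilemma (an illustrative sketch I’m adding, not part of the original argument; strategy names and payoffs are the conventional ones) shows it:

```python
# Standard prisoner's dilemma payoffs: mutual cooperation (3,3),
# mutual defection (1,1), unilateral defection (5,0).
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def tit_for_tat(opponent_history):
    """Cooperate first, then mirror the opponent's last move."""
    return "C" if not opponent_history else opponent_history[-1]

def always_defect(opponent_history):
    return "D"

def play(strategy_a, strategy_b, rounds=100):
    """Total scores for two strategies over repeated rounds."""
    score_a = score_b = 0
    hist_a, hist_b = [], []  # each side's record of the opponent's moves
    for _ in range(rounds):
        move_a = strategy_a(hist_a)
        move_b = strategy_b(hist_b)
        pa, pb = PAYOFF[(move_a, move_b)]
        score_a += pa
        score_b += pb
        hist_a.append(move_b)
        hist_b.append(move_a)
    return score_a, score_b

# Two reciprocators settle into mutual cooperation (win-win):
print(play(tit_for_tat, tit_for_tat))    # (300, 300)
# A defector gains once, then gets punished every round after:
print(play(always_defect, tit_for_tat))  # (104, 99)
```

The defector squeezes out a small edge over this particular opponent, but both end up far below the 300 points mutual cooperators earn, which is the sense in which a system that punishes defection makes more people better off.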
Such a system of checks needs to change to work for AI agents that can’t be reliably punished for defection in the way people are (by social reputation and criminal codes).
But by this formulation, “ethics” is almost orthogonal to AGI alignment. Unless we assume that there’s one true universal ethics (beyond the above logic), we want a machine that isn’t “ethical” in the way we are, but rather one that wants the best for us, by our own judgment of “best”.