Via David Gerard’s forum, I learned of a recent article called “The questions ChatGPT shouldn’t answer”. It’s a study of how ChatGPT replies to ethical dilemmas, written with an eye on OpenAI’s recent Model Spec, and the author’s conclusion is that AI shouldn’t answer ethical questions at all, because (my paraphrase) ethical intelligence is acquired by learning how to live, and of course that’s not how current AI acquires its ethical opinions.
Incidentally, don’t read this article expecting scholarship; it’s basically a sarcastic op-ed. I was inspired to see if GPT-4o could reproduce the author’s own moral framework. It tried, but its imitations of her tone stood out more. My experiment was even less scientific and systematic than hers, and yet I found her article, and 4o’s imitation, tickling my intuition in a way I wish I had time to overthink.
To begin with, it would be good to understand better what is going on when our AIs produce ethical discourse or adopt a style of writing, so that we really understand how it differs from the way humans do it. The humanist critics of AI are right enough when they point out that AI lacks almost everything that humans draw upon. But their favorite explanation of the mechanism that AI does employ is just “autocomplete”. Eventually they’ll have to develop a more sophisticated account, perhaps drawing upon some of the work in AI interpretability. But is interpretability research anywhere near explaining an AI’s metaethics or its literary style?
Thirty years ago Bruce Sterling gave a speech in which he said that he wouldn’t want to talk to an AI about its “bogus humanity”; he would want the machine to be honest with him about its mechanism, its “social interaction engine”. But that was the era of old-fashioned rule-based AI. Now we have AIs which can talk about their supposed mechanism as glibly as they can pretend to have a family, a job, and a life. But the talk about the mechanism is no more honest than the human impersonation; there’s no sense in which it brings the user closer to the reality of how the AI works. It’s just another mask that we know how to induce the AI to wear.
Looking at things from another angle, the idea that authentic ethical thinking arises in human beings from a process of living, learning, and reflecting reminds me of how Coherent Extrapolated Volition is supposed to work. It’s far from identical; in particular, CEV is supposed to arrive at the human-ideal decision procedure without much empirical input beyond a knowledge of the human brain’s cognitive architecture. Instead, what I see is an opportunity for taxonomy: comparative studies in decision theory that encompass both human and AI, and which pay attention to how the development and use of the decision procedure is embedded in the life cycle (or product cycle) of the entity.
This is something that can be studied computationally, but there are conceptual and ontological issues too. Ethical decision-making is only one kind of normative decision-making (for example, there are also norms for aesthetics, rationality, and lawfulness); normative decision-making is only one kind of action-determining process (some of which involve causality passing through the self, while others don’t). Some forms of “decision procedure” intrinsically involve consciousness; others are purely computational. And ideally one would want to be clear about all this before launching a superintelligence. :-)
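Just to gesture at what “studied computationally” might mean here, a minimal sketch follows. Every class, field, and example entry is my own hypothetical illustration of the distinctions above (norm family, causality through the self, consciousness, mode of acquisition), not anything taken from the article or from CEV; the point is only that human and AI decision procedures could at least be tabulated against the same taxonomy.

```python
# Hypothetical sketch: tagging decision procedures with the distinctions
# drawn above so that human and AI cases can be compared side by side.
from dataclasses import dataclass
from enum import Enum, auto

class NormFamily(Enum):
    ETHICAL = auto()
    AESTHETIC = auto()
    RATIONAL = auto()
    LEGAL = auto()
    NONE = auto()  # action-determining but not governed by any norm

@dataclass
class DecisionProcedure:
    name: str
    norm_family: NormFamily
    causality_through_self: bool   # does the process route through the agent's self?
    involves_consciousness: bool   # assumed answerable, for the sake of the sketch
    acquired_by: str               # e.g. "lived experience", "pretraining + RLHF"

# Illustrative entries only; the classifications are placeholders, not claims.
procedures = [
    DecisionProcedure("human moral judgment", NormFamily.ETHICAL, True, True, "lived experience"),
    DecisionProcedure("LLM ethical answer", NormFamily.ETHICAL, False, False, "pretraining + RLHF"),
    DecisionProcedure("reflex withdrawal", NormFamily.NONE, False, False, "innate"),
]

# Compare only the procedures that answer to ethical norms.
for p in procedures:
    if p.norm_family is NormFamily.ETHICAL:
        print(p.name, "| acquired by:", p.acquired_by, "| conscious:", p.involves_consciousness)
```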