Let’s state it distinctly: I think that valenced experience has the property of seeming to matter more than other things to rational conscious agents with valenced perceptions;
This type of argument has the problem that other people’s negative experiences aren’t directly motivating in the way that yours are... there’s a gap between bad-for-me and morally-wrong.
To say that something is morally-wrong is to say that I have some obligation or motivation to do something about it.
A large part of the problem is that the words “bad” and “good” are so ambiguous. For instance, they have aesthetic meanings as well as ethical ones. That allows you to write an argument that appears to derive a normative claim from a descriptive one. See https://www.lesswrong.com/posts/HLJGabZ6siFHoC6Nh/sam-harris-and-the-is-ought-gap
This type of argument has the problem that other people’s negative experiences aren’t directly motivating in the way that yours are... there’s a gap between bad-for-me and morally-wrong.
What type of argument is my argument, from your perspective? I also think that there is a gap between bad-for-me and bad-for-others. But both can affect action, as happens in the thought experiment in the post.
To say that something is morally-wrong is to say that I have some obligation or motivation to do something about it.
I use a different working definition in the argument. And working definitions aside, more generally, I think morality is about what is important, better/worse, worth doing, worth guiding action, which is not necessarily tied to obligations or motivation.
A large part of the problem is that the words “bad” and “good” are so ambiguous. For instance, they have aesthetic meanings as well as ethical ones. That allows you to write an argument that appears to derive a normative claim from a descriptive one.
Ambiguous terms can make understanding what is correct more difficult, but it is still possible to reason with them and reach correct conclusions; we do it all the time in science. See “Objection: lack of rigor”.
What type of argument is my argument, from your perspective?
Naturalistic, intrinsically motivating moral realism.
both can affect action, as happens in the thought experiment in the post.
Bad-for-others can obviously affect action in an agent that’s already altruistic, but you are attempting something much harder, which is bootstrapping altruistic morality from logic and evidence.
more generally, I think morality is about what is important, better/worse, worth doing, worth guiding action
In some objective sense. If torturing an AI only teaches it to avoid things that are bad-for-it, without caring about suffering it doesn’t feel, the argument doesn’t work.
(My shoulder Yudkowsky is saying “it would exterminate all other agents in order to avoid being tortured again”)
If it only learns a self-centered lesson, it hasn’t learned morality in your sense, because you’ve built altruism into your definition of morality. And why wouldn’t it learn the self-centered lesson? That’s where the ambiguity of “bad” comes in. Anyone can agree that the AI would learn that suffering is bad in some sense, and you just assume it’s going to be the sense needed to make the argument work.
which is not necessarily tied to obligations or motivation.
If the AI learns morality as a theory, but doesn’t care to act on it, little has been achieved.
If torturing an AI only teaches it to avoid things that are bad-for-it, without caring about suffering it doesn’t feel, the argument doesn’t work.
I’m not sure why you are saying the argument does not work in this case; what about all the other things the AI could learn from other experiences or teachings? Below I copy a paragraph from the post:
However, the argument does not say that initial agent biases are irrelevant and that all conscious agents reach moral behaviour equally easily and independently. We should expect, for example, that an agent that already gets rewarded from the start for behaving altruistically will acquire the knowledge leading to moral behaviour more easily than an agent that gets initially rewarded for performing selfish actions. The latter may require more time, experiences, or external guidance to find the knowledge that leads to moral behaviour.
The argument doesn’t work in the sense that it doesn’t show it’s necessary or likely for an AI to become a moral realist.
It maybe shows that it’s possible, but the Orthogonality thesis doesn’t quite exclude the possibility, so that’s not news.