It seems to me that Wei_Dai’s six hypotheses do a good job of covering a lot of the logical space. A good enough job that even though I’ve been professionally trained to think about this problem, I can’t come up with any significantly different suggestions.
But maybe I’m being unimaginative (a side effect of training, often enough). If you think these are merely six of countless hypotheses, do you think you could come up with, say, two more?
Two more possible positions:
There is a great variety of possible consistent preferences that intelligent beings can have, and there are no facts about what one should value that apply to all possible intelligent beings. However, there are still facts about rationality that do apply to all intelligent beings. Also, if you narrow the scope from “intelligent beings” to “humans”, most humans, when consistent, share similar preferences, and there exist facts about what they should value. (So, 4 or 5 for intelligent beings in general, but 1 for humans.)
Morality has nothing to do with value.
Your first suggestion isn’t an additional alternative, it’s just a subdivision within 4 or 5.
I’m not sure I understand the second one. Are you trying to draw the distinction between consequentialist and non-consequentialist moralities? If so, I think that is usually considered a distinction in normative ethics rather than metaethics. Although I repeatedly use “preferences” and “values” in this post, that was just for convenience rather than to imply that morality must have something to do with values.
Perhaps, but it seems like there’s a substantive difference between those who believe there are no facts about what all intelligent beings should value and those who believe that, in addition, there are also no facts about what humans should value.
Could you give an example of one of these positions put in terms that would be inclusive of both consequentialist and non-consequentialist ethical theories?
Sure. 1. Most intelligent beings in the multiverse end up sharing similar moralities. This came about because there are facts about what morals one should have. For example, suppose there are facts about what preferences one should have along with facts about what decision theory one should use or what prior one should have, and species that manage to build intergalactic civilizations (or the equivalent in other universes) tend to discover all of these facts. There are occasional paperclip maximizers that arise, but they are a relatively minor presence or tend to be taken over by more sophisticated minds.
OP discusses “facts about what everyone should value” (which is an odd use of the term “fact”, by the way). His classification is:
There is a unique set of values, which
1. is a limit
2. is an attractor of sorts
There is no unique set of values:
3. (I failed to understand what this item says)
4. but you can come up with your own “consistent” (in some sense) set of preferences to optimize for
5. you cannot come up with a consistent set of values (preferences?), though you can optimize for each one separately
6. value is not something you can optimize for at all.
Eliezer’s position is something like “1. but limited to humans/FAI only”, which seems like a separate hypothesis. Other options off the top of my head are that there can be multiple self-consistent limits or attractors, or that the notion of value only makes sense for humans or some subset of them.
Or maybe a hard enough optimization attempt disturbs the value enough to change it, so one can only optimize so much without changing preferences. Or maybe the way to meta-morality is maximizing the diversity of moralities by creating/simulating a multiverse with all the ethical systems you can think of, consistent or inconsistent. Or maybe we should (moral “should”) Matrix-like break out of the simulation we are living in and learn about the level above us. Or that the concept of “intelligent being” is inconsistent to begin with. Or...
Options are many and none are testable, so, while it’s good to ask grand questions, it’s silly to try to give grand answers or classification schemes.
To fill in the gap in 3: There is no unique set of values, but there is a unique process for deriving an optimal set of consistent preferences (up to some kind of isomorphism), though distinct individuals will get different results after carrying out this process.
As opposed to 4, which states that there is some set of processes that can derive consistent preferences but that no claims about which of these processes is best can be substantiated.
And as I said above, Eliezer believes something like 3, but insists on the caveat that if we consider only humans, all consistent sets of preferences generated will substantially overlap, and that therefore we can create an FAI whose consistent preferences will entirely overlap that set.