do you think you could come up with, say, two more?
OP discusses “facts about what everyone should value” (an odd use of the term “fact”, by the way). His classification is:

There is a unique set of values, which either

1. is a limit, or
2. is an attractor of sorts.

Or there is no unique set of values, and one of the following holds:

3. (I failed to understand what this item says)
4. you can come up with your own “consistent” (in some sense) set of preferences to optimize for;
5. you cannot come up with a consistent set of values (preferences?), though you can optimize for each one separately; or
6. value is not something you can optimize for at all.
Eliezer’s position is something like “1. but limited to humans/FAI only”, which seems like a separate hypothesis. Other options off the top of my head are that there can be multiple self-consistent limits or attractors, or that the notion of value only makes sense for humans or some subset of them.
Or maybe a hard enough optimization attempt disturbs the value enough to change it, so one can only optimize so much without changing one's preferences. Or maybe the way to meta-morality is to maximize the diversity of moralities by creating/simulating a multiverse containing all the ethical systems you can think of, consistent or inconsistent. Or maybe we should (moral “should”) break out of the simulation we are living in, Matrix-style, and learn about the level above us. Or maybe the concept of “intelligent being” is inconsistent to begin with. Or...
The options are many and none of them are testable, so while it’s good to ask grand questions, it’s silly to try to give grand answers or classification schemes.
To fill in the gap in 3: There is no unique set of values, but there is a unique process for deriving an optimal set of consistent preferences (up to some kind of isomorphism), though distinct individuals will get different results after carrying out this process.
As opposed to 4, which states that there is some set of processes that can derive consistent preferences but that no claims about which of these processes is best can be substantiated.
And as I said above, Eliezer believes something like 3, but insists on the caveat that, if we consider only humans, all the consistent sets of preferences generated will substantially overlap, and that we can therefore create an FAI whose consistent preferences fall entirely within that overlap.