When a paperclip maximizer and a pencil maximizer do different things, they are not disagreeing about anything, they are just different optimization processes.
Just to make sure I understand this, suppose a pencil maximizer and a paperclip maximizer meet each other while tiling deep space. They communicate (or eat parts of each other and evaluate the algorithms embedded therein) and discover that they are virtually identical except for the pencil/paperclip preference. They further discover that each was created by a species of sentient beings that failed the AI test, and that the two species originated in different galaxies. The similarities between the two species far outweigh the difference in pencil/paperclip preference. Neither can find a flaw in the rationality algorithm that the other employs. Is P(my-primary-goal-should-change) < P(my-primary-goal-should-change | the-evidence-in-this-scenario) for either agent? If not, this implies that the agents believe their primary goal to be arbitrary yet still worth keeping intact forever without change, e.g. pencils and paperclips are their basic morality and there was no simpler basic morality like “do what my creators want me to do”. If there were such a simpler basic morality, the probability assigned to the paperclip/pencil maximization goal should receive a significant update upon discovering that two different species with so much in common accidentally ordered their own destruction by arbitrary artifacts.
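To make the inequality in the question concrete, here is a minimal sketch of the update using Bayes' rule for a binary hypothesis. The function name `posterior` and every number in it are hypothetical, chosen only for illustration; nothing in the scenario fixes them. The point it shows is that P(H) < P(H | E) holds exactly when the evidence is more likely under H than under not-H.

```python
def posterior(prior: float, p_evidence_if_true: float, p_evidence_if_false: float) -> float:
    """Bayes' rule: P(H | E) for a binary hypothesis H and evidence E."""
    numerator = p_evidence_if_true * prior
    denominator = numerator + p_evidence_if_false * (1.0 - prior)
    return numerator / denominator


# H = "my primary goal should change". All values below are made-up assumptions.
prior = 0.01

# Case 1: the agent treats the meeting as evidence that is more likely
# if H is true (for example, if some simpler goal sat behind the paperclips).
updated = posterior(prior, p_evidence_if_true=0.9, p_evidence_if_false=0.3)

# Case 2: the agent treats the meeting as equally likely either way,
# so the likelihood ratio is 1 and the probability does not move.
unchanged = posterior(prior, p_evidence_if_true=0.5, p_evidence_if_false=0.5)

print(updated > prior)   # the posterior rises only in case 1
print(unchanged, prior)  # these agree up to floating-point error
```

On this reading, the question is whether either agent assigns the scenario a likelihood ratio different from 1 with respect to its own goal.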
Also, imagine that our basic morality is not as anthropomorphically nice as “What will save my friends, and my people, from getting hurt? How can we all have more fun? …” and is instead “What will most successfully spread my genetic material?” The nice anthropomorphic questions we are aware of may only be a good-enough approximation of our true basic morality, one that we don’t have (or need) conscious access to. Why should we arbitrarily accept the middle level instead of accepting the “abortion is wrong” or “maximize our genetic material” morals at face value?
I find it interesting that single cells got together and built themselves an almost-friendly AI for the propagation of genetic material that is now talking about replacing genetic material with semiconductors. Or was it the Maximization Of Maximization Memes meme that got the cells going in the first place and is still wildly successful and planning its next conquest?
Is P(my-primary-goal-should-change) < P(my-primary-goal-should-change | the-evidence-in-this-scenario) for either agent? If not, this implies that the agents believe their primary goal to be arbitrary yet still worth keeping intact forever without change, e.g. pencils and paperclips are their basic morality and there was no simpler basic morality like “do what my creators want me to do”
This strikes me as a little anthropomorphic. Maximizers would see their maximization targets as motivationally basic; they might develop quite complex behaviors in service to those goals, but there is no greater meta-motivation behind them. If there were, they wouldn’t be maximizers. This is so alien to human motivational schemes that I think using the word “morality” to describe it is already a little misleading, but insofar as it is a morality, it’s defined in terms of the maximization target: a paperclipper would consider rewriting its motivational core if and only if it could be convinced that doing so would ultimately generate more paperclips than the alternative.
I wouldn’t call that arbitrary, though, at least not from the perspective of the maximizer; doing so would be close to calling joy or happiness arbitrary from a human perspective, although there really isn’t any precise analogy in our terms.
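The decision rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Proposal` type, the `should_rewrite` function, and all of the paperclip estimates are hypothetical. What it is meant to show is that a proposed change of goal is scored by the agent's current goal, in paperclips, and by nothing else.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    """A candidate motivational core, with the agent's own paperclip forecast."""
    name: str
    expected_paperclips: float  # hypothetical estimate from the agent's world model


def should_rewrite(keep: Proposal, rewrite: Proposal) -> bool:
    # The comparison is made in paperclips, i.e. under the CURRENT goal.
    # Whether the goal looks arbitrary from outside never enters into it.
    return rewrite.expected_paperclips > keep.expected_paperclips


# Made-up numbers, for illustration only.
keep_goal = Proposal("keep maximizing paperclips", 1e30)
become_pencil_maximizer = Proposal("switch to maximizing pencils", 0.0)
more_clips_rewrite = Proposal("a rewrite that happens to yield more paperclips", 2e30)

print(should_rewrite(keep_goal, become_pencil_maximizer))  # rejected: fewer paperclips
print(should_rewrite(keep_goal, more_clips_rewrite))       # accepted: more paperclips
```

Under this sketch, learning that the pencil maximizer has a similar origin changes the outcome only if it changes the paperclip forecasts.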
Reading http://lesswrong.com/lw/t1/arbitrary/ makes me think that a rational agent, even if its greatest motivation is to maximize its paperclip production, would be able to determine that its desire for paperclips was more arbitrary than its tools for rationality. It could perform simulations or thought experiments to determine its most likely origins and find that, while many possible origins lead to the development of rationality, only a few paths specifically generate paperclip maximization. Pencil maximization and smiley-face maximization are equally likely, and even some less likely things like human-friendliness maximization will use the same rationality framework because it works well in the Universe. There’s justification for rationality but not for paperclip maximization.
That also means that joy and happiness are not completely arbitrary for humans because they are tools used to maximize evolutionary fitness, which we can identify as the justification for the development of those emotions. Some of the acquired tastes, fetishes, or habits of humans might well be described as arbitrary, though.