Is P(my-primary-goal-should-change) < P(my-primary-goal-should-change | the-evidence-in-this-scenario) for either agent? If not, this implies that the agents believe their primary goal to be arbitrary yet still worth keeping intact forever without change, e.g. pencils and paperclips are their basic morality and there was no simpler basic morality like “do what my creators want me to do.”
This strikes me as a little anthropomorphic. Maximizers would see their maximization targets as motivationally basic; they might develop quite complex behaviors in service to those goals, but there is no greater meta-motivation behind them. If there were, they wouldn’t be maximizers. This is so alien to human motivational schemes that I think using the word “morality” to describe it is already a little misleading, but insofar as it is a morality, it’s defined in terms of the maximization target: a paperclipper would consider rewriting its motivational core if and only if it could be convinced that doing so would ultimately generate more paperclips than the alternative.
I wouldn’t call that arbitrary, though, at least not from the perspective of the maximizer; doing so would be close to calling joy or happiness arbitrary from a human perspective, although there really isn’t any precise analogy in our terms.
Reading http://lesswrong.com/lw/t1/arbitrary/ makes me think that a rational agent, even if its greatest motivation is to maximize its paperclip production, would be able to determine that its desire for paperclips was more arbitrary than its tools for rationality. It could perform simulations or thought experiments to determine its most likely origins and find that while many possible origins lead to the development of rationality there are only a few paths that specifically generate paperclip maximization. Equally likely are pencil maximization and smiley-face maximization, and even some less likely things like human-friendliness maximization will use the same rationality framework because it works well in the Universe. There’s justification for rationality but not for paperclip maximization.
That also means that joy and happiness are not completely arbitrary for humans because they are tools used to maximize evolutionary fitness, which we can identify as the justification for the development of those emotions. Some of the acquired tastes, fetishes, or habits of humans might well be described as arbitrary, though.