Stuart Armstrong has proved some theorems showing that it’s really really hard to get to the Pareto frontier unless you’re adding utility functions in some sense, with the big issue being the choice of scaling factor. I’m not sure even so, on a moral level—in terms of what I actually want—that I quite buy Armstrong’s theorems taken at face value, but on the other hand it’s hard to see how, if you had a solution that wasn’t on the Pareto frontier, agents would object to moving to the Pareto frontier so long as they didn’t get shafted somehow.
It occurred to me (and I suggested to Armstrong) that I wouldn’t want to trade off whole stars turned into paperclips against individual small sentients on an even basis when dividing the gains from trade, even if I came out ahead on net against the prior state of the universe before the trade. I.e., if we were executing a gainful trade and the question was how to split the spoils, and some calculation took place which ended up with the paperclip maximizer gaining a whole star’s worth of paperclips from the spoils every time I gain one small-sized eudaimonic sentient, then my primate fairness calculator wants to tell the maximizer to eff off and screw the trade. I suggested to Armstrong that the critical scaling factor might revolve around equal amounts of matter affected by the trade, and you can also see how something like that might emerge if you were conducting an auction between many superintelligences (they would purchase matter affected where it was cheapest). Possibilities like this tend not to be considered in such theorems, and when you ask which axiom they violate it’s often an axiom that turns out to not be super morally appealing.
Independence of irrelevant alternatives is a common hinge on which such theorems fail when you try to do morally sensible-seeming things with them. One of the intuition pumps I use for this class of problem is to imagine an auction system in which all decision systems get to spend the same amount of money (hence no utility monsters). It is not obvious that you should morally have to pay the money only to make alternatives happen, and not to prevent alternatives that might otherwise be chosen. But then the elimination of an alternative not output by the system can, and morally should, affect how much money someone must pay to prevent it from being output.
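The auction intuition pump above can be made concrete with a toy sketch. Everything specific here is an assumption for illustration and not something the comment specifies: the bidding rule (each agent splits an equal budget across alternatives in proportion to how far each sits above or below that agent's average utility, so negative bids pay to prevent an outcome), the three agents, and their utility numbers.

```python
from fractions import Fraction

BUDGET = Fraction(1)  # assumption: every agent gets the same budget


def bids(utilities, budget=BUDGET):
    """Hypothetical bidding rule: spend the budget in proportion to each
    alternative's deviation from this agent's mean utility. Positive bids
    promote an alternative, negative bids pay to prevent it."""
    mean = Fraction(sum(utilities.values())) / len(utilities)
    deviation = {alt: Fraction(u) - mean for alt, u in utilities.items()}
    total = sum(abs(d) for d in deviation.values())
    if total == 0:  # an indifferent agent spends nothing
        return {alt: Fraction(0) for alt in utilities}
    return {alt: budget * d / total for alt, d in deviation.items()}


def auction_winner(agents, alternatives):
    """Return the alternative with the highest net bid across all agents."""
    net = {alt: Fraction(0) for alt in alternatives}
    for utilities in agents:
        restricted = {alt: utilities[alt] for alt in alternatives}
        for alt, bid in bids(restricted).items():
            net[alt] += bid
    return max(net, key=net.get)


# Made-up utilities: two agents mildly prefer X to Y but strongly oppose Z;
# one agent prefers Y and is lukewarm about Z.
AGENTS = [
    {"X": 10, "Y": 8, "Z": 0},
    {"X": 10, "Y": 8, "Z": 0},
    {"X": 0, "Y": 1, "Z": Fraction(1, 2)},
]

with_z = auction_winner(AGENTS, ["X", "Y", "Z"])
without_z = auction_winner(AGENTS, ["X", "Y"])
print(with_z, without_z)
```

Under these made-up numbers Y wins while Z is on the menu, because the first two agents spend half their budget blocking Z. Once Z is removed, that money is freed and X wins. Z never wins in either run, so this rule violates independence of irrelevant alternatives in exactly the way the paragraph describes.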
Stuart Armstrong has proved some theorems showing that it’s really really hard to get to the Pareto frontier unless you’re adding utility functions in some sense, with the big issue being the choice of scaling factor.
He knows. Also, why do you say “really really hard” when the theorem says “impossible”?
It occurred to me (and I suggested to Armstrong) that I wouldn’t want to trade off whole stars turned into paperclips against individual small sentients on an even basis when dividing the gains from trade, even if I came out ahead on net against the prior state of the universe before the trade.
I’m confused. How is this incompatible with maximizing a sum of utility functions with the paperclip maximizer getting a scaling factor of tiny or 0?
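A minimal sketch of the point in this question, under invented numbers: maximizing a weighted sum of the two utility functions lands on the Pareto frontier whatever the scaling factor is, and a tiny or zero weight on the paperclip maximizer just moves the chosen point toward the human side. The outcome names and utility pairs below are hypothetical, not taken from Armstrong's theorems.

```python
# Each outcome maps to (paperclip utility, human utility). All values are
# made up for illustration.
OUTCOMES = {
    "status_quo": (0, 0),
    "wasteful": (3, 3),
    "all_clips": (10, 1),
    "split": (6, 6),
    "all_eudaimonia": (1, 10),
}


def pareto_frontier(outcomes):
    """Names of outcomes that no other outcome weakly beats on both
    utilities while strictly beating on at least one."""
    frontier = set()
    for name, (clips, humans) in outcomes.items():
        dominated = any(
            c >= clips and h >= humans and (c > clips or h > humans)
            for c, h in outcomes.values()
        )
        if not dominated:
            frontier.add(name)
    return frontier


def best_by_weighted_sum(outcomes, w_clip, w_human):
    """Outcome maximizing w_clip * u_clip + w_human * u_human."""
    return max(
        outcomes,
        key=lambda name: w_clip * outcomes[name][0] + w_human * outcomes[name][1],
    )


frontier = pareto_frontier(OUTCOMES)
for w_clip in (1, 0.01, 0):
    choice = best_by_weighted_sum(OUTCOMES, w_clip, 1)
    print(w_clip, choice, choice in frontier)
```

With equal weights the sketch picks the split. With a paperclip weight of 0.01 or 0 it picks the outcome that favors sentients, and each choice is still on the frontier. One caveat of this toy version: with a weight of exactly 0, a tie could select a weakly dominated outcome, and the numbers here are chosen so that no tie occurs.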