I will suggest that at least part of the problem is your utility function: defining it in terms of the average utility of your future copies means you should spend all your money on lottery tickets and then commit suicide in all the branches where you didn’t win. The operation of dividing the total utility by the number of copies is the problematic part, and in my opinion entirely without justification.
(Granted we might be able to use non-indexical utility to patch this particular failure mode, but I put it to you that even for someone without any current close relationships, quantum suicide is still a bad enough idea to serve as a reductio ad absurdum.)
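To make that concrete, here's a toy numerical sketch (made-up numbers, and my own framing rather than anyone's actual utility function) of how the averaging rule ranks "buy a ticket and quantum-suicide in the losing branches" above doing nothing, while a probability-weighted total does not:

```python
# Toy numbers, purely illustrative: a quantum lottery with a tiny win
# probability, scored two different ways over the agent's future copies.
P_WIN = 1e-6          # assumed probability of the winning branch
U_BASELINE = 1.0      # utility of an ordinary copy that skips the lottery
U_JACKPOT = 100.0     # utility of a copy that wins the jackpot
U_DEAD = 0.0          # utility assigned to a branch left with no copy

def average_over_surviving_copies(policy):
    """The problematic rule: mean utility of whatever copies still exist."""
    if policy == "no lottery":
        return U_BASELINE            # every copy is a baseline copy
    # "lottery + suicide": only winning copies survive, so the average is
    # taken over winners alone and the losing branches simply drop out.
    return U_JACKPOT

def probability_weighted_total(policy):
    """Utility summed over branches, weighted by their probability."""
    if policy == "no lottery":
        return U_BASELINE
    return P_WIN * U_JACKPOT + (1 - P_WIN) * U_DEAD

for rule in (average_over_surviving_copies, probability_weighted_total):
    print(rule.__name__,
          {p: rule(p) for p in ("no lottery", "lottery + suicide")})
# Averaging ranks "lottery + suicide" at 100.0 against 1.0 for doing nothing;
# the probability-weighted rule ranks it at ~0.0001 against 1.0.
```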
I’d actually defined the utility of a universe as the average utility of my copies therein, so, depending on what utility I give to universes empty of copies of me, the suicide idea might not be a good one to follow.
But I freely admit my utility function is sub-par and doesn't work. The more severe problem, though, is not that it leads to crazy answers; the problem is that it allows money pumping. Crazy answers can be patched, but a vulnerability to constantly losing utility is a gaping hole in the theory.
Why not the sum?
Because I get to say what my intuitions are :-) And they are that if I have a hundred copies all getting exactly the same nice experience, then this is just one positive experience. However, in my next post, I don't get to define my utility based on my intuitions; I have to follow some assumptions, and so I don't end up in the same place...
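As a minimal sketch of the contrast (made-up numbers; the "identical experiences count once" rule is just a crude stand-in for that intuition, not a formal proposal):

```python
# Illustrative only: three ways to score 100 copies all having exactly the
# same nice experience, worth 5.0 to any one copy.
experiences = [("the same nice experience", 5.0)] * 100

total    = sum(u for _, u in experiences)      # sum rule: 500.0
average  = total / len(experiences)            # average rule: 5.0
distinct = sum(dict(experiences).values())     # identical experiences count once: 5.0

print(total, average, distinct)
```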
Right, the fact that dividing by the number of copies of you sometimes gives division by zero is another good reason for not doing it :-)
But your assessment of problem importance is interesting. I would’ve said it the other way around. In practice, we tend to quickly notice when we are being money pumped, and apply a patch on the fly, so at worst we only end up losing a bit of money, which is recoverable. Crazy policies on the other hand… usually do little damage because we compartmentalize, but when we fail to compartmentalize, the resulting loss may not be recoverable.
This is a valid point for humans (who are not utilitarians at all), but when thinking of an ideal, AI-safe ethics, money pumps are a major flaw: the AI will get pumped again, and again, and again, and lose utility. Alternatively, the AI will patch its system on the fly; and if we don't know how it does this, it could end up with a crazy policy. It's unpredictable.
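To spell out the "pumped again and again" dynamic, here's a generic toy money pump (the textbook cyclic-preferences version, not the specific pump the averaging utility permits): the agent pays a small fee for every trade it regards as an upgrade and is strictly poorer after every full cycle.

```python
# Generic money-pump toy: an agent with cyclic preferences A < B < C < A
# pays a small fee for each "upgrade" and loses money on every full cycle.
# Item names, fee, and starting money are purely illustrative.
prefers = {("A", "B"): "B", ("B", "C"): "C", ("C", "A"): "A"}  # cyclic!
FEE = 1.0

holding, money = "A", 100.0
for step in range(9):                                  # three full cycles
    for pair, better in prefers.items():
        if holding in pair and better != holding:
            holding, money = better, money - FEE       # pays to "upgrade"
            break
    print(f"step {step + 1}: holding {holding}, money {money:.0f}")
# After each full cycle the agent holds exactly what it started with,
# but is 3 units poorer; nothing stops the cycle from repeating forever.
```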
Maybe; honestly, nobody knows yet. We're still too far from being able to build an AI for which ethics would even be a relevant concept to say what such an ethics should look like. For all we really know, if and when we get to that point, it might become apparent that a system of supervisor modules performing on-the-fly patching to reliably keep things within sensible bounds is a better solution than trying for logically perfect ethics, for any mind operating under physically realistic constraints on data and computing power.