“I am precommitting that anyone who cloned themselves a trillion times gets all their clones killed. This precommitment will prevent anyone who genuinely understands my source code from having cloned themselves in the past, and will therefore increase utility.”
Wait, increase utility according to what utility function? If it’s an aggregate utility function where Dr. Evil has 99% weight, then why would that precommitment increase utility?
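To make the worry concrete, here is a minimal sketch with made-up numbers (none of them are from the original thread): if the aggregator weights every person equally, Dr. Evil’s trillion clones end up carrying roughly 99% of the total weight, so any outcome that kills the clones scores lower aggregate utility, not higher.

```python
# Illustrative numbers only: why the precommitment fails under an
# aggregate utility function dominated by Dr. Evil's clones.

CLONE_WEIGHT = 0.99   # combined weight of Dr. Evil's trillion clones
OTHERS_WEIGHT = 0.01  # combined weight of everyone else

def aggregate_utility(u_clones: float, u_others: float) -> float:
    """Weighted sum of the two blocs' (hypothetical) utilities."""
    return CLONE_WEIGHT * u_clones + OTHERS_WEIGHT * u_others

# Outcome A: honor the precommitment and kill all the clones.
kill_clones = aggregate_utility(u_clones=0.0, u_others=1.0)   # 0.01
# Outcome B: spare the clones.
spare_clones = aggregate_utility(u_clones=1.0, u_others=0.5)  # 0.995

assert spare_clones > kill_clones  # the precommitment lowers aggregate utility
```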
You’re right. It would commit to stopping anyone who tries the same thing later, but it wouldn’t apply that commitment retroactively. The original comment is wrong.
Wait, increase utility according to what utility function?
The current CEV of humanity, or your best estimate of it, I think. If someone threatens to destroy the world unless we kill orphans, saving the world is the higher-utility choice, but we still want to punish the guy who made it so; see the toy comparison below.
I think that’s where the idea came from, anyway; I agree with Yvain that it doesn’t work.
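A minimal sketch of that trade-off, with purely hypothetical utility numbers: complying is the higher-utility act because saving the world swamps everything else, while punishing the blackmailer enters as a separate, much smaller term.

```python
# Illustrative numbers only: complying with the blackmailer beats refusing,
# and punishment is an additional (small) term, not the deciding one.

U_WORLD_SAVED = 1_000_000.0   # hypothetical utility of the world surviving
U_ORPHANS_KILLED = -1_000.0   # hypothetical disutility of the coerced act
U_PUNISH_BLACKMAILER = 10.0   # small bonus for deterring future blackmail

comply = U_WORLD_SAVED + U_ORPHANS_KILLED + U_PUNISH_BLACKMAILER
refuse = 0.0  # the world is destroyed; normalize that outcome to zero

assert comply > refuse  # saving the world dominates either way
```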