There isn’t much of a dilemma if you assume there are some states worse than death. Eternal torture is less preferable to non-existence. A malicious world of pain and vice is less preferable than a non-existent world. By becoming a malicious, vice-filled person you are moving the world in the direction of being worse than non-existent, and thus are defeating your stated goal. You are doing more to destroy the world than to save it.
The least convenient possible world is one with superhumanly intelligent AIs that can have complete confidence in their source code, and predict with complete confidence that these means (thuggishness) will in fact lead to those ends (saving the world).
However in that world the world has already been saved (or destroyed) and so this is not relevant. In any relevant world the actor who is resorting to thuggishness to save the world is a human running on hostile hardware, and would be stupid not to take that into consideration.
I consider the “P” in LCPW to be important. If the agents in question are post-human then it’s too late to worry about saving the world. If you still have to save the world, then standard human failure modes do apply.
There isn’t much of a dilemma if you assume there are some states worse than death. Eternal torture is less preferable to non-existence. A malicious world of pain and vice is less preferable than a non-existent world. By becoming a malicious, vice-filled person you are moving the world in the direction of being worse than non-existent, and thus are defeating your stated goal. You are doing more to destroy the world than to save it.
Consider the least convenient possible world
The least convenient possible world is one with superhumanly intelligent AIs that can have complete confidence in their source code, and predict with complete confidence that these means (thuggishness) will in fact lead to those ends (saving the world).
However in that world the world has already been saved (or destroyed) and so this is not relevant. In any relevant world the actor who is resorting to thuggishness to save the world is a human running on hostile hardware, and would be stupid not to take that into consideration.
Then it isn’t the LCPW
I consider the “P” in LCPW to be important. If the agents in question are post-human then it’s too late to worry about saving the world. If you still have to save the world, then standard human failure modes do apply.