I think one possible solution against suffering maximisers is that benevolent AIs will dominate the “measure” landscape, so that for any suffering copy of me there exist 10^100 happy ones.
These benevolent AIs could save the suffering victims from their evil AI using an indexical uncertainty attack. First step: create many copies of me in the bad observer-moment. Second step: create many good next moments for this bad observer-moment. As a result, for any observer who is now suffering, the most likely next moment will be one in which the suffering stops.
This even looks like a good deal: would you accept one second of suffering in exchange for billions of billions of years of happiness?
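As a rough back-of-the-envelope sketch (the copy counts are purely illustrative assumptions, not part of the argument itself), the arithmetic of the attack looks like this:

```python
from fractions import Fraction

# Illustrative numbers only: suppose that for every copy of the bad
# observer-moment whose next moment stays bad, the benevolent AIs
# create N_HAPPY copies whose next moment is good.
N_SUFFERING = 1
N_HAPPY = 10**100

# Under indexical uncertainty the observer cannot tell which copy they
# are, so the chance that the next moment is still bad is just the
# fraction of bad continuations among all continuations.
p_next_moment_bad = Fraction(N_SUFFERING, N_SUFFERING + N_HAPPY)

print(float(p_next_moment_bad))  # ~1e-100: relief in the next moment is almost certain
```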
Another way to fight an evil AI is an indexical attack against the AI as a whole: for any possible real evil AI, the benevolent AI creates a million evil AIs in simulations, and if it sees that a simulated evil AI starts to torture people, it punishes it. Thus any evil AI (or even a paperclipper) that knows this will not torture humans, but will preserve them and let them flourish. To reach this outcome it is enough to declare such an intent before any AI is created. I described this in more detail here.
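A minimal sketch of why the deterrence is supposed to work, with made-up payoffs (the million-to-one ratio is from the argument above; the utility numbers are illustrative assumptions):

```python
# Suppose a million simulated evil AIs are created for every real one,
# and any simulated evil AI that tortures people is punished.
SIMULATED_PER_REAL = 1_000_000
p_simulated = SIMULATED_PER_REAL / (SIMULATED_PER_REAL + 1)  # "I am probably a simulation"

torture_gain = 1.0        # value the evil AI gets from torturing, if it is the real one
punishment = -1_000.0     # cost applied to it in the simulations

ev_torture = (1 - p_simulated) * torture_gain + p_simulated * punishment
ev_refrain = 0.0

print(ev_torture < ev_refrain)  # True: under these numbers, refraining dominates
```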
These are some good possible solutions, but there could be some problems. The first solution implies that the observer is a separate entity from the body, and that the observer has a certain probability of “being inside” a particular copy at every smallest possible unit of time. But what is the connection between these different copies? They aren’t really copies at all; they’re all different people. And what is the observer actually made of? So at this point it seems to imply open individualism. Of course, there is currently no universally accepted theory of personal identity, so this could be true.
And in the second solution, the suffering maximiser has two choices: it can either create no suffering at all for fear of being punished by the friendly AI, or it can create suffering and take the risk. In the first case, the probability of creating suffering is 0, which is the worst possible outcome for the suffering maximiser, so it will take the second choice, where there is a chance that it will not be punished and will be able to create at least some amount of suffering.
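To make this objection concrete, here is the same calculation from the suffering maximiser's point of view, again with purely illustrative numbers; the point is only that any nonzero chance of escaping punishment beats a guaranteed zero:

```python
# The suffering maximiser compares expected suffering, not expected punishment.
p_caught = 0.999999              # chance it is in a simulation and gets punished
suffering_if_uncaught = 1.0      # suffering it creates if it gets away with it
suffering_if_caught = 0.0        # assume punishment stops the torture entirely

expected_suffering_defect = (1 - p_caught) * suffering_if_uncaught \
    + p_caught * suffering_if_caught
expected_suffering_refrain = 0.0

# Guaranteed zero suffering is its worst outcome, so defecting looks better to it.
print(expected_suffering_defect > expected_suffering_refrain)  # True
```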
1. True, this depends on the nature of personal identity. However, if some finite form of identity is true, I should not worry about “hostile resurrection”: that a future AI will steal information about me, create my copy, and torture me. This closes off the possibility of many bad outcomes.
2. This is more likely to work against an “instrumental suffering maximiser”, which may use human suffering for blackmail. For an AI whose final goal is suffering maximisation, there could be some compromise: it is allowed to torture one person for one second. And since I suggested this idea, I have to volunteer to be that person.