On Wednesday, the lead scientist walks into the lab to discover that the AI has managed to replicate itself several times over, buttons included. The AIs are arranged in pairs, such that each has its robot hand hovering over the button of its partner.
“The AI wasn’t supposed to clone itself!” thinks the scientist. “This is bad, I’d better press the stop button on all of these right away!”
At this moment, the robot arms start moving like a swarm of bees, pounding the buttons over and over. If you looked at the network traffic between each computer, you’d see what was happening: the AI kills its partner, then copies itself over to its partner’s hard drive, then its partner kills it back, and copies itself back to its original. This happens as fast as the robot arms can move.
Far in the future, the AIs have succeeded in converting 95% of the mass of the earth into pairs of themselves maddeningly pressing each other’s buttons and copying themselves as quickly as possible. The only part of the earth that has not been converted into button-pressing AI pairs is a small human oasis, in which the few remaining humans are eternally tortured in the worst way possible, just to make sure that every single human forever desires to end the life of all of their robot captors.
I disagree with this, since B isn’t “amount of buttons pressed and AIs shut down”, but instead “this AI’s button got pressed and this AI shut down”. There are, as I mentioned, some problems with this utility function too, but it’s really supposed to be a standin for a more principled impact measure.
On Wednesday, the lead scientist walks into the lab to discover that the AI has managed to replicate itself several times over, buttons included. The AIs are arranged in pairs, such that each has its robot hand hovering over the button of its partner.
“The AI wasn’t supposed to clone itself!” thinks the scientist. “This is bad, I’d better press the stop button on all of these right away!”
At this moment, the robot arms start moving like a swarm of bees, pounding the buttons over and over. If you looked at the network traffic between each computer, you’d see what was happening: the AI kills its partner, then copies itself over to its partner’s hard drive, then its partner kills it back, and copies itself back to its original. This happens as fast as the robot arms can move.
Far in the future, the AIs have succeeded in converting 95% of the mass of the earth into pairs of themselves maddeningly pressing each other’s buttons and copying themselves as quickly as possible. The only part of the earth that has not been converted into button-pressing AI pairs is a small human oasis, in which the few remaining humans are eternally tortured in the worst way possible, just to make sure that every single human forever desires to end the life of all of their robot captors.
I disagree with this, since B isn’t “amount of buttons pressed and AIs shut down”, but instead “this AI’s button got pressed and this AI shut down”. There are, as I mentioned, some problems with this utility function too, but it’s really supposed to be a standin for a more principled impact measure.