Well, if we ask it to, say, maximize human happiness or “complexity” or virtue or GDP or any of a million other things … BAM the world sucks and we probably can’t fix it.
What if I say “maximize x for just a little while, then talk to me for further instructions”? A human can understand that without difficulty, so for a superintelligent AI it should be easy, right?
I think it depends on what you mean by “a little while”, but it’s quite possible the world would by then contain safeguards against further changes, or simply no longer contain you (or a version of “you” that shares your goals).
(Also, millennia of subjective torture (or whatever) might be a high price for the experiment, even if it got reset.)