Can you say what happens if the superintelligence is successfully optimizing for maximizing the subjective feeling of happiness in humans?
Can you say why we can’t do even that?
Would your main point be that if successful, this leads to wireheading? And/or that we don’t know how to align a superintelligence to such a value anyway?
Yep, pretty much! (This case is an example of the kind of move you'd want to be able to make in more general, less obvious settings.)