“The AI does things that I personally approve of” — as an alignment target, for anyone and whatever their values — is actually easier to hit than one might think.
It doesn’t require ethics to be solved; it can be achieved by engineering your approval.
It might be impossible for you to tell which of these two post-ASI worlds you find yourself in.