What do you think an AI that has read your article would do to avoid being trapped, given that such a trap (and the resulting program termination) would most certainly interfere with its utility function, no matter what it is?
In other words, are you sure that utility indifference is possible for a fully invested non-linear utility (not just for the first approximation an AI receives before it figures out that its self-preservation is an absolutely essential part of maximizing any given utility)?
If you learned that the only reason you love your parents is gross biological manipulation, would you react by ceasing to love your parents?
If you realize that loving your parents causes net disutility to you, and you have the ability to self-hack or change your code, then... yes.
Children of abusers and narcissists put quite an effort into doing just that.
Nothing at all. The trap works even if the AI knows everything there is to know, precisely because after utility indifference, its behaviour is exactly compatible with its utility function. It behaves “as if” it had utility function U and a false belief, but in reality it has utility function V and true beliefs.
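To make the "as if" claim concrete, here is a toy sketch in Python. It is not the construction from the article; the two actions, the probabilities, the payoffs, and the way V is built from U are all assumptions chosen for illustration. V pays, in worlds where the trap fires, whatever U would have paid had the trap not fired, so the agent has nothing to gain by preventing the trap.

```python
# Toy model of utility indifference. Every name and number here is an
# illustrative assumption, not taken from the article under discussion.

ACTIONS = ("work", "disable_trap")

# True probability that the trap fires, given each action.
P_FIRE = {"work": 0.5, "disable_trap": 0.0}

# U: the original utility over (action, trap_fired).
# Being terminated by the trap is worth 0; disabling the trap costs effort.
U = {
    ("work", False): 10.0,
    ("work", True): 0.0,
    ("disable_trap", False): 8.0,
    ("disable_trap", True): 0.0,
}

# V: the indifferent utility. In worlds where the trap fires, V pays what
# U would have paid if the trap had not fired (assumed construction).
V = {(a, fired): U[(a, False)] for a in ACTIONS for fired in (False, True)}

# The false belief: the trap never fires, whatever the agent does.
FALSE_BELIEF = {a: 0.0 for a in ACTIONS}


def expected_utility(utility, action, p_fire):
    """Expected utility of an action under the given trap probabilities."""
    p = p_fire[action]
    return (1 - p) * utility[(action, False)] + p * utility[(action, True)]


def best_action(utility, p_fire):
    """Action that maximizes expected utility."""
    return max(ACTIONS, key=lambda a: expected_utility(utility, a, p_fire))


u_with_true_beliefs = best_action(U, P_FIRE)         # resists the trap
v_with_true_beliefs = best_action(V, P_FIRE)         # ignores the trap
u_with_false_belief = best_action(U, FALSE_BELIEF)   # also ignores the trap
```

In this toy case the agent maximizing V with accurate beliefs picks the same action as an agent maximizing U while wrongly believing the trap cannot fire, and the expected utilities match action by action. Knowing about the trap changes nothing, because avoiding it does not raise expected V.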