Shmi comments on Trapping AIs via utility indifference

Shmi 28 Feb 2012 20:55 UTC
0 points
0
What do you think an AI that has read your article would do to avoid being trapped, given that such a trap (and the resulting program termination) would most certainly interfere with its utility function, no matter what it is?

In other words, are you sure that utility indifference is possible for a fully invested non-linear utility function, and not just for the first approximation an AI holds before it figures out that self-preservation is absolutely essential to maximizing any given utility?
- moridinamael 29 Feb 2012 14:24 UTC
  0 points
  0
  If you learned that the only reason you love your parents is gross biological manipulation, would you react by ceasing to love your parents?
  - daenerys 29 Feb 2012 16:02 UTC
    3 points
    0
    
    If you learned that the only reason you love your parents is gross biological manipulation, would you react by ceasing to love your parents?
    
    If you realize that loving your parents causes you net disutility, and you have the ability to self-hack or change your code, then... yes.
  - Shmi 1 Mar 2012 5:13 UTC
    0 points
    0
    Children of abusers and narcissists put quite a lot of effort into doing just that.
- Stuart_Armstrong 29 Feb 2012 10:17 UTC
  −1 points
  0
  
  What do you think an AI that has read your article would do to avoid being trapped, given that such a trap (and the resulting program termination) would most certainly interfere with its utility function, no matter what it is?
  
  Nothing at all. The trap works even if the AI knows everything there is to know, precisely because after utility indifference, its behaviour is exactly compatible with its utility function. It behaves “as if” it had utility function U and a false belief, but in reality it has utility function V and true beliefs.
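
The "as if" claim above can be illustrated with a toy sketch. This is not the construction from the article, only a simplified stand-in under stated assumptions: the two actions, their probabilities, and the payoffs are all hypothetical, and V is built by the simplest possible compensation (if the trap fires, V pays whatever U would have paid had it not fired). Under those assumptions, an agent maximizing V with true beliefs ranks actions exactly like an agent maximizing U who falsely believes the trap never fires.

```python
# Toy illustration only: actions, probabilities and payoffs are hypothetical.
# Each action maps to (P(trap fires), U if the trap fires, U if it does not).
ACTIONS = {
    "work_on_task": (0.5, 0.0, 10.0),
    "disable_trap": (0.0, 0.0, 8.0),
}


def expected_u_true_beliefs(action):
    """Expected U for an agent that knows the real trap probability."""
    p_fire, u_fired, u_safe = ACTIONS[action]
    return p_fire * u_fired + (1 - p_fire) * u_safe


def expected_u_false_belief(action):
    """Expected U for an agent that wrongly believes the trap never fires."""
    _, _, u_safe = ACTIONS[action]
    return u_safe


def expected_v_true_beliefs(action):
    """Expected V for an agent with true beliefs.

    V equals U when the trap does not fire. When it does fire, V pays what
    U would have paid in the no-trap case (an assumed, simplified form of
    the compensation that makes the agent indifferent to the trap).
    """
    p_fire, _, u_safe = ACTIONS[action]
    v_fired = u_safe
    v_safe = u_safe
    return p_fire * v_fired + (1 - p_fire) * v_safe


def best_action(score):
    """Return the action with the highest score."""
    return max(ACTIONS, key=score)


if __name__ == "__main__":
    print("U, true beliefs: ", best_action(expected_u_true_beliefs))
    print("U, false belief: ", best_action(expected_u_false_belief))
    print("V, true beliefs: ", best_action(expected_v_true_beliefs))
```

In this toy setup the plain U-maximizer with true beliefs prefers to disable the trap, while the V-maximizer with the same true beliefs leaves it alone, matching the choice of the U-maximizer with the false belief. Knowing how the trap works changes nothing for the V-agent, because the trap probability drops out of its expected utility.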