Stuart_Armstrong comments on Trapping AIs via utility indifference

Stuart_Armstrong 29 Feb 2012 10:17 UTC
−1 points

What do you think an AI that has read your article would do to avoid being trapped, given that such a trap (and the resulting program termination) would most certainly interfere with its utility function, no matter what it is?

Nothing at all. The trap works even if the AI knows everything there is to know, precisely because once utility indifference has been applied, its behaviour is exactly compatible with its utility function. It behaves “as if” it had utility function U and a false belief, but in reality it has utility function V and true beliefs.
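
To make the "as if" claim concrete, here is a minimal toy sketch. It is not the construction from the article, and everything in it is an assumption for illustration: the two actions, the payoff numbers, the 0.99 trap probability, and the simplification that V just copies the value U assigns to the "trap fails" world. It compares a U-maximiser that falsely believes the trap never works with a V-maximiser that knows the true odds.

```python
# Toy model only: actions, payoffs and probabilities are made-up assumptions.
P_TRAP_WORKS = 0.99  # assumed true probability that the trap triggers

ACTIONS = ["cooperate", "escape"]

# U[(action, trap_works)]: the original utility function (hypothetical numbers).
U_TABLE = {
    ("cooperate", False): 1.0,
    ("cooperate", True): 1.0,
    ("escape", False): 10.0,
    ("escape", True): -100.0,  # escape attempt is caught and the AI is terminated
}


def U(action, trap_works):
    """Original utility: cares a great deal whether the trap works."""
    return U_TABLE[(action, trap_works)]


def V(action, trap_works):
    """Indifferent utility (simplified): in worlds where the trap works,
    assign the value U would have given had the trap failed."""
    return U_TABLE[(action, False)]


def expected(utility, action, p_trap):
    """Expected utility of an action under a belief p_trap about the trap."""
    return p_trap * utility(action, True) + (1 - p_trap) * utility(action, False)


def best_action(utility, p_trap):
    """Action maximising expected utility under the given belief."""
    return max(ACTIONS, key=lambda a: expected(utility, a, p_trap))


choice_u_false_belief = best_action(U, 0.0)           # U, believes trap never works
choice_v_true_belief = best_action(V, P_TRAP_WORKS)   # V, knows the real odds
choice_u_true_belief = best_action(U, P_TRAP_WORKS)   # U, knows the real odds

print(choice_u_false_belief, choice_v_true_belief, choice_u_true_belief)
```

In this toy setting the V-maximiser with accurate beliefs picks the same action as the U-maximiser with the false belief, while a U-maximiser with accurate beliefs would avoid the trap. Knowing about the trap changes nothing for the V agent, since V gives it no reason to care whether the trap works.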