And why exactly does this ‘play randomly for 3 moves, then apply material advantage’ give better utility than just applying material advantage?
Plus you got yourself a utility function that is entirely ill-defined in a screwy self-referential way (as the expected utility of a move ultimately depends on the AI itself and its ability to use the resulting state after the move to its advantage). You can talk about it in words, but you didn’t define it other than ‘okay, now it will make the AI indifferent’.
Contrast this with the original, well-defined utility function over future states: the AI may be unable to predict the future states and calculate utility numbers to assign to moves, but it can calculate the utility of a particular end-state of the board, and it can reason from that to strategies. Originally, there’s a simple thing for it to reason about. I can write Python code that looks at a board and tells you the available legal moves, or the win/loss/tie utility if it is an end state (sketched below). That is the definition of chess utility; the AI can take it and reason about it. You instead have some utility that feeds the AI’s own conclusions about the utility of potential moves back into the utility function.
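A minimal sketch of that code, assuming the python-chess library for board handling (the library and the +1/−1/0 scoring convention are my choices here):

```python
import chess  # python-chess, assumed available for board representation


def chess_utility(board: chess.Board):
    """The entire definition of chess utility, from White's perspective.

    For a finished game, return the win/loss/tie value (+1, -1, or 0)
    and no moves. Otherwise return no utility yet, just the legal
    continuations. Nothing here refers to any particular player or AI.
    """
    if board.is_game_over():
        # result() is "1-0", "0-1", or "1/2-1/2" once the game is over
        return {"1-0": 1, "0-1": -1, "1/2-1/2": 0}[board.result()], []
    return None, list(board.legal_moves)
```

For the starting position this returns no utility and the twenty legal opening moves; for a checkmated or drawn board it returns the terminal value and an empty move list. Those two cases are all there is to the original utility.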
And why exactly does this ‘play randomly for 3 moves, then apply material advantage’ give better utility than just applying material advantage?
In this instance, the two won’t differ at all. But if the AI had some preferences outside the chess board, then the indifferent AI would be open to playing any particular move (for the first three turns) in exchange for some other, separate utility gain.
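To illustrate what that indifference means numerically, here is a sketch (the piece values, sample count, and Monte Carlo estimator are my assumptions, added only for illustration):

```python
import random

import chess  # python-chess, assumed available as above

# Conventional piece values: an assumption of this sketch, not from the thread.
PIECE_VALUES = {chess.PAWN: 1, chess.KNIGHT: 3, chess.BISHOP: 3,
                chess.ROOK: 5, chess.QUEEN: 9, chess.KING: 0}


def material_advantage(board: chess.Board) -> int:
    """Material balance from White's perspective."""
    score = 0
    for piece in board.piece_map().values():
        value = PIECE_VALUES[piece.piece_type]
        score += value if piece.color == chess.WHITE else -value
    return score


def indifferent_value(board: chess.Board, plies: int = 3,
                      samples: int = 2000) -> float:
    """Monte Carlo estimate of the expected material advantage after
    `plies` uniformly random plies from `board`.

    The expectation is over random play, so it does not depend on which
    candidate move the AI is contemplating: every legal first move
    receives this same number. That equality is the indifference over
    the first three turns.
    """
    total = 0.0
    for _ in range(samples):
        b = board.copy()
        for _ in range(plies):
            if b.is_game_over():
                break
            b.push(random.choice(list(b.legal_moves)))
        total += material_advantage(b)
    return total / samples
```

Under the plain material criterion, each candidate move would instead be scored by `material_advantage` of the position it produces, which does distinguish moves; the indifferent utility deliberately erases that distinction for the first three turns, leaving them free to be traded against any outside preference.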
Plus you got yourself a utility function that is entirely ill-defined in a screwy self-referential way
In fact, no. It seems like that because of the informal language I used, but the utility function is perfectly well defined without any reference to the AI. The only self-reference is the usual one: how do I predict my own future actions now?
If you mean that an indifferent utility can make these predictions harder or more necessary in some circumstances, then you are correct, but this seems trivial for a superintelligence.
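Spelled out, one way to read ‘play randomly for 3 moves, then apply material advantage’ is (my notation, a sketch of the construction rather than a quote):

$$U(m_1, m_2, m_3, \dots) \;=\; \mathbb{E}_{r_1, r_2, r_3 \sim \mathrm{Uniform}(\text{legal moves})}\big[\, M(s_0\, r_1 r_2 r_3) \,\big]$$

where $s_0$ is the starting position, $s_0\, r_1 r_2 r_3$ is that position after three uniformly random plies, and $M$ is the ordinary material-advantage valuation. The right-hand side mentions only the game tree and $M$, never the AI, and since $m_1, m_2, m_3$ do not appear on the right-hand side at all, every choice of the first three moves receives exactly the same utility.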