asr comments on How many people here agree with Holden? [Actually, who agrees with Holden?]

asr 15 May 2012 5:31 UTC
2 points
The argument breaks down because you are equivocating on what the space is to search over and what the utility function in question is.

Under a given utility function U, “change the utility function to U’ ” won’t generally have positive utility. Self-awareness and pleasure-seeking aren’t some natural properties of optimization processes. They have to be explicitly built in.

Suppose you set a theorem-prover to work looking for a proof of some theorem. It’s searching over the space of proofs. There’s no entry corresponding to “pick a different and easier theorem to prove”, or “stop proving theorems and instead be happy.”
- Manfred 15 May 2012 5:52 UTC
  1 point
  Parent
  The utility function is r(x) (the “r” is for “reward function”). I’m talking about changing x, and leaving r unchanged.
  - asr 15 May 2012 5:56 UTC
    0 points
    Parent
    Yes, I just changed the notation to be more standard. The point remains. There need not be any “x” that corresponds to “pick a new r” or to “pretend x was really x’”. If there was such an x, it wouldn’t in general have high utility.
    - Manfred 15 May 2012 17:06 UTC
      1 point
      Parent
      x is just an input string. So, for example, each x could be a frame coming from a video camera. AIXI then has a reward function r(x), and it maximizes the sum of r(x) over some large number of time steps. In our example, let’s say that if the camera is looking at a happy puppy, r is big, if it’s looking at something else, r is small.
      
      In the lab, AIXI might have to choose between two options (action can be handled by some separate output string, as in Hutter’s paper):
      1) Don’t follow the puppy around.
      2) Follow the puppy around.
      
      Clearly, it will do 2, because r is bigger when it’s looking at a happy puppy, and 2 increases the chance of doing so. One might even say one has a puppy-following robot.
      
      In the real world, there are more options—if you give AIXI access to a printer and some scotch tape, options look like this:
      1) Don’t follow the puppy around.
      2) Follow the puppy around.
      3) Print out a picture of a happy puppy and tape it to the camera.
      
      Clearly, it will do 3, because r is bigger when it’s looking at a happy puppy, and 3 increases the chance of doing so. One might even say one has a happy-puppy-looking-at maximizing robot. This time it’s even true.