timtyler comments on The Human’s Hidden Utility Function (Maybe)

timtyler 25 Jan 2012 15:30 UTC
1 point

> I’ve taken a look at the paper. If “outcomes” are things like “chose A”, “chose B” or “chose C”, the above mind is simply not an O-maximizer: consider a world with observations “I can choose between A and B/B and C/C and A” (equally likely, independent of any past actions or observations) and actions “take the first offered option” or “take the second offered option” (played for one round, for simplicity, but the argument works fine with multiple rounds); there is no definition of U that yields the described behaviour.

What?!? You haven’t clearly specified the behaviour of the machine. If you are invoking an uncomputable random number generator to produce an “equally likely” result then you have an uncomputable agent. However, there’s no such thing as an uncomputable random number generator in the real world. So: how is this decision actually being made?

> I’m aware that the paper asserts that “any agents [sic] can be written in O-maximizer form”, but note that the paper may simply be wrong. It’s clearly an unfinished draft, and no argument or proof is given.

It applies to any computable agent. That is, to any agent at all, assuming that the Church–Turing–Deutsch principle is true.

The argument given is pretty trivial. If you doubt the result, check it—and you should be able to see if it is correct or not fairly easily.
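
One way such a construction might go is sketched below. This is an assumption about the shape of the argument, not text from the paper: given any computable policy, define U over whole interaction histories to be 1 when every action in the history is what the policy would have chosen, and 0 otherwise. The names `make_utility`, `o_maximizer_action` and `example_policy` are hypothetical, and the maximizer here is a one-step, deterministic simplification rather than a full expected-utility agent.

```python
from typing import Callable, List, Sequence, Tuple

Observation = int
Action = str
# A history is the sequence of (observation, action) pairs seen so far.
History = Tuple[Tuple[Observation, Action], ...]
Policy = Callable[[Sequence[Observation]], Action]


def make_utility(policy: Policy) -> Callable[[History], int]:
    """Build U(history) = 1 if every action matches `policy`, else 0."""
    def utility(history: History) -> int:
        observations: List[Observation] = []
        for obs, act in history:
            observations.append(obs)
            if act != policy(observations):
                return 0
        return 1
    return utility


def o_maximizer_action(utility, history: History, observation: Observation,
                       actions: Sequence[Action]) -> Action:
    """Pick the action whose extended history has the highest utility."""
    return max(actions, key=lambda a: utility(history + ((observation, a),)))


def example_policy(observations: Sequence[Observation]) -> Action:
    """Hypothetical policy: alternate between the two offered options."""
    return "first" if len(observations) % 2 == 1 else "second"
```

Under these assumptions, stepping `o_maximizer_action` through any observation sequence with `make_utility(example_policy)` returns the same actions as `example_policy` itself. The cost is that the policy is written into U, which is what the replies below go on to dispute.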
- JoachimSchipper 25 Jan 2012 16:57 UTC
  0 points
  The world is as follows: each observation x_i is one of “the mind can choose between A and B”, “the mind can choose between B and C” or “the mind can choose between C and A” (conveniently encoded as 1, 2 and 3). Independently of any past observations (x_1 and the like) and actions (y_1 and the like), each of these three options is equally likely. This fully specifies a possible world, no?
  
  The mind, then, is as follows: if the last observation is 1 (“A and B”), output “A”; if the last observation is 2 (“B and C”), output “B”; if the last observation is 3 (“C and A”), output “C”. This fully specifies a possible (deterministic, computable) decision procedure, no? (1)
  
  I argue that there is no assignment to U(“A”), U(“B”) and U(“C”) that causes an O-maximizer to produce the same output as the algorithm above. Conversely, there are assignments to U(“1A”), U(“1B”), …, U(“3C”) that cause the O-maximizer to output the same decisions as the above algorithm, but then we have encoded our decision algorithm into the U function used by the O-maximizer (which has its own issues, see my previous post.)
  
  (1) Actually, the definition requires the mind to output something before receiving input. That is a technical detail that can be safely ignored; alternatively, just always output “A” before receiving input.
  - timtyler 25 Jan 2012 18:13 UTC
    3 points
    
    > I argue that there is no assignment to U(“A”), U(“B”) and U(“C”) that causes an O-maximizer to produce the same output as the algorithm above.
    
    ...but the domain of a utility function surely includes sensory inputs and remembered past experiences (the state of the agent). You are trying to assign utilities to outputs.
    
    If you try and do that you can’t even encode absolutely elementary preferences with a utility function—such as: I’ve just eaten a peanut butter sandwich, so I would prefer a jam one next.
    
    If that is the only type of utility function you are considering, it is no surprise that you can’t get the theory to work.
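
Both sides of this exchange can be checked mechanically. The sketch below is illustrative only and uses hypothetical names (`OFFERS`, `MIND`, `explains_mind`, `u_pair`, `u_sandwich`). It brute-forces every ranking of the three outputs to confirm that none reproduces the mind described above when a strict preference is required (that is, without leaning on a tie-breaking rule), then shows that a utility whose domain includes the observation, or the remembered last meal, does reproduce the behaviour.

```python
from itertools import product

# Options offered under each observation, and what the mind outputs.
OFFERS = {1: ("A", "B"), 2: ("B", "C"), 3: ("C", "A")}
MIND = {1: "A", 2: "B", 3: "C"}


def explains_mind(u):
    """True if strictly maximizing u over the offered outputs matches MIND."""
    for obs, (x, y) in OFFERS.items():
        chosen = MIND[obs]
        other = y if chosen == x else x
        if not u[chosen] > u[other]:
            return False
    return True


# Only the ordering of the three values matters, so the values 0..2
# cover every possible ranking of A, B and C, ties included.
output_only = [dict(zip("ABC", vals)) for vals in product(range(3), repeat=3)]
any_output_utility_works = any(explains_mind(u) for u in output_only)


def u_pair(obs, out):
    """Utility over (observation, output) pairs: the U("1A") ... U("3C") case."""
    return 1 if MIND[obs] == out else 0


pair_choices = {
    obs: max(options, key=lambda out: u_pair(obs, out))
    for obs, options in OFFERS.items()
}


def u_sandwich(last_eaten, next_choice):
    """Utility that depends on remembered state: prefer a change of filling."""
    return 1 if next_choice != last_eaten else 0


next_sandwich = max(["peanut butter", "jam"],
                    key=lambda s: u_sandwich("peanut butter", s))
```

Here `any_output_utility_works` comes out false because the mind needs A over B, B over C and C over A at once, while `pair_choices` matches `MIND` and `next_sandwich` is the jam one. Whether the second kind of U amounts to encoding the decision algorithm in the utility function is the point still in dispute.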