Just to distance this very interesting question from expected utility maximization: “beliefs” sound like they are about couldness, and values about shouldness. Couldness is about the behavior of the environment outside the agent, and shouldness is about the behavior of the agent. Of course, the two only really exist in interaction, but as systems they can be conceptualized separately. When an agent asks what it could do, the question is really about what effects in the environment could be achieved (some Tarskian hypocrisy here: using “could” to explain “couldness”). Beliefs are what’s assumed, and values are what’s asserted. In a decision tree, beliefs are associated with knowledge about other agents’ possible actions, and values with the choice of the present agent’s action. Both are aspects of the system, but they play different roles in the interaction: making a choice versus accepting a choice. Naturally, there is a duality here when the sides are exchanged: my values become your beliefs, and my beliefs become your values. The choice of representation is not that interesting, as it’s all interpretation: nothing changes in behavior.
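To make the representation point concrete, here is a minimal sketch (a toy illustration, not the construction from the post mentioned below; the action names, outcomes, and numbers are all made up): in expected-utility terms, only the product of the probability factor and the utility factor matters for which action gets chosen, so weight can be shifted freely between the “belief” side and the “value” side without changing behavior.

```python
# Toy sketch: the expected-utility choice depends only on the product
# p(outcome | action) * u(outcome), so moving an arbitrary positive factor
# f(outcome) from the utility side to the belief side leaves the chosen
# action unchanged. All values are illustrative.

outcomes = ["win", "lose"]
beliefs = {
    "risky": {"win": 0.3, "lose": 0.7},   # p(outcome | action)
    "safe":  {"win": 0.1, "lose": 0.9},
}
utility = {"win": 10.0, "lose": 1.0}       # u(outcome)

def best_action(p, u):
    # Pick the action maximizing sum over outcomes of p * u.
    return max(p, key=lambda a: sum(p[a][o] * u[o] for o in outcomes))

# Reweight: shift a factor f(o) from the "value" side to the "belief" side.
f = {"win": 5.0, "lose": 0.5}
beliefs2 = {a: {o: beliefs[a][o] * f[o] for o in outcomes} for a in beliefs}
utility2 = {o: utility[o] / f[o] for o in outcomes}

# The per-outcome products are identical, so the decision is identical.
assert best_action(beliefs, utility) == best_action(beliefs2, utility2)
```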
I gave an example where the choice of representation is important: Eliezer’s CEV. If the choice of representation shouldn’t be important, then that seems to be an argument against CEV.
Bullet acknowledged and bitten. A Friendly AI attempting to identify humanity’s supposed CEV will also have to be a politician and maintain enough support that people don’t shut it down. As a politician, it will have to appeal to people with the standard biases. So it’s not enough for it to say, “okay, here’s something all of you should agree on as a value, and you would benefit from me moving humanity to that state”.
And in figuring out what would appeal to humans, it will have to model the same biases that blur the belief/value distinction.
I was referring to you referring to my post on playing with utility/prior representations.