The VNM theorem is best understood as an operator: it takes a function Comp(Wi,Wj) → {Wi<Wj, Wi>Wj, Wi∼Wj} that obeys the axioms and rewrites it in the form Comp(Wi,Wj) = Compare(E[U(Wi)], E[U(Wj)]), where U is the resulting “utility function” producing a real number. That is, it rewrites your comparison function into one that compares “expected utilities”.
To apply this to something in the real world, a human or an AI, one must decide exactly what Comp refers to and how (>,<,∼) are interpreted.
- We can interpret Comp as the actual revealed choices of the agent. I.e., when put in a position to take action to cause either Wi or Wj to happen, what do they do? If the agent’s thinking doesn’t terminate (within the allotted time), or it chooses randomly, we can interpret that as ∼. The possibilities are fully enumerated, so completeness holds. However, you will find that any real agent fails to obey some of the other axioms.
- We can interpret Comp as the expressed preferences of the agent. That is to say, present the hypothetical and ask what the agent prefers. Then we say that Wi<Wj if the agent says they prefer Wj; we say that Wi>Wj if the agent says they prefer Wi; and we say that Wi∼Wj if the agent says they are equal or can’t decide (within the allotted time). Again completeness holds, but you will again always find that some of the other axioms fail.
- In the case of humans, we can interpret Comp as some extrapolated volition of a particular human. In which case we say that Wi<Wj if the person would choose Wj if only they thought faster, knew more, were smarter, were more the person they wished they would be, etc. One might fancifully describe this as defining Comp as the person’s “true preferences”. This is not a practical interpretation, since we don’t know how to compute extrapolated volition in the general case. But it’s perfectly mathematically valid, and it’s not hard to see how it could be defined so that completeness holds. It’s plausible that the other axioms could hold too—most people consider the rationality axioms generally desirable to conform to, so “more the person they wished they would be” plausibly points in a direction that results in such rationality.
- For some AIs whose source code we have access to, we might be able to just read the source code and define Comp using the actual code that computes preferences.
There are a lot of variables here. One could interpret the domain of Comp as being a restricted set of lotteries. This is the likely interpretation in something like a psychology experiment where we are constrained to only asking about different flavours of ice cream or something. In that case the resulting utility function will only be valid in this particular restricted domain.
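For the deterministic, finite case (ignoring lotteries, as the discussion below does), the rewrite can be sketched concretely: given a complete, transitive Comp over finitely many world-states, sorting by the comparator yields a U whose numeric comparisons reproduce Comp. A minimal Python sketch, with a made-up toy preference order (the states and ranks here are purely illustrative):

```python
from functools import cmp_to_key

# Toy world-states with a complete, transitive preference order.
# Higher rank = more preferred; "sun" and "mild_sun" are indifferent.
PREFERENCE_RANK = {"rain": 0, "clouds": 1, "sun": 2, "mild_sun": 2}

def comp(a, b):
    """Returns -1 if a < b (b preferred), 1 if a > b, 0 if a ~ b."""
    ra, rb = PREFERENCE_RANK[a], PREFERENCE_RANK[b]
    return (ra > rb) - (ra < rb)

def utility_from_comp(states, comp):
    """Assign each state a real number U so that comparing U(a) with U(b)
    reproduces comp(a, b). Only valid if comp is complete and transitive."""
    ordered = sorted(states, key=cmp_to_key(comp))
    utility, rank = {}, 0
    for i, s in enumerate(ordered):
        if i > 0 and comp(ordered[i - 1], s) != 0:
            rank += 1              # strictly preferred: bump the utility level
        utility[s] = float(rank)   # indifferent states share a value
    return utility

U = utility_from_comp(["sun", "rain", "mild_sun", "clouds"], comp)
assert U["rain"] < U["clouds"] < U["sun"] == U["mild_sun"]
```

This only recovers an ordinal U; the full VNM construction over lotteries is what pins U down to affine transformations.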
I was going to say the same thing as the first bullet point here—you can interpret the preference ordering as “If you were to give the agent two buttons that could cause world state 1 and world state 2 respectively, which would it choose?” (Indifference could be modeled as a third button which chooses randomly.) This gives you a definition of the full preference ordering which is complete by construction.
In practice, you only need utilities over the world states you actually have to decide between, but I think the VNM theorem applies in the same way when restricted to the world states you actually care about.
Thanks for this response. On notation: I want world-states, Wi, to be specific outcomes rather than random variables. As such, U(Wi) is a real number, and the expectation of a real number could only be defined as itself: E[U(Wi)]=U(Wi) in all cases. I left aside all the discussion of ‘lotteries’ in the VNM Wikipedia article, though maybe I ought not have done so.
I think your first two bullet points are wrong. We can’t reasonably interpret ~ as ‘the agent’s thinking doesn’t terminate’. ~ refers to indifference between two options, so if A>B>C and P ~ B, then A>P>C. Equating ‘unable to decide between two options’ and ‘two options are equally preferable’ will lead to a contradiction or a trivial case when combined with transitivity. I can cook up something more explicit if you’d like?
There’s a similar problem with ~ meaning ‘the agent chooses randomly’, provided the random choice isn’t prompted by equality of preferences.
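A concrete version of the problem, as a sketch (the agent and options here are hypothetical): suppose the agent strictly prefers A to C, but its thinking times out whenever the hard-to-evaluate option P is involved. Reading the timeouts as indifference gives A ∼ P and P ∼ C, and transitivity of ∼ would then force A ∼ C, contradicting A > C.

```python
# Hypothetical agent: strictly prefers A to C, but "can't decide"
# (times out) whenever option P is involved.
def comp(a, b):
    """Returns 1 if a > b, -1 if a < b, 0 under the disputed reading of ~."""
    if {a, b} == {"A", "C"}:
        return {"A": 1, "C": -1}[a]   # genuine strict preference: A > C
    if "P" in (a, b):
        return 0                      # timeout, read as indifference
    return 0

# Under that reading: A ~ P and P ~ C ...
assert comp("A", "P") == 0 and comp("P", "C") == 0
# ... so transitivity of ~ would force A ~ C, yet the agent reports A > C.
assert comp("A", "C") == 1
```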
This comment has sharpened my thinking, and it would be good for me to directly prove my claims above—will edit if I get there.