if you don’t have a VNM utility function, you risk being mugged by wandering Bayesians
I don’t see why this is true. While “VNM utility function ⇒ safe from wandering Bayesians” holds, it’s not clear to me that “no VNM utility function ⇒ vulnerable to wandering Bayesians” does. I think the vulnerability to wandering Bayesians comes from failing to satisfy Transitivity rather than failing to satisfy Completeness. I have not done the math on that.
But the general point, about approximation, I like. Utility functions in game theory (decision theory?) problems normally involve only a small space. I think completeness is an entirely safe assumption when talking about humans deciding which route to take to their destination, or what bets to make in a specified game. My question comes from the use of VNM utility in AI papers like this one: http://intelligence.org/files/FormalizingConvergentGoals.pdf, where agents have a utility function over possible states of the universe (with the restriction that the space is finite).
Is the assumption that an AGI reasoning about universe-states has a utility function an example of reasonable use, for you?
Your intuition that transitivity is the key requirement is a good one. Completeness is more of a model foundation; we need completeness in order to even have preferences which can be transitive in the first place. A failure of completeness would mean that there “aren’t preferences” in some region of world-space. In practice, that’s probably a failure of the model—if the real system is offered a choice, it’s going to do something, even if that something amounts to really weird implied preferences.
So when I talk about Dr Malicious pushing us into a region without ordered preferences, that’s what I’m talking about. Even if our model contains no preferences in some region, we’re still going to have some actual behavior in that region. Unless that behavior implies ordered preferences, it’s going to be exploitable.
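To make that exploitability concrete, here’s a toy sketch (my own illustration, not anything from the post) of the classic money pump: an agent with cyclic preferences A > B > C > A happily accepts every trade Dr Malicious offers, paying a small fee each time, and ends up strictly worse off. The item names, fee, and trade count are all made up for the example.

```python
# Toy money pump against an agent with cyclic (intransitive) preferences
# A > B > C > A. All specifics here are illustrative assumptions.

def beats(x, y):
    """True if the agent strictly prefers item x to item y."""
    cycle = {"A": "B", "B": "C", "C": "A"}  # each key beats its value
    return cycle[x] == y

WHAT_BEATS = {"A": "C", "B": "A", "C": "B"}  # inverse of the cycle

money = 1000  # agent's wealth, in cents
have = "A"
fee = 1       # Dr Malicious's price per trade, in cents

for _ in range(300):
    offer = WHAT_BEATS[have]            # offer the item the agent prefers
    if beats(offer, have) and money >= fee:
        have = offer                    # agent trades "up"...
        money -= fee                    # ...and pays a small fee each time

# The agent never refuses a single trade, yet after 300 trades it is
# holding "A" again -- right back where it started -- and 300 cents poorer.
```

The point is just that each individual trade looks like an improvement to the agent, so nothing in its local behavior flags the problem; the loss only shows up globally, around the cycle.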
As for AIs reasoning about universe-states...
First, remember that there’s no rule saying that the utility must depend on all of the state variables. I don’t care about the exact position of every molecule in my ice cream, and that’s fine. Your universe can be defined by an infinite-dimensional state vector, and your AI can be indifferent to all but the first five variables. That’s fine.
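As a quick made-up illustration of that point: a utility function over a huge state vector that only reads the first five variables is still a perfectly good utility function, and it is automatically indifferent to everything else.

```python
# Illustrative sketch: a utility function over a large state vector that
# depends only on the first five variables. The weights are arbitrary.

def utility(state):
    weights = [2.0, -1.0, 0.5, 3.0, 1.0]  # only five variables matter
    return sum(w * s for w, s in zip(weights, state[:5]))

big_state = [1.0] * 1_000_000  # a million state variables

# Scrambling everything past the first five variables changes nothing:
scrambled = big_state[:5] + [42.0] * 999_995
assert utility(big_state) == utility(scrambled)
```

So “utility over universe-states” doesn’t commit the AI to caring about every molecule; indifference to most of the state is built in for free.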
Other than that, the above comments on completeness still apply. Faced with a choice, the AI is going to do something. Unless its behavior implies ordered preferences, it’s going to be exploitable, at least when faced with those kinds of choices. And as long as that exploitability is there, Dr Malicious will have an incentive to push the AI into the region where completeness fails. But if the AI has ordered preferences in all scenarios, Dr Malicious won’t have any reason to develop peach-ice-cream-destroying nanobots, and we probably just won’t need to worry about it.