Computational efficiency reasons not to model VNM-rational preference relations with utility functions

The VNM theorem is constructive: given a rational preference relation, it is easy to construct a corresponding utility function. But in many cases, it may be computationally intractable to choose between two lotteries by computing each of their expected utilities and picking the larger one, even when it is easy to decide between them in other ways.

Consider a human deciding whether or not to buy a sandwich. There are several factors that go into this decision, like the price of substitutes, the human’s future income, and how tasty the sandwich will be. But these factors are still fairly limited; the human does not consider, for instance, the number of third cousins they have, the outcome of a local election in Siberia, or whether there is intelligent life elsewhere in the galaxy. The predictable effects of buying a sandwich are local in nature, and the human’s preferences also have an at least somewhat local character, so they need not think too much about too many details of the world in order to make their decision.

Compare this with how difficult it would be for the human to compute the expected utility of the world if they buy the sandwich, and the expected utility of the world if they don’t, each to enough precision that they can tell which is bigger. The way to construct a utility function corresponding to a given rational preference relation is to pick two lotteries as reference points, declare that the preferred reference point has utility 1 and the dispreferred one has utility 0, and assign any other lottery the utility p such that the agent is indifferent between that lottery and a mixture giving the better reference point with probability p and the worse one with probability 1 − p. But the reference points could differ from the lotteries resulting from buying a sandwich and from not buying a sandwich in many ways; perhaps they differ from the real world in the number of third cousins the human has, the outcome of a local election in Siberia, and the presence of intelligent life elsewhere in the galaxy.

In order to compute the expected utility of buying a sandwich and of not buying a sandwich, both to high enough precision that they can tell which is bigger, the human must consider how all of these factors differ from the reference points, and decide how much they care about each of them. Doing this would require heroic feats of computation, introspection, and philosophical progress, none of which are really needed for the simple decision of whether or not to buy a sandwich. The human might try picking realistic reference points, but if they learn more about the world after picking them, systematic differences between the reference points and the available options can creep in anyway.
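To make the cost concrete, here is a minimal sketch (in Python, with hypothetical names such as `prefers`, `mix`, `utility`, and `Lottery` that do not come from anything above) of the standard construction: the utility of a lottery is the indifference probability against mixtures of the two reference points, found here by binary search over that probability. The point is that every query it makes is about mixtures of the reference points, rather than a direct comparison of the options under consideration.

```python
from typing import Callable

# Illustrative placeholder: in a real implementation, a Lottery would be a
# probability distribution over outcomes.
Lottery = object

def utility(
    lottery: Lottery,
    ref_best: Lottery,   # reference point declared to have utility 1
    ref_worst: Lottery,  # reference point declared to have utility 0
    prefers: Callable[[Lottery, Lottery], bool],        # True iff x is strictly preferred to y
    mix: Callable[[float, Lottery, Lottery], Lottery],  # mix(p, a, b): a with prob. p, else b
    tol: float = 1e-6,
) -> float:
    """Estimate u(lottery) as the probability p at which the agent is
    indifferent between `lottery` and mix(p, ref_best, ref_worst).

    Assumes ref_worst is dispreferred to the lottery and the lottery is
    dispreferred to ref_best; the continuity axiom then guarantees such
    a p exists."""
    lo, hi = 0.0, 1.0
    # Each iteration asks the preference relation one more question about a
    # mixture of the *reference points* -- exactly where the agent is forced
    # to weigh factors (third cousins, Siberian elections, ...) on which the
    # reference points differ from the options it actually faces.
    while hi - lo > tol:
        p = (lo + hi) / 2
        if prefers(mix(p, ref_best, ref_worst), lottery):
            hi = p  # mixture too good: the indifference point is lower
        else:
            lo = p  # mixture not good enough: the indifference point is higher
    return (lo + hi) / 2

# Deciding between two options via utilities takes roughly 2 * log2(1/tol)
# preference queries against the reference points:
#   buy_utility = utility(buy_sandwich, ref_best, ref_worst, prefers, mix)
#   dont_utility = utility(dont_buy, ref_best, ref_worst, prefers, mix)
#   decision = buy_utility > dont_utility
# versus a single direct query: prefers(buy_sandwich, dont_buy).
```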

Some decisions have much more far-reaching consequences than the decision to buy a sandwich (such as deciding whether or not to quit your job), and this is especially true for especially powerful agents (such as a political leader deciding whether or not to invade another country). And decisions with farther-reaching consequences are more difficult to make. But even the actions of a superintelligent AI only affect its future light-cone (neglecting strategies like acausal trade, which may extend its reach somewhat). And even for very powerful agents whose actions are quite far-reaching, it seems likely that most of their decisions can be broken into components that each have more localized consequences.

For the same reasons that a rational agent wouldn’t want to use its utility function to make decisions, a utility function may also not be a useful tool for someone else to model the agent’s preferences with. Someone trying to predict the behavior of a rational agent doesn’t want to do a bunch of unnecessary computation to determine the utility of each possible action the agent could take any more than the agent does.

A possible counterargument is that one might want to know the magnitude of a preference, rather than just its sign, in order to estimate how likely the preference is to be reversed after further thought, or under random perturbations of the available options. But even then, the difference between the expected utilities of two lotteries mixes together information about the strength of the preference between those two lotteries and information about the strength of the preference between the two reference points that were used to define utility. If an agent changes its estimate of the difference in expected utility between two lotteries after thinking for longer, that doesn’t tell you whether it was thinking more about those two lotteries or about the two reference points. So it might be better to model the robustness of a value judgment directly, rather than as a difference of expected utilities. This could be done by estimating how large a change in value judgments, or how large a perturbation of the lotteries, in some expected direction, would have to be in order to reverse the preference between the two lotteries.
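As one way of cashing this out, here is a minimal sketch (again in Python, with hypothetical names; `perturb` is assumed to map a lottery and a perturbation size to a perturbed lottery, with larger sizes moving it further in some chosen direction) that measures the robustness of a preference as the smallest perturbation that reverses it. Unlike the utility computation above, it only ever queries the preference relation about the two lotteries in question and their perturbations, never about distant reference points.

```python
from typing import Callable

Lottery = object  # illustrative placeholder, as before

def reversal_threshold(
    preferred: Lottery,
    dispreferred: Lottery,
    perturb: Callable[[Lottery, float], Lottery],  # perturb(x, s): move x by size s in a chosen direction
    prefers: Callable[[Lottery, Lottery], bool],   # True iff x is strictly preferred to y
    max_size: float = 1.0,
    tol: float = 1e-3,
) -> float:
    """Smallest perturbation size that reverses the preference for `preferred`
    over `dispreferred`, or max_size if no reversal occurs within range.

    Assumes the perturbation is monotone: once the preference flips at some
    size, it stays flipped at larger sizes. A larger threshold means a more
    robust value judgment."""
    if prefers(perturb(preferred, max_size), dispreferred):
        return max_size  # preference survives even the largest perturbation considered
    lo, hi = 0.0, max_size
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if prefers(perturb(preferred, mid), dispreferred):
            lo = mid  # still preferred at this size; the threshold is larger
        else:
            hi = mid  # already reversed at this size; the threshold is smaller
    return hi
```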