I don’t have a unified answer at the moment, but a few comments/pointers...
First, I currently think that utility-maximization is the right way to model goal-directedness/optimization-in-general, for reasons unrelated to Dutch book arguments. Basically, if we take Flint's generalized-notion-of-optimization (which was pretty explicitly not starting from a utility-maximization assumption), and formalize it in a pretty reasonable way (as reducing the number-of-bits required to encode world-state under some model), then it turns out to be equivalent to expected utility maximization. This definitely seems like the sort of argument which should apply to e.g. evolved systems.
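To make that equivalence concrete, here's a minimal numerical sketch (the model q, the before/after distributions, and the choice u(x) := log2 q(x) are all my own illustrative assumptions, not Flint's formalism): pushing the world toward states that are cheap to encode under q is literally the same computation as maximizing expected utility for u = log2 q.

```python
import numpy as np

# God's-eye model q over 8 possible world-states, used for encoding.
q = np.array([0.4, 0.2, 0.1, 0.1, 0.08, 0.06, 0.04, 0.02])

code_len = -np.log2(q)   # optimal code length for each state under q
utility = -code_len      # define u(x) := log2 q(x)

# Distribution over world-states before and after the optimizer acts.
p_before = np.full(8, 1 / 8)
p_after = np.array([0.7, 0.2, 0.05, 0.02, 0.01, 0.01, 0.005, 0.005])

def expected(p, f):
    return float(p @ f)

# Fewer expected bits <=> higher expected utility, exactly:
print(expected(p_before, code_len), expected(p_before, utility))
print(expected(p_after, code_len), expected(p_after, utility))
assert np.isclose(expected(p_after, utility), -expected(p_after, code_len))
```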
One caveat, though: that argument says that the system is an expected utility maximizer under some God’s-eye world-model, not necessarily under the system’s own world model (if it even has one). I expect something like the (improved) Good Regulator theorem to be able to bridge that gap, but I haven’t written up a full argument for that yet, much less started on the empirical question of whether the Good Regulator conditions actually match the conditions under which agenty systems evolve in practice.
Second, there's the issue from Why Subagents?. Markets of EU maximizers are "inexploitable" in exactly the same sense used in the Dutch book theorems, but a well-known result in economics says that a market is not always equivalent to a single EU maximizer. What gives? We have an explicit example of a system which is inexploitable but not an EU maximizer, so what loophole in the Dutch book theorems is it using? Turns out, the Dutch book theorems implicitly assume that the system has no "internal state", which is clearly false for most real-world agenty systems. I conjecture that markets, rather than single EU maximizers, are the most general inexploitable systems once we allow for internal state.
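Here's a toy version of that loophole (the two linear subagent utilities and the unanimity rule are illustrative choices of mine, not the exact construction from Why Subagents?): a two-subagent committee which only accepts trades that neither subagent objects to. It refuses both directions of a swap, so its preferences are incomplete and it can't be a single EU maximizer, yet no sequence of trades it accepts ever makes either subagent worse off, so it can't be money-pumped.

```python
from dataclasses import dataclass

@dataclass
class Committee:
    # Internal state: current holdings of two goods.
    a: float
    b: float

    def utilities(self, a, b):
        # Two EU-maximizing subagents with different linear valuations.
        return (a + 2 * b, 2 * a + b)

    def accept(self, da, db):
        # Unanimity rule: take a trade only if no subagent loses.
        old = self.utilities(self.a, self.b)
        new = self.utilities(self.a + da, self.b + db)
        if all(n >= o for n, o in zip(new, old)):
            self.a += da
            self.b += db
            return True
        return False

c = Committee(a=10, b=10)

# Attempted money pump: swap a for b and back, skimming a margin each leg.
print(c.accept(-1.0, +0.9))  # False: subagent 2 objects
print(c.accept(+0.9, -1.0))  # False: subagent 1 objects
# Neither direction trades, so no cycle (and no fee) can be extracted.

print(c.accept(+1.0, +1.0))  # True: pure gift, both subagents gain
```

Refusing both directions of the same swap is the signature of incomplete preferences; a single EU maximizer would accept at least one direction of any swap it isn't exactly indifferent about.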
I still expect evolved systems to be well-modeled as EU maximizers at the level of terminal goals and from an external perspective, for the reasons above. But in terms of instrumental goals/behavior, or internal implementation, I expect to see market-like mechanisms, rather than single-utility-maximization.
Third, there’s the issue of type-signatures of goals. You mentioned utilities over world-states or changes-in-world-states, but the possibilities are much broader than that—in principle, the variables which go into a utility function need not be (fully) grounded in physical world-state at all. I could, for instance, care about how elegant the mathematical universe is, e.g. whether P = NP, independent of the real-world consequences of that.
More importantly, the variables which go into a utility function need not be things the agent can or does observe. I think this is true for rather a lot of things humans care about: for instance, I care about the welfare of random people in Mumbai, even if I will never meet them or have any idea how they're doing. This is very different from the Dutch book theorems, which assume not only that we can observe every variable, but that we can even bet on every variable. This is another aspect which makes more sense if we view EU maximization as compression (i.e. reducing the number-of-bits required to encode world-state under some model) rather than as a consequence of Dutch book theorems.
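A minimal sketch of that type signature (the numbers and the "donate" action are made up for illustration): the utility function is defined directly on a latent variable the agent never observes, and actions are ranked by how they shift the world-model's distribution over that latent.

```python
import numpy as np

# Latent variable W: welfare of people the agent will never observe.
p_w = np.array([0.3, 0.7])    # model's P(W = low), P(W = high)
u_w = np.array([-10.0, 5.0])  # utility is a function of the latent W itself

# Actions never reveal W; they only shift the model's distribution over it.
p_w_given_action = {
    "nothing": p_w,
    "donate": np.array([0.1, 0.9]),
}

def expected_utility(action):
    return float(p_w_given_action[action] @ u_w)

print(expected_utility("nothing"))  # 0.5
print(expected_utility("donate"))   # 3.5 -> preferred, though W is never seen
```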
I note that the market conjecture above may be very similar to Eli's own proposal, provided we do insist on the constraint that "if you can predict how your values will change then you agree with that change" (aka price today equals expected value of price tomorrow).
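For concreteness, here's that constraint in its probabilistic form, with made-up numbers: if today's "price" is a credence, then the expectation of tomorrow's posterior, weighted over the possible observations, must equal today's prior (conservation of expected evidence). If the system could predict which way its price will move, the price would already have moved.

```python
import numpy as np

prior_h = 0.4                              # today's "price" on hypothesis H
p_obs_given_h = np.array([0.8, 0.2])       # P(obs | H), two possible observations
p_obs_given_not_h = np.array([0.3, 0.7])   # P(obs | not H)

p_obs = prior_h * p_obs_given_h + (1 - prior_h) * p_obs_given_not_h
posterior_h = prior_h * p_obs_given_h / p_obs  # tomorrow's price, per observation

# Expected tomorrow-price, weighted by how likely each observation is:
print(float(p_obs @ posterior_h))  # 0.4 -- equal to today's price
```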