Coalition Dynamics as Morality

At the risk of rehashing ground that has already been covered a lot, I want to outline some of my current thinking on ethics/morality/meta-ethics. I haven’t yet found a distinction between these three which feels to me like more than a distinction of the moment. What I’m talking about for the purposes of this post is game-theoretic reasoning which tends to promote cooperation and get good results. I’ll call this “game-theoretic morality” here.

I suspect there isn’t something like an objectively correct game-theoretic morality. The pragmatically best approach depends too much on what universe you’re in. Players can enforce weird equilibria in iterated Prisoner’s Dilemma. If you find yourself in a strange playing field, all sorts of irrational-looking strategies may be optimal. That being said, we can try to capture working principles for the situations we find we tend to get into, and hope they don’t generalize too badly.

Coalition Dynamics

I think of game-theoretic morality largely in terms of coalition dynamics. In some sense, the ideal outcome is for everyone to be on the same team, maximizing a combined utility function. Unfortunately, that’s not always possible. A pure altruist, who values everyone equally, is exploitable; unqualified altruism isn’t a winning strategy (or rather, isn’t always a winning strategy), even from the perspective of global coordination. A more pragmatic strategy is to give consideration to others in a way which incentivises joining your coalition. This often allows you to convince selfish agents to “grow the circle of empathy”, creating overall better outcomes through coordination.

This line of thinking leads to things like Nash bargaining and Shapley value. Everyone in the coalition coordinates to provide maximum value to the coalition as a whole, but without treating everyone equally: members are valued based on their contribution to the coalition. If you want the setup to be more egalitarian, that’s a matter of your values (which your coalition partners should take into account), but it’s not part of the ideal game-theoretic morality.
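To make "valued based on their contribution" concrete, here is a minimal sketch of the Shapley value for a three-player game. The coalition worths in `WORTH` are made-up numbers and the function name is hypothetical; the only thing taken from the text is the idea that each member's share tracks what they add to the coalition.

```python
from itertools import permutations

def shapley_values(players, value):
    """Average each player's marginal contribution over all join orders."""
    totals = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining this coalition.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: t / len(orders) for p, t in totals.items()}

# Hypothetical worth of every possible coalition (illustrative numbers only).
WORTH = {
    frozenset(): 0,
    frozenset("A"): 10,
    frozenset("B"): 0,
    frozenset("C"): 0,
    frozenset("AB"): 40,
    frozenset("AC"): 30,
    frozenset("BC"): 10,
    frozenset("ABC"): 60,
}

shares = shapley_values("ABC", WORTH.__getitem__)
for player, share in shares.items():
    print(player, round(share, 2))
```

The shares add up to the worth of the full coalition, but they are unequal: A, who adds the most wherever they join, receives the largest share. An egalitarian split would be a separate choice layered on top of this.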

This point is similar to Friendship is Utilitarian. Even if you have egalitarian altruistic goals, and even if your potential allies don’t, it can be overall better to form alliances in which you help the people who you expect will help you most in return.

If this sounds a little too cold-hearted, there’s probably a good reason for that. When I said “things like Nash bargaining and Shapley value”, I was purposefully leaving things open to interpretation. I don’t know what the right formal model of what I’m talking about is. I suspect there’s some use for extending the benefit of the doubt, in general. For example, in Prisoner’s Dilemma, if your strategy is to estimate the probability p that the other person will cooperate and then cooperate with probability p yourself, the result is unstable when playing with other similar agents. However, if you cooperate with probability p+0.001, then both people are trying to be a little more cooperative than the other. You’ll cooperate 100% of the time with others following the same strategy, while sacrificing very little in other situations. Common knowledge that you’ll extend a little more trust than is “strictly justified” can go a long way!
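The p versus p+0.001 point can be sketched as a toy simulation. The model is an assumption, not a worked-out formalism: two agents following the same rule each take the other's last cooperation probability as their estimate, add a fixed bonus, and cap the result at 1. The function name and the starting value of 0.5 are made up for illustration.

```python
def mutual_estimate(start, bonus, rounds=1000):
    """Two agents repeatedly match each other's cooperation probability plus a bonus."""
    p_a = p_b = start
    for _ in range(rounds):
        # Simultaneous update: each responds to the other's previous probability.
        p_a, p_b = min(1.0, p_b + bonus), min(1.0, p_a + bonus)
    return p_a, p_b

print(mutual_estimate(0.5, 0.0))    # pure matching: nothing moves the pair
print(mutual_estimate(0.5, 0.001))  # a small bonus ratchets both upward
```

In this toy model, pure matching leaves the pair wherever they happened to start, so nothing selects for cooperation. With the small bonus, both probabilities climb until they hit the cap at full cooperation.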

By the way, in one sense, the “True Prisoner’s Dilemma” is impossible between agents of the sort I’m imagining. They see the game set-up and the payoff table, and immediately figure out the Nash bargaining solution (or something like it), and re-write their own utility function to care about the other player. From this perspective, the classical presentation of Prisoner’s Dilemma as a game between humans doesn’t provide such bad intuitions after all.
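For a concrete sense of what "figure out the Nash bargaining solution" could mean here, the sketch below searches over mixtures of the four outcomes for the one that maximizes the product of both players' gains over mutual defection. The payoff numbers are the usual textbook illustration rather than anything from this post. Treating mutual defection as the disagreement point, allowing the players to jointly randomize over outcomes, and using a coarse grid search are all assumptions.

```python
from itertools import product

# Hypothetical payoff table: outcome -> (row player, column player).
PAYOFFS = {"CC": (3, 3), "CD": (0, 5), "DC": (5, 0), "DD": (1, 1)}
DISAGREEMENT = PAYOFFS["DD"]  # assumed fallback if bargaining fails

def nash_bargain(steps=20):
    """Grid-search mixtures of outcomes for the largest product of gains."""
    outcomes = list(PAYOFFS)
    best_product, best_mix = -1, None
    for weights in product(range(steps + 1), repeat=len(outcomes)):
        if sum(weights) != steps:
            continue  # weights must form a probability distribution
        # Work in integer units of 1/steps to avoid rounding issues.
        u1 = sum(w * PAYOFFS[o][0] for w, o in zip(weights, outcomes))
        u2 = sum(w * PAYOFFS[o][1] for w, o in zip(weights, outcomes))
        gain1 = u1 - steps * DISAGREEMENT[0]
        gain2 = u2 - steps * DISAGREEMENT[1]
        if gain1 < 0 or gain2 < 0:
            continue  # nobody accepts less than the disagreement payoff
        if gain1 * gain2 > best_product:
            best_product = gain1 * gain2
            best_mix = {o: w / steps for w, o in zip(weights, outcomes)}
    return best_mix

print(nash_bargain())
```

With these numbers the search puts all of the weight on mutual cooperation, which is the outcome the agents would effectively be committing to when they start caring about each other's payoffs.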

Preference Utilitarianism

Preference utilitarianism makes a lot more sense within this kind of coalition than alternatives like hedonic utilitarianism. We help allies in what they care about, not what we think they ideally should care about. You’re allowed to care about the happiness of others. Allies in your coalition will support your wishes in this respect, to the extent that you’ve earned it (and perhaps a little more). But, as with egalitarianism, that’s a matter of your personal preference, not a matter of game-theoretic morality.


Another wrinkle in the story is timeless decision theory, which gives something more like rule utilitarianism than the more common act utilitarianism. This is quite close to deontology, if not identical; in particular, it sounds to me a lot like Kant’s categorical imperative.

Arguably, timeless decision theory does not exactly give rule utilitarianism: trying to take the action which you would want relevantly similar decision makers to take in relevantly similar situations is not necessarily the same as trying to act according to the set of rules which are highest-utility. Creating rules (such as “do not kill”, “do not lie”) risks over-generalizing in a way which trying to follow the best policy doesn’t. However, having explicit rules is good for humans: we can’t expect to work out all the instances correctly on the spot, especially accounting for biases. Furthermore, clear rules are going to be better for coalition coordination than just generally trying to take the best actions (although there’s room for both).

A common objection is that deontology is about duty, not about consequences; that even if rule utilitarians do arrive at the same conclusions, they do it for different reasons. However, from a coalition perspective, I’m not sure “duty” is such a bad way of describing the reason for following the rules.


The kind of reasoning here has some similarities to Scott Alexander’s attempt to derive utilitarianism from contractualism.

Now, I won’t try to pretend that I understand contractualism all that well, but I think orthodox contractualism (as opposed to Scott Alexander’s version) does something more like a “min” operation rather than summing utility. From the SEP article:

Since individuals must be objecting on their own behalf and not on behalf of a group, this restriction to single individuals’ reasons bars the interpersonal aggregation of complaints; it does not allow a number of lesser complaints to outweigh one person’s weightier complaint.

I surprise myself by thinking something similar to this applies to the ideal coalition dynamics.

Harsanyi’s Utilitarian Theorem is a very strong argument for the utilitarian practice of aggregating utilities by summing them. However, when we take coalition dynamics into account, we see that there’s a need to keep everyone in the coalition happy. Utilitarianism will happily kill a few group members or expose them to terrible suffering for the greater good. If coalition members can foresee this fate, they will likely leave the coalition.

This situation is somewhat improved if the coalition members are using something like timeless decision theory, since they will have a greater tendency to commit to beneficial arrangements. However, assuming a typical “veil of ignorance” seems too strong—this is like assuming that all the agents come from the same timeless perspective (a position where they’re ignorant of which agent they’ll become). This would allow a perfect Harsanyi coordination, but only because everyone starts out agreeing by assumption.

If there’s a great degree of honor in the coalition, or other commitment mechanisms which enforce what’s best for the group overall in the Harsanyi sense, then this isn’t a concern. However, it seems to me that some sort of compromise between optimizing the minimum and optimizing the average will be needed. Perhaps it’d be more like optimizing the average subject to the constraint that no one is so badly off that they will leave, or optimizing the average in some way that takes into account that some people will leave.
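One way to picture the compromise is a small comparison between three selection rules. Everything below is a hypothetical illustration: the policy names, the utility numbers, and the assumption that each member leaves exactly when their utility falls below a fixed outside option.

```python
# Hypothetical policies: utility for each of three coalition members.
POLICIES = {
    "sacrifice_one": (12, 12, -5),
    "balanced": (7, 7, 2),
    "cautious": (5, 5, 5),
}
# Assumed payoff each member would get by leaving the coalition.
OUTSIDE_OPTION = (0, 0, 0)

def utilitarian(policies):
    """Pick the policy with the highest total (equivalently, average) utility."""
    return max(policies, key=lambda name: sum(policies[name]))

def maximin(policies):
    """Pick the policy whose worst-off member does best."""
    return max(policies, key=lambda name: min(policies[name]))

def constrained_average(policies, outside):
    """Maximize the total among policies under which nobody prefers to leave."""
    stable = {
        name: utils
        for name, utils in policies.items()
        if all(u >= o for u, o in zip(utils, outside))
    }
    return utilitarian(stable) if stable else None

print(utilitarian(POLICIES))
print(maximin(POLICIES))
print(constrained_average(POLICIES, OUTSIDE_OPTION))
```

With these numbers the three rules disagree: summing picks the policy that sacrifices one member, maximin picks the most cautious policy, and the constrained version lands in between. The second option mentioned above, accounting for the fact that some people will leave, would need a model of what the coalition loses when they do, which this sketch does not attempt.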

Population Ethics

Perhaps the most famous objection to utilitarianism is the repugnant conclusion. However, from the coalition-dynamics perspective, the whole line of reasoning relies on improper comparison of utility functions of differing coalitions. You determine whether to expand a coalition by checking the utility of that act with respect to the current coalition. A small coalition with a high average preference satisfaction isn’t better or worse than a large one with a medium average preference satisfaction; the two are incomparable. There’s no difference between total utilitarianism and average utilitarianism if applied in the right way. A member is added to a coalition if adding that member benefits the existing coalition (from an appropriate timeless-rule perspective); adding members in this way can’t result in lives barely worth living (at least, not in expectation).
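The admission rule can be written down in a few lines. This is a deliberately crude sketch: the function names and numbers are hypothetical, utilities are simply summed, and the timeless-rule and benefit-of-the-doubt considerations are ignored. The point it illustrates is that the test only looks at the utilities of the current members, before and after.

```python
def benefits_existing(before, after):
    """Admit only if the current members, taken together, end up better off.

    `before` maps current members to utilities without the newcomer;
    `after` maps everyone (newcomer included) to utilities with the newcomer.
    """
    return sum(after[m] for m in before) > sum(before.values())

def raises_total(before, after):
    """Naive total-utilitarian test, for contrast: compares across coalitions."""
    return sum(after.values()) > sum(before.values())

before = {"A": 5, "B": 5}
helpful = {"A": 6, "B": 5, "C": 1}  # newcomer C makes A better off
costly = {"A": 4, "B": 4, "C": 3}   # C's arrival costs A and B

print(benefits_existing(before, helpful), raises_total(before, helpful))
print(benefits_existing(before, costly), raises_total(before, costly))
```

In the second case the naive total goes up even though the existing members lose, which is the kind of step the repugnant conclusion chains together. The coalition test rejects it because the newcomer's own utility never enters the comparison.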

This conclusion is likely weakened by benefit-of-the-doubt style reasoning. Still, the direct argument for the repugnant conclusion is blocked here.


The rules say we have to use consequentialism, but good people are deontologists, and virtue ethics is what actually works.

-Eliezer Yudkowsky

Part of what I’m trying to get at here is that every major candidate for normative ethics makes points which are importantly true, and that they seem easier to reconcile than (I think) is widely recognized.

On the other hand, I’m trying to argue for a very specific version of utilitarianism, which I haven’t even fully worked out. I think there’s a lot of fertile ground here for investigation.