# A Pure Math Argument for Total Utilitarianism

**Summary**: I sketch an argument that population ethics should, in a certain technical sense, be similar to addition. I show that a surprising theorem of Hölder’s implies that this means that we should be total utilitarians.

Addition is a very special operation. Despite the wide variety of esoteric mathematical objects known to us today, none of them have the basic desirable properties of grade-school arithmetic.

This fact was intuited by 19th century philosophers in the development of what we now call “total” utilitarianism. In this ethical system, we can assign each person a real number to indicate their welfare, and the value of an entire population is the sum of each individual’s welfare.

Using modern mathematics, we can now prove the intuition of Mill and Bentham: because addition is so special, any ethical system which is in a certain technical sense “reasonable” is equivalent to total utilitarianism.

### What do we mean by ethics?

The most basic premise is that we have some way of ordering individual lives.

We don’t need to say how much better some life is than another; we just need to be able to put them in order. We might have some uncertainty as to which of two lives is better:

In this case, we aren’t certain if “Medium” or “Medium 2” is better. However, we know they’re both better than “Bad” and worse than “Good”.

In the case when we always know which of two lives is better, we say that lives are totally ordered. If there is uncertainty, we say they are lattice ordered.

In either case, we require that the ranking remain consistent when we add people to the population. Here we add a person of “Medium” utility to each population:

The ranking on the right side of the figure above is legitimate because it keeps the order: if some life X is worse than Y, then (X + Medium) is still worse than (Y + Medium). The ranking below, for example, would fail that test:

This ranking is inconsistent because it sometimes says that “Bad” is worse than “Medium” and other times says “Bad” is better than “Medium”. A basic principle of ethics is that rankings should be consistent, and so rankings like the latter are excluded.

### Increasing population size

The most obvious way of defining an ethics of populations is to just take an ordering of individual lives and “glue them together” in an order-preserving way, like I did above. This generates what mathematicians would call the free group. (The only tricky part is that we need good and bad lives to “cancel out”, something which I’ve talked about before.)

It turns out that merely gluing populations together in this way gives us a highly structured object known as a “lattice-ordered group”. Here is a snippet of the resulting lattice:

This ranking is similar to what philosophers often call “Dominance”—if everyone in population P is better off than everyone in population Q, then P is better than Q. However, this is somewhat stronger—it allows us to compare populations of different sizes, something that the traditional dominance criterion doesn’t let us do.

Let’s take a minute to think about what we’ve done. Using only the fact that individuals’ lives can be ordered and the requirement that population ethics respects this ordering in a certain technical sense, we’ve derived a robust population ethics, about which we can prove many interesting things.

### Getting to total utilitarianism

One obvious facet of the above ranking is that it’s not total. For example, we don’t know if “Very Good” is better than “Good, Good”, i.e. if it’s better to have welfare “spread out” across multiple people, or concentrated in one. This obviously prohibits us from claiming that we’ve derived total utilitarianism, because under that system we always know which is better.

However, we can still derive a form of total utilitarianism which is equivalent in a large set of scenarios. To do so, we need to use the idea of an *embedding*. This is merely a way of assigning each welfare level a number. Here is an example embedding:

Medium = 1

Good = 2

Very Good = 3

Here’s that same ordering, except I’ve tagged each population with the total “utility” resulting from that embedding:

This is clearly not identical to total utilitarianism—“Very Good” has a higher total utility than “Medium, Medium” but we don’t know which is better, for example.

However, this ranking **never disagrees** with total utilitarianism: there is never a case where P is better than Q yet P has less total utility than Q.
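The "never disagrees" claim can be spot-checked on small populations. The sketch below is only an illustration: `strictly_worse` is an assumed, simplified stand-in for the ordering in the post's figures (pair each life in the smaller population with a distinct, at-least-as-good life in the larger one), and all function names are introduced here rather than taken from the post.

```python
from itertools import combinations_with_replacement

# Welfare levels from worst to best, and the example embedding from the post.
LEVELS = ["Medium", "Good", "Very Good"]
EMBEDDING = {"Medium": 1, "Good": 2, "Very Good": 3}


def total_utility(population):
    """Sum of the embedded welfare numbers of everyone in the population."""
    return sum(EMBEDDING[life] for life in population)


def strictly_worse(p, q):
    """Assumed stand-in for the post's ordering: p < q when the two differ and
    every life in p can be paired with a distinct, at-least-as-good life in q.
    Unpaired lives in q count in q's favour because every level here is > 0."""
    if len(p) > len(q):
        return False
    p_ranks = sorted((LEVELS.index(x) for x in p), reverse=True)
    q_ranks = sorted((LEVELS.index(x) for x in q), reverse=True)
    if p_ranks == q_ranks:
        return False
    # Pair the i-th best life in p with the i-th best life in q.
    return all(a <= b for a, b in zip(p_ranks, q_ranks))


def never_disagrees(max_size=3):
    """True if p < q always comes with total_utility(p) < total_utility(q)."""
    populations = [
        pop
        for size in range(1, max_size + 1)
        for pop in combinations_with_replacement(LEVELS, size)
    ]
    return all(
        total_utility(p) < total_utility(q)
        for p in populations
        for q in populations
        if strictly_worse(p, q)
    )
```

Under this stand-in ordering, `("Very Good",)` and `("Medium", "Medium")` are incomparable even though their totals differ, which is the gap between the partial ranking and total utilitarianism described above.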

Due to a surprising theorem of Hölder, which I have discussed before, as long as we disallow “infinitely good” populations, there is always some embedding like this. Thus, we can say that:

Total utilitarianism is the moral “baseline”. There might be circumstances where we are uncertain whether or not P is better than Q, but if we are certain, then it must be that P has greater total utility than Q.

### An application

Here is one consequence of these results. Many people, including myself, have the intuition that inequality is bad. In fact, it is so bad that there are circumstances where increasing equality is good even if people are, on average, worse off.

If we accept the premises of this blog post, this intuition simply cannot be correct. If the inequitable society has greater total utility, it must be at least as good as the equitable one.

### Concluding remarks

There are certain restrictions we want the “addition” of a person to a population to obey. It turns out that there is only one way to obey them: by using grade school addition, i.e. total utilitarianism.

[For those interested in the technical result: Hölder showed that any archimedean l-group is l-isomorphic to a subgroup of (R,+). The proof can be found in Glass’ *Partially Ordered Groups* as Corollary 4.1.4. This article was originally posted here.]

I reject this premise. Specifically, I believe I have some ordering, and you have some ordering, but strongly suspect those orderings disagree, so don’t think we have one unambiguous joint ordering.

I reject this premise. Specifically, I believe that lives interact. Suppose Bob by himself has a medium quality life, and Alice by herself has a medium quality life. Putting them in a universe together by no means guarantees that each of them will have a medium quality life.

Total utilitarianism is a dead simple conclusion from its premises; you don’t need to bring in group theory. This is only a “pure math” argument for total utilitarianism because you’re talking about the group (R,+) instead of addition, but the two are *the same*, and the core of the argument remains the contentious moral premises.

I’m not certain this proves what you want it to: it would still hold that you and I are individually total utilitarians. We would just disagree about what those utilities are.

I guess I don’t find this very convincing. Any reasonably complicated argument is going to say “ceteris paribus” at some point—I don’t think you can just reject the conclusion because of this.

I guess I don’t know what you mean. By (R,+) I was trying to refer to addition, so I apologize if this has some other meaning and you thought I was “proving” them equivalent.

I was unclear, and agree that stated rejection is weak. Here’s the stronger version: I see the central premise underlying total and average utilitarianism as “Preferences are determined over life-histories, rather than universe-histories.” If you accept this premise, then you need some way to aggregate life-utilities to get a universe-utility. But if you reject that premise, and see all preferences as over universe-histories, then it’s not clear that an aggregation procedure is necessary.

But look at the horrible world you’ve created!

*Any* sort of empathy is banned. Bob cannot delight in Alice’s happiness, and Alice cannot suffer because of Bob’s sadness. They cannot even be heartless traders, who are both made wealthier and happier by the other’s existence, even though they are otherwise indifferent to whether or not the other lives or dies.

The argument against various repugnant conclusions often *hinges* on ceteris paribus being violated. The “mere addition” paradox, for example, is easily dispensed with if each person has a slight negative penalty in their utility function for the number of other people that exist, or that exist below a certain utility threshold, or so on. It’s worth pointing out that many moral sensations seem like they could be internalizations of practical constraints: when you talk about adding more and more people to the world, an instinctual backlash against crowding is probably not due to any malevolence, but rather due to the combined effects of traffic and pollution and scarcity which, in the real world, accompany such crowding.

I, for one, find it ludicrous to posit that the utility functions of a social species would *not* depend on the sort of society they find themselves in, and that their utilities cannot contain any relative measures.

I was objecting to the title, mostly. In my mind, the core of the argument in this post is “if you believe that preferences are expressed over individual lives, and that the number of lives shouldn’t be relevant to preferences, then total utilitarianism must follow,” which I think is a correct argument. But I *disagree* that preferences are expressed over individual lives (or at least I think that is a contentious claim which should not be taken as a premise).

Empathy banned? Nature does that for you. “Brain cells we use to mull over our past must switch off when we do sums, say researchers, who have been spying on a previously inaccessible part of the brain.”

Don’t arguments related to the badness of inequality often rely on the existence of envy, such that if I envy you then my utility goes down as yours increases?

Yes, one way to rescue this is to value equality instrumentally, instead of intrinsically.

(Similarly, I tentatively am an average utilitarian, but I still value population size instrumentally.)

Not sure if that’s an application as much as a tautology. Valuing equality means that you reject the assumption of “we require that the ranking remain consistent when we add people to the population”, so of course *accepting* that assumption is incompatible with valuing equality.

At least, that’s assuming that you value equality as an intrinsic good. As James Miller pointed out, one can also oppose inequality on the ground that it ends up making people’s lives worse off, which is an empirical claim separate from utilitarianism.

It’s a proof, so sure it’s a tautology.

Here’s a better way of masking it though: suppose we believe:

We should be non-sadistic: X < 0 ==> X+Y < Y

Accepting of dominance: X > 0 ==> X+Y > Y

This is exactly what it means to be order preserving, but maybe when phrased this way the result seems more surprising (in the sense that those axioms are harder to refute)?

The only part that makes this total utilitarianism is the ranking you match the embedding to. So what, mathematically, goes wrong if you embed the *average* of your individual numbers into a directed graph like (Very Good) > (Good, Good, Good, Good) ~~ (Good) > (Medium)?

I think this is a great question, as people who accept the premises of this article are likely to accept some sort of utilitarianism, so a major result is that average utilitarianism doesn’t work.

If we are average utilitarians, then we believe that (2) ~~ (1,2,3). But this must mean that (2,6) ~~ (1,2,3,6) to be order preserving, which is not true. (The former’s average utility is 4, the latter’s 3.)
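The arithmetic behind this counterexample can be checked directly. This is a minimal sketch; `average` and `add_person` are helper names introduced here for illustration.

```python
def average(population):
    """Average welfare of a population given as a tuple of numbers."""
    return sum(population) / len(population)


def add_person(population, welfare):
    """Return the population with one extra person of the given welfare."""
    return population + (welfare,)


p, q = (2,), (1, 2, 3)

# Before the addition, an average utilitarian is indifferent between p and q.
equal_before = average(p) == average(q)

# Adding the same person (welfare 6) to both breaks the indifference.
p_plus, q_plus = add_person(p, 6), add_person(q, 6)
equal_after = average(p_plus) == average(q_plus)
```

Here `equal_before` holds but `equal_after` does not, so ranking by averages is not preserved when the same person is added to both sides.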

Ah, great, I understand more now—the linchpin is the premise that what we really want, is to preserve order when we add another person. So what sort of premise would lead to average utilitarianism?

How about—order should be preserved if we shift the zero-point of our happiness measurement. That seems pretty common-sense. And yet it rules out total utilitarianism. (2,2,2) > (5), but (1,1,1) < (4).
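The zero-point example can be verified the same way. In this sketch, `total`, `average` and `shift` are hypothetical helper names, and shifting the zero-point is modelled as subtracting 1 from every welfare number.

```python
def total(population):
    return sum(population)


def average(population):
    return sum(population) / len(population)


def shift(population, delta):
    """Move the zero-point by adding delta to every person's welfare."""
    return tuple(w + delta for w in population)


p, q = (2, 2, 2), (5,)
p_shifted, q_shifted = shift(p, -1), shift(q, -1)

# Ranking by totals flips under the shift ...
total_prefers_p_before = total(p) > total(q)
total_prefers_p_after = total(p_shifted) > total(q_shifted)

# ... while ranking by averages is unchanged.
average_prefers_q_before = average(q) > average(p)
average_prefers_q_after = average(q_shifted) > average(p_shifted)
```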

Or maybe we could allow average utilitarianism just by weakening the premise—so that we want to preserve the ordering only if we add an average member.

The usual definition of “zero-point” is “it doesn’t matter whether that person exists or not”. By that definition, there is no (universal) zero-point in average utilitarianism. (2,2,0) != (2,2) etc.

By the way, it’s true you can’t shift by a constant in total utilitarianism, but you can scale by a constant.

...Or you could notice that requiring that order be preserved when you add another member is outright assuming that you care about the total and not about the average. You assume the conclusion as one of your premises, making the argument trivial.

Near the beginning you write this:

but then your actual argument includes steps like these:

which, please note, does not amount to any sort of argument that we *must* or even *should* just glue values-of-lives together in this sort of way.

I do not see any sign in what you have written that Hölder’s theorem is doing any real work for you here. It says that an archimedean totally ordered group is isomorphic to a subset of (R,+), but all the contentious stuff about total utilitarianism is *already there* by the time you suppose that utilities form an archimedean totally ordered group and that combining people is just a matter of applying the group operation to their individual utilities.

Thanks for the feedback, I should’ve used clearer terminology.

This seems to be the consensus. It’s very surprising to me that we get such a strong result from only the l-group axioms, and the fact that his result is so celebrated seems to indicate that other mathematicians find it surprising too, but the commenters here are rather blasé.

Do you think giving examples of how many things completely unrelated to addition are groups (wallpaper groups, Rubik’s cube, functions under composition, etc.) would help show that the really restrictive axiom is the archimedean one?

It doesn’t seem to me like the issue is one of terminology, but maybe I’m missing something.

I’m not convinced that it is. The examples you give aren’t ordered groups, after all.

It’s unclear to me whether your main purpose here is to exhibit a surprising fact about ethics (which happens to be proved by means of Hölder’s theorem) or to exhibit an interesting mathematical theorem (which happens to have a nice illustration involving ethics). From the original posting it looked like the former but what you’ve now written seems to suggest the latter.

My impression is that the blasé-ness is aimed at the alleged application to ethics, rather than at denying that the theorem, qua mathematical theorem, is interesting and surprising.

Two points:

I don’t know the Hölder theorem, but if it actually depends on the lattice being *a group*, that includes an extra assumption of the existence of a neutral element and inverse elements. The neutral element would have to be a life of exactly zero value, so that killing that person off wouldn’t matter at all, either positively or negatively. The inverse elements would mean that for every happy life you can imagine an exactly opposite unhappy life, so that killing off both leaves the world exactly as good as before.

Proving this might be hard for infinite cases, but it would be trivial for finite generating groups. Most Less Wrong utilitarians would believe there are only finitely many brain states (otherwise simulations are impossible!) and utility is a function of brain states. That would mean only finitely many utility levels and then the result *is* obvious. The mathematically interesting part is that it still works if we go infinite on some things but not on others, but that’s not relevant to the general Less Wrong belief system.

(Also, here I’m discussing the details of utilitarian systems arguendo, but I’m sticking with the general claim that all of them are mathematically inconsistent or horrible under Arrow’s theorem.)

Z^2 lexically ordered is finitely generated, and can’t be embedded in (R,+). [EDIT: I’m now not sure if you meant “finitely generated” or “finite” here. If it’s the latter, note that any ordered group must be torsion-free, which obviously excludes finite groups.]
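To see why the lexically ordered Z^2 resists embedding, it helps to look at how its order treats multiples. The sketch below relies on the fact that Python compares tuples lexicographically; `times`, `small` and `big` are names introduced here for illustration.

```python
def times(n, element):
    """n-fold sum of an element of Z^2 under componentwise addition."""
    return (n * element[0], n * element[1])


ZERO = (0, 0)
small = (0, 1)  # positive, but only in the second coordinate
big = (1, 0)    # positive in the first coordinate

# Tuple comparison in Python is lexicographic, matching the lexical order.
small_is_positive = ZERO < small

# However many copies of `small` are added together, the result stays below `big`.
multiples_stay_below = all(
    times(n, small) < big for n in (1, 10, 10**6, 10**12)
)
```

An order-preserving additive map f into (R,+) would need f(small) > 0 and n·f(small) < f(big) for every n, which no pair of real numbers satisfies. This is the archimedean condition failing, and finite generation does not repair it.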

But your implicit point is valid (+1) - I should’ve spent more time explaining why this result is surprising. Just about every comment on this article is “this is obvious because ”, which I guess is an indication LWers are so immersed in utilitarianism that counter-examples don’t even come to mind.

I’m a bit out of my depth here. I understood an “ordered group” as a group with an order on its elements. That clearly can be finite. If it’s more than that the question would be why we should assume whatever further axioms characterize it.

From Wikipedia:

So if a > 0, then a + a > a, and so on, which means the group has to be torsion-free.

No, the premises don’t necessitate that. “A is at least as good as B”, in our language, is ¬(A < B). But you’ve stated that the lack of an edge from A to B says nothing about whether A < B; now you’re talking as if, when the premises don’t conclude that A < B, they must conclude ¬(A < B), which is kinda affirming the consequent.

It might have been a slip of the tongue, or it might be an indication that you’re overestimating the significance of this alignment. These premises don’t prove that a higher utility inequitable society is at least as good as a lower utility equitable one. They merely don’t disagree.

I may be wrong here, but it looks as though, just as the premises support (A < B) ⇒ (utility(A) < utility(B)), they also support (A < B) ⇒ (normalizedU(A) < normalizedU(B)), where normalizedU(World) = sum(log(utility(life)) for life in elements(World)), a perfectly reasonable sort of population utilitarianism where utility monsters are fairly well seen to. In this case equality would usually yield greater betterness than inequality despite it being permitted by the premises.

This is a good point, though what I was trying to say is slightly different. Basically, we know that (A < B) ==> (f(A) < f(B)), where f is our order embedding. So it is indeed true that f(A) > f(B) ==> ¬(A < B), by modus tollens.

Yeah, that’s a pretty clever way to get around the constraint. I think my claim “If the inequitable society has greater total utility, it must be at least as good as the equitable one” would still hold though, no?

Well… yeah, technically. But take, for example, the model ( worlds={A, B}, f(W)=sum(log(felicity(e)) for e in population(W)) ), with world A=(2,2,2,2) and world B=(1,1,1,9). Here f(A) ≥ f(B), i.e. ¬(f(A) < f(B)), so ¬(A < B), i.e. *the equitable society is also at least as good as the inequitable, higher-sum-utility one*. So if you want to support all embeddings via summation of an increasing function of the units’ QoL, I’d be surprised if those embeddings had anything in common aside from what the premises required. I suspect anything that agreed with all of them would require all worlds the original premises don’t relate to be equal, i.e. ¬(A<B) ∧ ¬(B<A).

… looking back, I’m opposed to your implicit definition of a “baseline”: the original population partial ordering premises are the baseline here, not total utilitarianism.
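The numbers in this example are easy to check. A minimal sketch, assuming the natural logarithm (the comment does not fix a base, and the comparison does not depend on it); `total` and `log_sum` are names introduced here.

```python
import math


def total(world):
    """Plain sum of felicities (the total-utilitarian embedding)."""
    return sum(world)


def log_sum(world):
    """Sum of log-felicities (the alternative embedding from the comment)."""
    return sum(math.log(felicity) for felicity in world)


A = (2, 2, 2, 2)  # equitable world
B = (1, 1, 1, 9)  # inequitable world with the higher plain total

plain_sum_prefers_B = total(B) > total(A)    # 12 versus 8
log_sum_prefers_A = log_sum(A) > log_sum(B)  # 4*ln(2) versus ln(9)
```

The two embeddings rank A and B in opposite directions, which is consistent with the premises leaving this pair unordered.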

In your “Increasing population size”, you put “Medium, Medium” as more valuable than “Medium”, but that doesn’t seem to derive from the premises you’d been using so far (apart from the “glue them together” part). I found that surprising, since you seem to go to greater lengths to justify other things that seem more self-evident to me.

Would Xodarap agree that the premises are (assuming we have operator overloads for multisets rather than sets)

the better set is a superset (A ⊂ B) ⇒ (A < B)

or everything in the better set that’s not in the worse set is better than everything that’s in the worse set that’s not in the better set, (∀a∈(A\B), b∈(B\A) value(a) < value(b)) ⇒ (A < B)

Yeah, maybe things just get worse and worse as you add more people—but uniformly, so that adding another person preserves ordering :P

If you change the value of “medium” from “1” to “-5” while leaving the other two states the same, your conclusion no longer holds. For example, on your last graph, (very good, medium) would outrank (very good), even though the former has a value of −2 and the latter of +3. This suggests your system doesn’t allow negative utilities, which seems bad because intuitively it’s possible for utility to sometimes be negative (e.g. euthanasia arguments).

It must allow negative numbers, or it’s not a group, as (R+,+) is *not* a group. (Each element must have an inverse which returns that element to the identity element, which for this particular free group is “no one alive”.)

However, I believe this specific issue is solved by the lattice structure. If “medium” were “-5” instead of “1”, when you add “medium” to any universe, you create a lattice element *below* the original universe, because we know it is worse than the original universe.

This is a good point; I am now regretting not having given more technical details on what it means to be “order preserving”.

The requirement is that `X > 0 ==> X + Y > Y`. I’ve generated the graph under the assumption that `Medium > 0`, which results in (very good, medium) > (very good). Clearly the antecedent doesn’t hold if `Medium < 0`, in which case the graph would go the other direction, as you point out.

First, I think that what you call lattice order is more like partial order, unless you can also show that a join always exists. The pictures have it, but I am not convinced that they constitute a proof.

It looks like all you have “shown” is that if you embed some partial order into a total order, then you can map this total ordering into integers. I am not a mathematician, but this seems rather trivial.

I agree, I didn’t show this. It’s not hard, but it’s a bit of writing to prove that (x1x2 \/ y1y2)=(x1\/y1)(x2\/y2) which inductively shows that this is an l-group.

It’s not a total order, nor is it true that all totally ordered groups can be embedded into Z (consider R^2, lexically ordered, for example. Heck, even R itself can’t be mapped to Z since it’s uncountable!). So not only would this be a non-trivial proof, it would be an impossible one :-)

Not all, just countable...

Z^2 lexically ordered is countable but can’t be embedded in Z.

It seems like your intuition is shared by a lot of LW though—people seem to think it’s “obvious” that these restrictions result in total utilitarianism, even though it’s actually pretty tricky.

Well, yes. The badness of inequality will show up in the utilities. Once you’ve mapped states of society onto utilities, you’ve already taken it into account. You still need an additional empirical argument to say anything interesting (for example, that a society with an equal distribution of wealth is not as good as a society with slightly more total wealth in an inequitable distribution; that may or may not be what you had in mind, but it seemed worth clarifying).

Sure. This is probably not a majority opinion on LW, but there are a lot of people who believe that equality is good even beyond utility maximization (cf. Rawls). That’s what I was trying to get at when I said: