Forgive me for posting on such an old topic, but I’ve spent the better part of the last few days thinking about this and had to get my thoughts together somewhere. But after some consideration, I must say that I side with the “speckers” as it were.
Let us do away with “specks of dust” and “torture” notions in an attempt to avoid arguing the relative value one might place on either event (i.e. “rounding to 0/infinity”), and instead focus on the real issue. Replace torture with “event A” as the single most horrific event that can happen to an individual. Replace dust motes with “event B” as the least inconvenience that can still be considered an inconvenience to an individual.
Similarly, let us do away with the notion of reasoning about a googol, or 3^^^3, or 3^^^^3, as our brains treat each of these numbers as just a featureless conglomeration, regardless of how well we want to pretend we understand the differences in magnitude. Instead, replace this with “n”, with “n” being an arbitrarily large number.
The question then becomes: Is it better to subject a single individual to event A, or n individuals to event B?
The utilitarian argument supposes that this question can be equivalently stated as such: Is the total disutility of subjecting n individuals to event B greater than that of subjecting a single individual to event A?
This seems reasonable enough, given a sufficiently “good” definition of utility. Let us assume that these statements are equivalent and proceed from here.
Let “x” be the disutility value of event A, and “y” be the disutility value of event B. How can we compare x and y? Intuitively, it is obvious that enough additions of “y” would at the very least approach “x”. That is, if you were to subject a single individual to event B often enough and for long enough, this would approach being as “bad” as subjecting that same individual to event A. Let “k” be such a number of additions, however large it may be. Thus we have x ~= ky, or y ~= x/k.
But how exactly do we measure x? Does it even have a fixed value? Is the definition of event A even consistent across all individuals (or even the definition of event B for that matter)? Perhaps, perhaps not. But interestingly enough, we’ve already found a reasonable “fixed” definition for event B. This is simply the most trivial inconvenience an individual can be subjected to which, if repeated enough times, would be approximately equivalent to subjecting them to event A.
So let’s choose a scale where event A has disutility 1 for a given individual. Now event B has disutility 1/k for that same individual. The scale may change relative to an individual, but let’s make the assumption that this variance is massively dwarfed by the magnitude of “k”, which again seems reasonable. In other words, the difference between the worst that could happen and the most trivial bad thing that can happen is so great that any variance in an individual’s definition for the worst event is trivial in comparison. “Sacred” vs “mundane”, if you will. At least now we’re only working in one variable.
We also want to compare the utility of both situations over a population. That is, is it better for a single individual to have a disutility value of 1, or for n individuals to have disutility 1/k? And this is where a second problem arises. How exactly does one distribute a utility value across a population? It might be tempting to assume it just divides evenly into the population and is additive across individuals. For instance, one person stubbing their toe twice in a day is approximately equivalent to two people each stubbing their toe once. It may hold for small scale scenarios that we are used to dealing with, but I’m not certain it holds with larger scales.
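To make the additive assumption concrete, here is a minimal sketch of what it implies. The function name and the sample values of n and k are my own illustrative assumptions; nothing in the argument fixes them.

```python
from fractions import Fraction


def additive_total(n: int, per_person: Fraction) -> Fraction:
    """Total disutility if disutility simply adds across n individuals."""
    return n * per_person


# Hypothetical value: k repetitions of event B roughly equal one event A.
k = 10**6
event_a = Fraction(1)      # disutility of event A on the chosen scale
event_b = Fraction(1, k)   # disutility of event B on the same scale

# Compare the crowd's total against the single individual for a few sizes.
for n in (10**3, 10**6, 10**9):
    total_b = additive_total(n, event_b)
    print(n, total_b, total_b > event_a)
```

Under this assumption the crowd outweighs the individual exactly when n > k. Whether the assumption itself survives at large scales is the open question.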
One questionable example is that of wealth distribution amongst a nation. This is a very complex and nuanced subject, but the underlying issues can be expressed in relatively simple terms. Assume utility here is directly proportional to wealth. If we want to maximize the average wealth of the nation, we could have a plethora of distributions where everyone is in poverty except for a small percentage, who have vast expanses of wealth. This is an entirely valid solution—if we are trying only to maximize average utility.
But certainly, a good measure of utility should also take into account the status of each person with respect to the whole. Few would argue that a system where over half the population exists in poverty is better than a system with almost no poverty. But again, perhaps it is desirable to have some disparity in such a distribution, to entice people to work harder and to contribute more to society as a whole with the prospect of increasing their personal wealth. Perhaps this lends to a more sustainable system.
It is for this reason that I believe a “good” function should not only attempt to maximize the average utility, but also minimize the (negative) deviation away from the group average for each individual—of course taking into account other constraints regarding sustainability, stability, etc.
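As a rough illustration of that idea, the sketch below scores a distribution by its average minus a penalty on how far people fall below that average. The scoring rule, the penalty weight, and both sample distributions are assumptions made up for this example, not a claim about what the “good” function actually is.

```python
from statistics import mean
from typing import Sequence


def shortfall_adjusted_score(utilities: Sequence[float], penalty: float = 1.0) -> float:
    """Average utility minus a penalty on the average shortfall below that average.

    Hypothetical scoring rule: only negative deviations from the mean are penalized.
    """
    avg = mean(utilities)
    avg_shortfall = mean(max(0.0, avg - u) for u in utilities)
    return avg - penalty * avg_shortfall


# Two made-up distributions with the same average wealth.
concentrated = [1.0] * 9 + [91.0]   # nine people in poverty, one very rich
even = [10.0] * 10                  # everyone sits at the average

print(mean(concentrated), mean(even))
print(shortfall_adjusted_score(concentrated))
print(shortfall_adjusted_score(even))
```

Both distributions look identical if only the average is maximized, but the adjusted score ranks the even one higher. Constraints like sustainability and stability are not modeled here.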
So let us now consider a “good” utility function that takes as parameters (1) the population size and (2) a list of the average utility scores of each individual in that population. Since it’s all the same in this example, we’ll just represent (2) as a single number. Let us call this function F. We can restate the question entirely in mathematical terms.
Is F(n, 1/k) > F(1, 1)?
Perhaps. Perhaps not. It depends mostly on what utility function would be considered “good” in this instance. What no one would disagree on, however, is that:
F(n, k/k) = F(n, 1) >> F(1, 1) for n > 1.
Also, F(n, (k-1)/k) >> F(1, 1).
And you can continue this pattern onwards. Consider the general equation:
F(n, m/k) >? F(1, 1), for 1 ≤ m ≤ k.
There is certainly a “breaking point” for which m is large enough for the general well-being to eclipse the individual.
In other words, there is certainly a point where subjecting each individual in an arbitrarily large population to a massively excessive amount of trivial inconveniences is morally worse than subjecting a single individual to a horrific event. But where is this “breaking point?”
My conclusion is that it depends on the size of the population, the extent to which each individual can reasonably bear excessive trivial burdens, and what criteria are used for the function mapping utility to a population.
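To show how the breaking point moves with those three factors, here is a sketch under one assumed form of F. I take F(n, d) = n * d**p with an integer p >= 1, so F(1, 1) = 1; p = 1 is the plain additive case, and larger p weights concentrated suffering more heavily. This form of F, the helper name, and the sample numbers are all hypothetical.

```python
from typing import Optional


def breaking_point(n: int, k: int, p: int) -> Optional[int]:
    """Smallest m in 1..k with n * (m/k)**p > 1, or None if no such m exists.

    Assumes the hypothetical aggregate F(n, d) = n * d**p with integer p >= 1.
    The comparison is done exactly in integers: n * m**p > k**p.
    """
    for m in range(1, k + 1):
        if n * m**p > k**p:
            return m
    return None


k = 1000  # hypothetical: 1000 repetitions of event B approximate event A
for n, p in [(10**6, 1), (10**6, 3), (10**12, 3), (1, 2)]:
    print(n, p, breaking_point(n, k, p))
```

With the same population, raising p pushes the breaking point to a larger m; raising n pulls it back down. With n = 1 there is no breaking point at all, since m = k only ties with F(1, 1).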
I personally find it very hard to swallow that it would ever be a better idea to allow one individual to suffer immeasurably than to subject an immeasurable population to trivial suffering. I would suspect the “breaking point” in the example given would be somewhere between having everyone in the population stub a toe, and having everyone in the population lose a toe.
A relatable example would be distributing stress in a building. It would generally be a better design to allow for each individual piece in the building to be stressed trivially to compensate for one piece bearing a disproportionate load, than to allow for any given piece to break away as necessary to prevent the rest from bearing a trivial load. Certainly there is a point, however, at which it becomes undesirable to unnecessarily compromise the overall structure (or perhaps just to introduce unacceptable risks) for the sake of a single piece. Pieces are ultimately replaceable. The whole structure, however, is not.
Is this an instance of me being irrational due to some form of scale insensitivity? Possibly. But to err is human, and I would rather err on the side of compassion than on that of cold calculation. I would also say that some caution should be taken when working with large scales and with continuums. It may be just as irrational to disregard our intuitions in the face of the unknown as to cling blindly to them.
So after a lot of thought, and about 5 months spent reading articles on this site, I think I can see the big picture a little more clearly now. Imagine having a really large collection of grains of sand that are all suspended in the air in the shape of a flat disk. Imagine too, that it takes energy to move any single grain in the collection upwards or downwards, but once a grain is moved, it stays put unless moved again.
Just conceptually let grains of sand represent people and grain movement upwards/downwards represent utility/disutility.
What Eliezer is arguing is that, assuming it takes the same amount of energy to move each individual grain of sand, then clearly it takes far less energy to move a single grain of sand very far downward than to move every grain of sand just slightly downward.
What I initially objected to, and what I was trying to intuit through in my first post, is that perhaps it is the case that the energy required to move a single grain of sand is not constant. Perhaps it increases with distance from the disk. I still hold to this objection.
Even if so, it is certainly a valid conclusion to draw that moving a single grain far enough downwards requires less energy than moving every grain slightly downwards. Increasing the number of grains of sand certainly affects this. And no matter what the growth factor may be on the nonlinear amount of energy required to move a single grain very far from its starting point, it is still finite. And you can add enough grains of sand so that the multiplicative factor of moving everything slightly downwards dwarfs the nonlinear growth factor.
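The point that any finite growth factor can eventually be outnumbered can be checked with a small sketch. The three energy laws below and the distance of 50 units are assumptions chosen only for illustration; distances are measured in units of one “speck” displacement.

```python
import math
from fractions import Fraction
from typing import Callable


def grains_needed(energy: Callable[[int], int], small: int, large: int) -> int:
    """Smallest n such that n * energy(small) > energy(large).

    Assumes energy(small) > 0 and energy(large) is finite.
    """
    ratio = Fraction(energy(large), energy(small))
    return math.floor(ratio) + 1


# Hypothetical energy laws for moving one grain a distance d from the disk.
def linear(d: int) -> int:
    return d


def cubic(d: int) -> int:
    return d**3


def exponential(d: int) -> int:
    return 2**d - 1


for law in (linear, cubic, exponential):
    print(law.__name__, grains_needed(law, small=1, large=50))
```

The required number of grains climbs steeply as the energy law gets steeper, but it is always some finite number, which is all the argument needs.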
Thus, given enough people (and I do stress, enough people), it may be morally worse to subject them all to having a single dust speck enter their eye for a brief moment than to subject a single individual to torture for 50 years.
It’s just that our intuition says that for any scale our minds are even close to capable of reasoning about, exponential/super-exponential functions (even with a tiny starting value) greatly dwarf multiplicative scaling functions.
But this intuition cannot be accurate for scales larger than our minds are capable of reasoning about.
I understand now: “Shut up and multiply.”
To flip the question on its head:
Would it be morally acceptable for an immeasurably large population of individuals to allow a single individual to be mercilessly tortured if it would spare the entire population some trivial inconvenience?
I think that example triggers our “no, it would be immoral” intuition, because an immoral population would make the choice against the trivial inconvenience with even greater ease. So, their saying “yes, do please allow some individual to be mercilessly tortured” functions as Bayesian evidence in support of their immorality.
But if you had a large population of people decide between a trivial inconvenience for a different large population of people vs a single individual selected from their own midst to be mercilessly tortured, I’m guessing that the moral intuition would be the exact opposite, and it would feel immoral for this population to condemn a different large population to such an inconvenience just to benefit one of their own.
So you’re saying it is potentially immoral if the group themselves decide to make the decision, but potentially moral if an outsider of the group makes the exact same decision?
No, I’m not saying that. Don’t start with the ill-defined concept of “moral” and “immoral”—start from the undisputed reality of the matter that people pass moral judgements on actions they hear about.
So I’m saying that when Alice hears of
X: group A choosing to sacrifice one of their own rather than inconvenience group B
Alice is likely to pass a different moral judgement of that choice than if Alice hears of
Y: group A choosing to sacrifice a member of group B rather than inconvenience themselves.
Even though utilitarianism would argue that actions X and Y are equally moral taken by themselves, actions X and Y provide different evidence about whether group A is really acting on moral principles. So if the evolutionary purpose for our moral intuitions is to e.g. identify people as villains or not, action Y triggers our moral intuitions negatively and action X triggers our moral intuitions positively. Because at a deeper level the real purpose of judging the deed is to judge the doer.