I’m a preference utilitarian, and as far as I can tell there are no real problems with preference utilitarianism (I’ve heard many criticisms, and in my experience none of them hold up to scrutiny), but I just noticed something concerning. Summary: desires that aren’t active in the current world diminish the weight of the client’s other desires, which seems difficult to justify, and/or my normalisation method is incomplete.
Background on normalisation: utility functions aren’t directly comparable, because the vertical offset and scale of an agent’s utility function are meaningless. You can shift or rescale them without changing the agent’s behaviour in any way (they wouldn’t even know you were doing it)[1], so those numbers don’t correspond to anything real, and at minimum you need to strip them out somehow before doing the aggregation. (We don’t have to normalise offset, since a constant shift in one client’s utility shifts every aggregate total by the same amount and doesn’t change which outcome wins.)
Generally, I’ve been assuming that the right normalisation method is to choose a well-informed outcome distribution (say, the beliefs of the institution or superintelligence doing the optimisation). That distribution assigns a weight to each of the worlds/histories that the utility function rates, which lets us integrate the utility function and get the area under the curve, i.e. the expected utility. We then normalise the utility function’s scale by dividing everything by that area, so that each client’s expected utility comes out to 1.
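To pin the scheme down for myself, here’s a minimal sketch in code. This is my own toy framing, nothing canonical: `normalise`, `aggregate` and `outcome_dist` are just illustrative names, worlds are strings, and the distribution is a finite dict of probabilities.

```python
from typing import Callable, Dict

def normalise(u: Callable[[str], float], outcome_dist: Dict[str, float]) -> Callable[[str], float]:
    """Rescale u so that its expectation under outcome_dist is 1."""
    expected_u = sum(p * u(world) for world, p in outcome_dist.items())
    return lambda world: u(world) / expected_u

def aggregate(utility_functions, outcome_dist, world):
    """Total normalised utility assigned to a candidate world."""
    return sum(normalise(u, outcome_dist)(world) for u in utility_functions)
```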
But I just noticed something really weird with that.
Say you have a population of people who all just want as much food as they can get, with diminishing returns. One member of the population also really wants a pet elephant, but it would cost a large amount of food to give them one, so the optimiser decides that there shall be no elephants. But because of the utility normalisation, that desire for an elephant, which didn’t even get satisfied, have any impact on the world, or impose costs on anyone else, still dilutes everything else: by the conservation of total caring, the system now weighs their caring about food less, and they’ll be allocated significantly less food than the other citizens who didn’t want an elephant.
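A toy calculation of the effect, under assumptions I’m inventing purely for illustration: utility from food is sqrt(food), the elephant adds a flat +100, and the normalising distribution still gives elephant-worlds 10% probability.

```python
from math import sqrt

TOTAL_FOOD = 100.0

def u_food(food):
    return sqrt(food)   # diminishing returns in food

# Normalisers: expected utility under the "well informed" distribution,
# where both citizens expect roughly 50 units of food.
z_a = u_food(50.0)                                        # A only cares about food
z_b = 0.9 * u_food(50.0) + 0.1 * (u_food(50.0) + 100.0)   # B also wants an elephant, in 10% of worlds

# Maximise u_food(x)/z_a + u_food(TOTAL_FOOD - x)/z_b over A's share x.
# The first-order condition gives x = TOTAL_FOOD * z_b**2 / (z_a**2 + z_b**2).
x_a = TOTAL_FOOD * z_b**2 / (z_a**2 + z_b**2)
print(f"A gets {x_a:.1f} food, B gets {TOTAL_FOOD - x_a:.1f}")
# A gets ~85 food, B gets ~15: B's unrealised elephant desire has crowded out
# the weight the system puts on their food desire.
```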
Wait… isn’t this solved by the normalisation itself?
From another perspective, clients wouldn’t be punished for desires that aren’t realised: since the optimiser knows it’s ultimately not going to decide to produce an elephant, the client’s extreme utility spike in worlds where they get an elephant is given zero weight in the normalisation and effectively ignored, meaning their utility function after normalisation will be the same as anyone else’s!
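Rerunning the same toy numbers from above, but normalising under the optimiser’s own forecast (which puts zero probability on elephant-worlds), the asymmetry disappears:

```python
from math import sqrt

TOTAL_FOOD = 100.0
u_food = sqrt   # same toy utility-from-food as before

# The optimiser's forecast gives elephant-worlds zero probability, so the
# +100 spike contributes nothing to B's normaliser.
z_a = u_food(50.0)
z_b = 1.0 * u_food(50.0) + 0.0 * (u_food(50.0) + 100.0)   # identical to z_a

x_a = TOTAL_FOOD * z_b**2 / (z_a**2 + z_b**2)
print(f"A gets {x_a:.1f} food, B gets {TOTAL_FOOD - x_a:.1f}")   # 50.0 each
```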
So maybe this is just going to be another one of the many counterarguments to baseless objections that I’m going to be too stroppy to write up, because there has never been an actual problem, and the counterargument is always so much more complicated than the bullshit it was trying to address.
Though, there’s another weird thing here. The counterargument is at least incomplete.
If you lived in a deterministic world, then only one outcome/world/history has any probability mass. Everyone’s utility function gets reduced to a single scalar on that one possible world, which is then scaled to 1. Which would mean the optimiser could just do whatever it wants: because whatever it does becomes the only possible outcome, it would always maximise utility after the normalisation, as long as it knows what it’s going to do in advance.
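The degenerate case in code, just to stare at it (same toy framing as the sketches above):

```python
# If the normalising distribution puts all of its mass on the one world the
# optimiser already knows it will bring about, the normaliser just IS the raw
# utility of that world, so everyone's normalised utility there is
# u(w) / u(w) = 1, whatever world the optimiser picked.
def normalised_utility_at_chosen_world(u, chosen_world):
    expected_u = u(chosen_world)           # the whole "distribution" is this one world
    return u(chosen_world) / expected_u    # always 1.0 (assuming u(chosen_world) != 0)

print(normalised_utility_at_chosen_world(lambda w: 3.7, "whatever the optimiser felt like doing"))  # 1.0
```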
Huh. Well. I’m gonna have to think about that.
I think taking the whole world as the optimization scope is a major issue with how expected utility theory is applied to metaethics. It makes more sense if you treat it as a theory of building machines piecemeal, each action adding parts to some machine, with the scope of expected utility (the consequences of actions) being a particular machine (the space of all possible machines in some place, or those made of particular parts).
Coordination is then a study of how multiple actors can coordinate assembly of the same shared machine from multiple directions simultaneously. The need to aggregate preferences is a comment on how trying to build different machines, while actually building a single machine, won’t end coherently. But aggregating preferences is mostly necessary for individual projects rather than globally: you don’t need to aggregate others’ preferences about yourself, if you yourself are a self-built machine with only a single builder. Similarly, you don’t want too many builders for your own home. Shared machines are more of a communal property; there should be boundaries defining the stakeholders that get to influence the preference over which machine is being built in a particular scope.
I’d be able to understand where this was coming from if y’all were mostly talking about population ethics, but there was no population ethics in the example I’m discussing. (Note, the elephant wasn’t a stakeholder. A human can love an elephant, but a human would not lucidly give an elephant unalloyed power, for an elephant probably desires things that would be fucked up to a human, such as the production of bulls in musth, or infanticide at much higher rates.)
And I’d argue that population ethics shouldn’t really be a factor. In humans, new humanlike beings should be made stakeholders to the extent that the previous stakeholders want them to be. The current stakeholders (Californians) do prefer for new humans to be made stakeholders, so they keep trying to bake that into their definition of utilitarianism, but the fact that they want it means they don’t need to put it in there.
But if it’s not about population ethics then it just seems to me like you’re probably giving up on generalizability too early.
The point is that people shouldn’t be stakeholders of everything, let alone to an equal extent. Instead, particular targets of optimization (much smaller than the whole world) should have far fewer agents with influence over their construction, and it’s only in these contexts that preference aggregation should be considered. When starting with a wider scope of optimization with many stakeholders, it makes more sense to start by dividing it into smaller parts, each a target of optimization with fewer stakeholders, optimized under preferences aggregated differently from how that settles for the other parts. Expected utility theory makes sense for such smaller projects just as much as it does for the global scope of the whole world, but it breaks normality less when applied narrowly like that than when we try to apply it to the global scope.
The elephant might need to be part of one person’s home, but not a concern for anyone else, and not subject to anyone else’s preferences. That person would need to be able to afford an elephant, though, to construct it within the scope of their home. Appealing to others’ preferences about the would-be owner’s desires would place the would-be owner within the others’ optimization scope, make the would-be owner a project that others are working on, make them stakeholders in the would-be owner’s self, rather than letting the owner remain a more sovereign entity. If you depend on the concern of others to keep receiving the resources you need, then you are receiving those resources conditionally, rather than allocating the resources you have according to your own volition. Much better for others to contribute to an external project you are also working on, according to what that project is, rather than according to your desires about it.
But not preserving normality is the appeal :/
As an example, normality means a person can, e.g., create an elephant within their home, and torture it. Under preference utilitarianism, the torture of the elephant upsets the values of a large number of people; it’s treated as a public bad and has to be taxed as such. Even when we can’t see it happening, it’s still reducing our U, so a boundaryless prefu optimizer would go in there and say to the elephant torturer “you’d have to pay a lot to offset the disvalue this is creating, and you can’t afford it, so you’re going to have to find a better outlet (how about a false elephant who only pretends to be getting tortured)”.
But let’s say there are currently a lot of sadists and they have a lot of power. If I insist on boundaryless aggregation, they may veto the safety deal, so it just wouldn’t do. I’m not sure there are enough powerful sadists for that to happen, political discourse seems to favor publicly defensible positions, but [looks around] I guess there could be. But if there were, it would make sense to start to design the aggregation around… something like the constraints on policing that existed before the aggregation was done. But not that exactly.
I notice it becomes increasingly impractical to assess whether a preference had counterfactual impact on the allocation. For instance, if someone had a preference for there to be no elephants, and we get no elephants, partially because of that but largely because of the food costs, should the person who had that preference receive less food for having already received an absence of elephants?
So I checked in on a previous post about utility normalisation. Normalising by the outcomes expected under a random dictator would definitely work better than normalising by the outcomes determined by the optimiser.
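Roughly, as I understand that proposal (this is my own sketch of it, not a quote): each utility function gets normalised by its expectation under the mixture of worlds you’d get by handing the decision to a uniformly random client, rather than under the optimiser’s own forecast.

```python
def random_dictator_normaliser(u, dictator_worlds):
    """dictator_worlds: the world each client would pick if they alone decided,
    one entry per client, each treated as equally likely."""
    p = 1.0 / len(dictator_worlds)
    return sum(p * u(world) for world in dictator_worlds)

# Each u is then divided by this normaliser before aggregation, as before.
```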
But it still seems clearly wrong in its own way. Random dictator was never the BATNA. So this is also optimising for a world or distribution of worlds that isn’t real.