That seems like a sensible way to set up the no-trade situation. Presumably the connection to trade is via some theorem that trade results in Pareto-optimal situations, thereby making comparative advantage applicable.
But I still wonder what the exact theorem is.
Then, if you want to describe the Pareto frontier that maximizes the amount of goods produced, it involves each person producing goods for which they have a favorable ratio of how much of that good they can produce vs. how much of the other goods being produced they could produce instead.
What do you mean by “favorable”? Is there some threshold?
What do you mean by “involves each person producing”? Does it mean that they’ll exclusively produce such goods? Or does it mean they’ll produce at least some of such goods?
Correction: I now see that my formulation turns the question of completeness into a question of transitivity of indifference. An “incomplete” preference relation should not be understood as one which allows strict preferences to go in both directions (which is how I interpreted it above), but rather as a preference relation in which the ≤ relation (and hence the ∼ relation) is not transitive.
In this case, we can distinguish between ~ and “gaps”, IE, incomparable A and B. ~ might be transitive, but this doesn’t bridge across the gaps. So we might have a preference chain A>B>C and a chain X>Y>Z, but not have any way to compare between the two chains.
In my formulation, which lumps together indifference and gaps, we can’t have this two-chain situation. If A~X, then we must have A>Y, since X>Y, by transitivity of ≥.
So what would be a completeness violation in the Wikipedia formulation becomes a transitivity violation in mine.
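(To make the two-chain picture concrete, here is a tiny sketch, with labels mine and ~ defined, as above, as the absence of strict preference in either direction:)

```python
# Two strict-preference chains with no comparisons between them:
# A > B > C and X > Y > Z (transitive closure included).
strict = {("A", "B"), ("B", "C"), ("A", "C"),
          ("X", "Y"), ("Y", "Z"), ("X", "Z")}

def indifferent(a, b):
    """A ~ B iff neither A > B nor B > A (lumping gaps in with indifference)."""
    return (a, b) not in strict and (b, a) not in strict

print(indifferent("B", "X"))  # True: a "gap" across the chains
print(indifferent("X", "A"))  # True: another gap
print(indifferent("B", "A"))  # False: A > B, so ~ is not transitive here
```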
But notice that I never argued for the transitivity of ~ or ≥ in my comment; I only argued for the transitivity of >.
I don’t think a money-pump argument can be offered for transitivity here.
However, I took a look at the paper by Aumann which you cited, and I’m fairly happy with the generalization of VNM therein! Dropping uniqueness does not seem like a big cost. This seems like more of an example of John Wentworth’s “boilerplate” point, rather than a counterexample.
This was helpful, but I’m still somewhat confused. Conspicuously absent from your post is an outright statement of what comparative advantage is—particularly, what the concept and theorem is supposed to be in the general case with more than two resources and more than two agents.
The question is: who and where do I order to grow bananas, and who and where do I order to build things? To maximize construction, I will want to order people with the largest comparative advantage in banana-growing to specialize in banana-growing, and I will want to order those bananas to be grown on the islands with the largest comparative advantage in banana-growing. (In fact, this is not just relevant to maximization of construction—it applies to Pareto-optimal production in general.)
Could you elaborate on this by providing the general statement rather than only the example?
Before reading your post, I had in mind two different uses for the concept:
Comparative advantage is often used as an argument for free trade. Dynomight’s post seems to provide a sufficient counterargument to this, in its example illustrating how with more than 2 players, opening up a trade route may not be a Pareto improvement (may not be a good thing for everyone).
Comparative advantage is sometimes used in career advice, EG, “find your comparative advantage”. This is the case I focus on in the comment I linked to illustrating my confusion. What advice is actually offered? Are agents supposed to produce and sell things which they have a comparative advantage in? Not so much. It seems that advice coming from the concept is actually extremely weak in the case of a market with more than two goods.
Your post gave me a third potential application, namely, a criterion for when trade may occur at all. This expanded my understanding of the concept considerably. It’s clear that where no comparative advantage exists, no trade makes sense. A country that’s bad at producing everything might want to buy stuff from a country that’s just 10x better, but to do so they’d at least need a comparative advantage in producing money (which doesn’t really make sense; money isn’t something you produce). (Or putting it a different way: their money would soon be used up.)
But then you apply the concept of comparative advantage to a case where there isn’t any trade at all. What would you give as your general statement of the concept and the theorem you’re applying?
I happened upon this old thread, and found the discussion intriguing. Thanks for posting these references! Unless I’m mistaken, it sounds like you’ve discussed this topic a lot on LW but have never made a big post detailing your whole perspective. Maybe that would be useful! At least I personally find discussions of applicability/generalizability of VNM and other rationality axioms quite interesting.
Indeed, I think I recently ran into another old comment of yours in which you made a remark about how Dutch Books only hold for repeated games? I don’t recall the details now.
I have some comments on the preceding discussion. You said:
It would be rather audacious to claim that this is true for each of the four axioms. For instance, do please demonstrate how you would Dutch-book an agent that does not conform to the completeness axiom!
For me, it seems that transitivity and completeness are on an equally justified footing, based on the classic money-pump argument.
Just to keep things clear, here is how I think about the details. There are outcomes. Then there are gambles, which we will define recursively. An outcome counts as a gamble for the sake of the base case of our recursion. For gambles A and B, pA+(1-p)B also counts as a gamble, where p is a real number in the range [0,1].
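(For concreteness, here’s a minimal sketch of that recursive definition; the names are mine.)

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Outcome:
    name: str  # an outcome counts as a gamble: the base case

@dataclass(frozen=True)
class Mix:
    p: float     # a real number in [0, 1]
    a: "Gamble"  # the compound gamble pA + (1-p)B
    b: "Gamble"

Gamble = Union[Outcome, Mix]

# e.g., a 50/50 gamble between an outcome and another gamble:
g = Mix(0.5, Outcome("apple"), Mix(0.25, Outcome("pear"), Outcome("fig")))
```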
Now we have a preference relation > on our gambles. I understand its negation to be ≤: saying ¬(A>B) is the same thing as A≤B. The indifference relation, A∼B, is just the same thing as (A≤B)&(B≤A).
This is different than the development on wikipedia, where ~ is defined separately. But I think it makes more sense to define > and then define ~ from that. A>B can be understood as “definitely choose A when given the choice between A and B”. ~ then represents indifference as well as uncertainty like the kind you describe when you discuss bounded rationality.
From this starting point, it’s clear that either A<B, or B<A, or A~B. This is just a way of saying “either A<B or B<A or neither”. What’s important about the completeness axiom is the assumption that exactly one of these holds; this tells us that we cannot have both A<B and B<A.
But this is practically the same as circular preferences A<B<C<A, which transitivity outlaws. It’s just a circle of length 2.
The classic money-pump against circularity is that if we have circular preferences, someone can charge us for making a round trip around the circle, swapping A for B for C for A again. They leave us in the same position we started, less some money. They can then do this again and again, “pumping” all the money out of us.
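(A toy version of the pump, with made-up goods and fee, assuming the agent accepts any swap to something it strictly prefers:)

```python
# Circular strict preferences: A < B < C < A.
prefers = {("B", "A"), ("C", "B"), ("A", "C")}  # (offered, held): offered is preferred

holding, money, fee = "A", 10.0, 1.0
for offered in ["B", "C", "A", "B", "C", "A"]:
    if (offered, holding) in prefers and money >= fee:
        money -= fee          # pay to swap up the (circular) chain
        holding = offered

print(holding, money)  # "A" 4.0: two round trips later, same good, less money
```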
Personally, I find this argument extremely metaphysically weird, for several reasons.
The money-pumper must be God, to be able to swap arbitrary A for B, and B for C, etc.
But furthermore, the agent must not understand the true nature of the money-pumper. When God asks about swapping A for B, the agent thinks it’ll get B in the end, and makes the decision accordingly. Yet God then proceeds to ask a new question, offering to swap B for C. So God doesn’t actually put the agent in universe B; rather, God puts the agent in “B+God”, a universe with the possibility of B, but also a new offer from God, namely, to move on to C. So God is actually fooling the agent, making an offer of B but really giving the agent something different from B. Bad decision-making should not count against the agent if the agent was misled in such a manner!
It’s also pretty weird that we can end up “in the same situation, but with less money”. If the outcomes A,B,C were capturing everything about the situation, they’d include how much money we had!
I have similar (but less severe) objections to Dutch-book arguments.
However, I also find the argument extremely practically applicable, so much so that I can excuse the metaphysical weirdness. I have come to think of Dutch-book and money-pump arguments as illustrative of important types of (in)consistency rather than literal arguments.
OK, why do I find money-pumps practical?
Simply put, if I have a loop in my preferences, then I will waste a lot of time deliberating. The real money-pump isn’t someone taking advantage of me, but rather, time itself passing.
What I find is that I get stuck deliberating until I can find a way to get rid of the loop. Or, if I “just choose randomly”, I’m stuck with a yucky dissatisfied feeling (I have regret, because I see another option as better than the one I chose).
This is equally true of three-choice loops and two-choice loops. So, transitivity and completeness seem equally well-justified to me.
Stuart Armstrong argues that there is a weak money pump for the independence axiom. I made a very technical post (not all of which seems to render correctly on LessWrong :/) justifying as much as I could with money-pump/Dutch-book arguments, and similarly got everything except continuity.
I regard continuity as not very theoretically important, but highly applicable in practice. IE, I think the pure theory of rationality should exclude continuity, but a realistic agent will usually have continuous values. The reason for this is again because of deliberation time.
If we drop continuity, we get a version of utility theory with infinite and infinitesimal values. This is perfectly fine, has the advantage of being more general, and is in some sense more elegant. To reference the OP, continuity is definitely just boilerplate; we get a nice generalization if we want to drop it.
However, a real agent will ignore its own infinitesimal preferences, because it’s not worth spending time thinking about that. Indeed, it will almost always just think about the largest infinity in its preferences. This is especially true if we assume that the agent places positive probability on a really broad class of things, which again seems true of capable agents in practice. (IE, if you have infinities in your values, and a broad probability distribution, you’ll be Pascal-mugged—you’ll only think of the infinite payoffs, neglecting finite payoffs.)
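(To illustrate the infinite/infinitesimal point: dropping continuity permits lexicographic values, which Python’s tuple ordering sketches neatly; the magnitudes here are invented.)

```python
# Utility as (higher-tier, lower-tier), compared lexicographically: the lower
# tier only matters on ties, so an agent whose beliefs put any weight on the
# higher tier will deliberate about that tier alone.
u_infinite_payoff = (1, -100)   # "Pascal's mugging" branch: huge tier-1 value
u_ordinary_payoff = (0, 100)    # merely finite value

print(u_infinite_payoff > u_ordinary_payoff)  # True: the finite tier is ignored
print((0, 3) > (0, 2))                        # the finite tier decides only on ties
```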
So all of the axioms except independence have what appear to me to be rather practical justifications, and independence has a weak money-pump justification (which may or may not translate to anything practical).
I once had a discussion with Scott G and Eli Tyre about this. We decided that the “real thing” was basically where you should end up in the complicated worker/job optimization problem, and there were more or less two ways to try and approximate it:
Supposing everyone else has already chosen their optimal spot, what still needs doing? What can I best contribute? This is sorta easy, because you just look around at what needs doing, combine this with what you know about how capable you are at contributing, and you get an estimate of how much you’d contribute in each place. Then you go to the place with the highest number. [modulo gut feelings, intrinsic motivation, etc]
Supposing you choose first, how could everyone else move around you to create an optimal configuration? You then go do the thing which implies the best configuration. This seems much harder, but might be necessary for people who provide a lot of value (and therefore what they do has a big influence on what other people should do), particularly in small teams where a near-optimal reaction to your choice is feasible.
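(A toy rendering of the contrast, with random numbers standing in for how much each person would contribute in each role; the “real thing” is the jointly optimal assignment, which scipy can compute directly.)

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)
value = rng.uniform(0, 10, size=(5, 5))  # value[i, j]: person i's contribution in role j

# Approximation 1: everyone else already placed; each person takes the best open role.
taken, greedy_total = set(), 0.0
for person in range(5):
    best = max((j for j in range(5) if j not in taken), key=lambda j: value[person, j])
    taken.add(best)
    greedy_total += value[person, best]

# The "real thing": the optimal configuration (Hungarian algorithm).
rows, cols = linear_sum_assignment(value, maximize=True)
print(greedy_total, value[rows, cols].sum())  # greedy is at most the optimum
```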
OK. It seems there are results for more than 2 goods, but the results are quite weak:
Thus, if both relative prices are below the relative prices in autarky, we can rule out the possibility that both goods 1 and 2 will be imported—but we cannot rule out the possibility that one of them will be imported. In other words, once we leave the two-good case, we cannot establish detailed predictive relations saying that if the relative price of a traded good exceeds the relative price of that good in autarky, then that good will be exported by the country in question. It follows that any search for a strong theorem along the lines of our first proposition earlier is bound to fail. The most one can hope for is a correlation between the pattern of trade and differences in autarky prices.
Dixit, Avinash; Norman, Victor (1980). Theory of International Trade: A Dual, General Equilibrium Approach. Cambridge: Cambridge University Press. p. 8
Here’s something I don’t get about comparative advantage.
The implied advice, as far as I understand it, is to check which good you have a comparative advantage in producing, and offer that good to the market.
But suppose that there are a lot more goods and a lot more participants in the market.
For any one individual, given fixed prices and supply of everyone else, it sounds like we can formulate the production and trade strategy as a linear programming problem:
We have some maximum amount of time. That’s a linear constraint.
We can allocate time to different tasks.
The output of the tasks are assumed to be linear in time.
The tasks produce different goods.
These goods all have different prices on the market.
We might have some basic needs, like the 10 bananas and 10 coconuts. That’s a constraint.
We might also have desires, like not working, or we might desire some goods. That’s our linear programming objective.
OK. So we can solve this as a linear program.
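(Here’s a minimal sketch with invented numbers, using scipy’s linprog: three tasks, each producing one good at a linear rate, a 10-hour time budget, and the banana/coconut needs as constraints.)

```python
import numpy as np
from scipy.optimize import linprog

rate  = np.array([5.0, 4.0, 2.0])   # bananas, coconuts, huts produced per hour
price = np.array([1.0, 1.5, 10.0])  # market prices (fixed, by assumption)

c = -(rate * price)                 # maximize revenue = minimize its negative
A_ub = [[1, 1, 1],                  # total hours <= 10
        [-rate[0], 0, 0],           # bananas  >= 10 (basic needs)
        [0, -rate[1], 0]]           # coconuts >= 10
b_ub = [10, -10, -10]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 3)
print(res.x)  # hours per task: meet the needs, then pour time into the best trade
```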
But… linear programs don’t have some nice closed-form solution. The simplex algorithm can solve them efficiently in practice, but that’s very different from an easy formula like “produce the good with the highest comparative advantage”.
And that’s just solving the problem for one player, assuming the other players have fixed strategies. More generally, we have to anticipate the rest of the market as well. I don’t even know if that can be solved efficiently, via linear programming or some other technique.
Is “produce where you have comparative advantage” really very useful advice for more complex cases?
Wikipedia starts out describing comparative advantage as a law:
The law of comparative advantage describes how, under free trade, an agent will produce more of and consume less of a good for which they have a comparative advantage.
But no precise mathematical law is ever stated, and the law is only justified with examples (specifically, two-player, two-commodity examples). Furthermore, I only ever recall seeing comparative advantage explained with examples, rather than being stated as a theorem. (Although this may be because I never got past econ 101.)
This makes it hard to know what the claimed law even is, precisely. “Produce more and consume less”? In comparison to what?
One spot on Wikipedia says:
Skeptics of comparative advantage have underlined that its theoretical implications hardly hold when applied to individual commodities or pairs of commodities in a world of multiple commodities.
That passage is given without citation, though, so I don’t know where to find the details of these critiques.
I’m not sure I follow that it has to be linear—I suspect higher-order polynomials will work just as well. Even if linear, there are a very wide range of transformation matrices that can be reasonably chosen, all of which are compatible with not blocking Pareto improvements and still not agreeing on most tradeoffs.
Well, I haven’t actually given the argument that it has to be linear. I’ve just asserted that there is one, referencing Harsanyi and complete class arguments. There are a variety of related arguments. And these arguments have some assumptions which I haven’t been emphasizing in our discussion.
Here’s a pretty strong argument (with correspondingly strong assumptions).
1. Suppose each individual is VNM-rational.
2. Suppose the social choice function is VNM-rational.
3. Suppose that we can also use mixed actions, randomizing in a way which is independent of everything else.
4. Suppose that the social choice function has a strict preference for every Pareto improvement.
5. Also suppose that the social choice function is indifferent between two different actions if every single individual is indifferent.
6. Also suppose the situation gives a nontrivial choice with respect to every individual; that is, no one is indifferent between all the options.
By VNM, each individual’s preferences can be represented by a utility function, as can the preferences of the social choice function.
Imagine actions as points in preference-space, an n-dimensional space where n is the number of individuals.
By assumption #5, actions which map to the same point in preference-space must be treated the same by the social choice function. So we can now imagine the social choice function as a map from R^n to R.
VNM on individuals implies that the mixed action p * a1 + (1-p) * a2 is just the point p of the way on a line between a1 and a2.
VNM implies that the value the social choice function places on mixed actions is just a linear mixture of the values of pure actions. But this means the social choice function can be seen as an affine function from R^n to R. Of course since utility functions don’t mind additive constants, we can subtract the value at the origin to get a linear function.
But remember that points in this space are just vectors of individuals’ utilities for an action. So that means the social choice function can be represented as a linear function of individuals’ utilities.
So now we’ve got a linear function. But I haven’t used the Pareto assumption yet! That assumption, together with #6, implies that the linear function has to be increasing in every individual’s utility function.
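(A numeric sanity check of the conclusion, with toy utilities: a positively-weighted linear aggregation strictly prefers every Pareto improvement and values mixtures linearly, as VNM requires.)

```python
import numpy as np

rng = np.random.default_rng(1)
U = rng.normal(size=(4, 3))      # U[a, i]: individual i's utility for action a
w = np.array([0.5, 0.3, 0.2])    # positive weights: the theorem's promised form

social = U @ w                   # the social choice function on pure actions

# A Pareto improvement on action 0 (everyone gains) is strictly preferred:
improved = U[0] + rng.uniform(0.1, 1.0, 3)
assert improved @ w > social[0]

# Mixed actions are valued linearly, matching VNM for the social choice function:
p = 0.3
assert np.isclose((p * U[1] + (1 - p) * U[2]) @ w,
                  p * social[1] + (1 - p) * social[2])
```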
Now I’m lost again. “you should have a preference over something where you have no preference” is nonsense, isn’t it? Either the someone in question has a utility function which includes terms for (their beliefs about) other agents’ preferences (that is, they have a social choice function as part of their preferences), in which case the change will ALREADY BE positive for their utility, or that’s already factored in and that’s why it nets to neutral for the agent, and the argument is moot.
If you’re just saying “people don’t understand their own utility functions very well, and this is an intuition pump to help them see this aspect”, that’s fine, but “theorem” implies something deeper than that.
Indeed, that’s what I’m saying. I’m trying to separately explain the formal argument, which assumes the social choice function (or individual) is already on board with Pareto improvements, and the informal argument to try to get someone to accept some form of preference utilitarianism, in which you might point out that Pareto improvements benefit others at no cost (a contradictory and pointless argument if the person already has fully consistent preferences, but an argument which might realistically sway somebody from believing that they can be indifferent about a Pareto improvement to believing that they have a strict preference in favor of them).
But the informal argument relies on the formal argument.
Maybe it’s better phrased as “a CIRL agent has a positive incentive to allow shutdown iff it’s uncertain [or the human has a positive term for it being shut off]”, instead of “a machine has a positive incentive iff …”.
I would further charitably rewrite it as:
“In chapter 16, we analyze an incentive which a CIRL agent has to allow itself to be switched off. This incentive is positive if and only if it is uncertain about the human objective.”
A CIRL agent should be capable of believing that humans terminally value pressing buttons, in which case it might allow itself to be shut off despite being 100% sure about values. So it’s just the particular incentive examined that’s iff.
Sure, but the theorem he proves in the setting where he proves it probably is if and only if. (I have not read the new edition, so, not really sure.)
It also seems to me like Stuart Russell endorses the if-and-only-if result as what’s desirable? I’ve heard him say things like “you want the AI to prevent its own shutdown when it’s sufficiently sure that it’s for the best”.
Of course that’s not technically the full if-and-only-if (it needs to both be certain about utility and think preventing shutdown is for the best), but it suggests to me that he doesn’t think we should add more shutoff incentives such as AUP.
Keep in mind that I have fairly little interaction with him, and this is based off of only a few off-the-cuff comments during CHAI meetings.
My point here is just that it seems pretty plausible that he meant “if and only if”.
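(For what it’s worth, the incentive itself is easy to compute. In the standard off-switch-game model, a rational human vetoes exactly when the robot’s action has negative value to them, so deferring is worth E[max(u, 0)], versus max(E[u], 0) for acting or shutting down unilaterally. A Monte Carlo sketch, with a Gaussian belief I picked arbitrarily:)

```python
import numpy as np

def incentive_to_defer(mu, sigma, n=1_000_000, seed=0):
    # u ~ N(mu, sigma): the robot's belief about the human's utility for its action.
    u = np.random.default_rng(seed).normal(mu, sigma, n)
    return np.maximum(u, 0).mean() - max(u.mean(), 0)

print(incentive_to_defer(0.5, 1.0))  # positive: uncertain about the sign of u
print(incentive_to_defer(0.5, 0.0))  # zero: certain, no incentive to allow shutdown
```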
Notice, however, that for Logical Counterfactual Mugging to be well defined, you need to define what Omega is doing when it is making its prediction. In Counterfactuals for Perfect Predictors, I explained that when dealing with perfect predictors, the counterfactual is often undefined. For example, in Parfit’s Hitchhiker a perfect predictor would never give a lift to someone who never pays in town, so it isn’t immediately clear that predicting what such a person would do in town involves predicting something coherent.
Another approach is to change the example to remove the objection.
The poker-like game at the end of Decision Theory (I really titled that post simply “decision theory”? rather vague, past-me...) is isomorphic to counterfactual mugging, but removes some distractions, such as “how does Omega take the counterfactual”.
Alice receives a High or Low card. Alice can reveal the card to Bob. Bob then states a probability p that Alice’s card is High. Bob’s incentives just encourage him to report honest beliefs. Alice loses p².
When Alice gets a Low card, she can just reveal it to Bob and get the best possible outcome. But this strategy means Bob will know when she has a High card, giving her the worst possible outcome in that case. In order to successfully bluff, Alice has to sometimes act as if she has a different card than she does. And indeed, the optimal strategy for Alice in this case is to never show her cards.
This example will get fewer objections from people, because it is grounded in a very realistic game. Playing poker well requires this kind of reasoning. The powerful predictor is replaced with another player. We can still technically ask “how does the other player deal with undefined counterfactuals?”, but we can skip over that by just reasoning about strategies in the usual game-theoretic way—if Alice’s strategy were to reveal Low cards, then Bob could always call High when she doesn’t reveal.
We can then insert logical uncertainty by stipulating that Alice gets her card pseudorandomly, but neither Alice nor Bob can predict the random number generator.
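(Worked out under my assumptions that the card is dealt with even odds and that Bob best-responds with the conditional probability of High given what he sees:)

```python
def alice_expected_loss(reveal_low: bool) -> float:
    if reveal_low:
        # Low (shown): Bob says p = 0, loss 0. High (hidden): hiding gives
        # away High, so Bob says p = 1, loss 1.
        return 0.5 * 0.0 + 0.5 * 1.0
    # Never reveal: Bob can only say p = 1/2 either way, losing 1/4 each time.
    return 0.5 * 0.25 + 0.5 * 0.25

print(alice_expected_loss(True), alice_expected_loss(False))  # 0.5 vs 0.25
```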
Not sure yet whether you can pull a similar trick with Counterfactual Prisoner’s Dilemma.
== nitpicks ==
Applying the Counterfactual Prisoner’s Dilemma to Logical Uncertainty
Why isn’t the title “applying logical uncertainty to the counterfactual prisoner’s dilemma”? Or “A Logically Uncertain Version of Counterfactual Prisoner’s Dilemma”? I don’t see how you’re applying CPD to LU.
The Counterfactual Prisoner’s Dilemma is a symmetric version of the original
Symmetric? The original is already symmetric. But “symmetric” is a concept which applies to multi-player games. Counterfactual PD makes PD into a one-player game. Presumably you meant “a one-player version”?
where regardless of whether the coin comes up heads or tails you are asked to pay $100, and you are then paid $10,000 if Omega predicts that you would have paid if the coin had come up the other way. If you decide updatelessly you will always receive $9900, while if you decide updatefully, then you will receive $0.
This is only true if you use classical CDT, yeah? Whereas EDT can get $9900 in both cases, provided it believes in a sufficient correlation between what it does upon seeing heads vs tails.
So unlike Counterfactual Mugging, pre-committing to pay ensures a better outcome regardless of how the coin flip turns out, suggesting that focusing only on your particular probability branch is mistaken.
I don’t get what you meant by the last part of this sentence. Counterfactual Mugging already suggests that focusing only on your particular branch is mistaken. If someone bought that you should pay up in this problem but not in counterfactual mugging, I expect that person to say something like “because in this case that strategy is guaranteed better even in this branch”—hence, they’re not necessarily convinced to look at other branches. So I don’t think this example necessarily argues for looking at other branches.
Also, why is this posted as a question?
If it’s certain about the human objective, then it would be certain that it knows what’s best, so there would be no reason to let a human turn it off. (Unless humans have a basic preference to turn it off, in which case it could prefer to be shut off.)
I thought that the end result is that, since any change would not be a Pareto improvement, the function can’t recommend any change, so it must be completely ambivalent about everything, and thus is the constant function giving every option utility 0.
Pareto-optimality says that if there is a mass murderer who wants to kill as many people as possible, then you should not make a choice that lessens the number of people killed, i.e., you should not oppose the mass murderer.
Ah, I should have made more clear that it’s a one-way implication: if it’s a Pareto improvement, then the social choice function is supposed to prefer it. Not the other way around.
A social choice function meeting that minimal requirement can still do lots of other things. So it could still oppose a mass murderer, so long as mass-murder is not itself a Pareto improvement.
Me too! I’m having trouble seeing how that version of the pareto-preference assumption isn’t already assuming what you’re trying to show, that there is a universally-usable social aggregation function.
I’m not sure what you meant by “universally usable”, but I don’t really argue anything about existence, only what it has to look like if it exists. It’s easy enough to show existence, though; just take some arbitrary sum over utility functions.
Or maybe I misunderstand what you’re trying to show—are you claiming that there is a (or a family of) aggregation function that are privileged and should be used for Utilitarian/Altruistic purposes?
Yep, at least in some sense. (Not sure how “privileged” they are in your eyes!) What the Harsanyi Utilitarianism Theorem shows is that linear aggregations are just such a distinguished class.
And now we have to specify which agent’s preferences we’re talking about when we say “support”.
The assumption I missed was that there are people who claim that a change is = for them, but also they support it. I think that’s a confusing use of “preferences”.
That’s why, in the post, I moved to talking about “a social choice function”—to avert that confusion.
So we have people, who are what we define Pareto-improvement over, and then we have the social choice function, which is what we suppose must > every Pareto improvement.
Then we prove that the social choice function must act like it prefers some weighted sum of the people’s utility functions.
But this really is just to avert a confusion. If we get someone to assent to both VNM and strict preference of Pareto improvements, then we can go back and say “by the way, the social choice function was secretly you” because that person meets the conditions of the argument.
There’s no contradiction because we’re not secretly trying to sneak in a =/> shift; the person has to already prefer for Pareto improvements to happen.
If it’s > for the agent in question, they clearly support it. If it’s =, they don’t oppose it, but don’t necessarily support it.
Right, so, if we’re applying this argument to a person rather than just some social choice function, then it has to be > in all cases.
If you imagine that you’re trying to use this argument to convince someone to be utilitarian, this is the step where you’re like “if it doesn’t make any difference to you, but it’s better for them, then wouldn’t you prefer it to happen?”
Yes, it’s trivially true that if it’s = for them then it must not be >. But humans aren’t perfectly reflectively consistent. So, what this argument step is trying to do is engage with the person’s intuitions about their preferences. Do they prefer to make a move that’s (at worst) costless to them and which is beneficial to someone else? If yes, then they can be engaged with the rest of the argument.
To put it a different way: yes, we can’t just assume that an agent strictly prefers for all Pareto-improvements to happen. But, we also can’t just assume that they don’t, and dismiss the argument on those grounds. That agent should figure out for itself whether it has a strict preference in favor of Pareto improvements.
When I say “Pareto optimality is min-bar for agreement”, I’m making a distinction between literal consensus, where all agents actually agree to a change, and assumed improvement, where an agent makes a unilateral (or population-subset) decision, and justifies it based on their preferred aggregation function. Pareto optimality tells us something about agreement. It tells us nothing about applicability of any possible aggregation function.
Ah, ok. I mean, that makes perfect sense to me and I agree. In this language, the idea of the Pareto assumption is that an aggregation function should at least prefer things which everyone agrees about, whatever else it may do.
In my mind, we hit the same comparability problem for Pareto vs non-Pareto changes. Pareto-optimal improvements, which require zero interpersonal utility comparisons (only the sign matters, not the magnitude, of each affected entity’s preference), teach us nothing about actual tradeoffs, where a function must weigh the magnitudes of multiple entities’ preferences against each other.
The point of the Harsanyi theorem is sort of that those assumptions say surprisingly much, particularly when coupled with a VNM rationality assumption.
I made a post complaining about the current definition of this tag:
I might just “fix” it at some point, but, I’m curious to get feedback about this.
That’s not what Pareto-optimality asserts. It only talks about >= for all participants individually.
Given an initial situation, a Pareto improvement is a new situation where some agents will gain, and no agents will lose.
So a pareto improvement is a move that is > for at least one agent, and >= for the rest.
If you’re making assumptions about altruism, you should be clearer that it’s an arbitrary aggregation function that is being increased.
I stated that the setup is to consider a social choice function (a way of making decisions which would “respect everyone’s preferences” in the sense of regarding pareto improvements as strict preferences, ie, >-type preferences).
Perhaps I didn’t make clear that the social choice function should regard Pareto improvements as strict preferences. But this is the only way to ensure that you prefer the Pareto improvement and not the opposite change (which only makes things worse).
And then, Pareto-optimality is a red herring. I don’t know of any aggregation functions that would change a 0 to a + for a Pareto-optimal change, and would not give a + to some non-Pareto-optimal changes, which violate other agents’ preferences.
Exactly. That’s, like, basically the point of the Harsanyi theorem right there. If your social choice function respects Pareto optimality and rationality, then it’s forced to also make some trade-offs—IE, give a + to some non-Pareto changes.
(Unless you’re in a degenerate case, EG, everyone already has the same preferences.)
I feel as if you’re denying my argument by… making my argument.
My primary objection is that any given aggregation function is itself merely a preference held by the evaluator. There is no reason to believe that there is a justifiable-to-assume-in-others or automatically-agreeable aggregation function.
I don’t believe I ever said anything about justifying it to others.
I think one possible view is that every altruist could have their own personal aggregation function.
There’s still a question of which aggregation function to choose, what properties you might want it to have, etc.
But then, many people might find the same considerations persuasive. So I see nothing against people working together to figure out what “the right aggregation function” is, either.
This may be the crux. I do not assent to that. I don’t even think it’s common.
OK! So that’s just saying that you’re not interested in the whole setup. That’s not contrary to what I’m trying to say here—I’m just trying to say that if an agent satisfies the minimal altruism assumption of preferring Pareto improvements, then all the rest follows.
If you’re not at all interested in the utilitarian project, that’s fine, other people can be interested.
Pareto improvements are fine, and some of them actually improve my situation, so go for it! But in the wider sense, there are lots of non-Pareto changes that I’d pick over a Pareto subset of those changes.
Again, though, now it just seems like you’re stating my argument.
Weren’t you just criticizing the kind of aggregation I discussed for assenting to Pareto improvements but inevitably assenting to non-Pareto-improvements as well?
Pareto is a min-bar for agreement, not an optimum for any actual aggregation function.
My section on Pareto is literally titled “Pareto-Optimality: The Minimal Standard”
I’m feeling a bit of “are you trolling me” here.
You’ve both denied and asserted both the premises and the conclusion of the argument.
All in the same single comment.
I agree that this can create perverse incentives in practice, but that seems like the sort of thing that you should be handling as part of your decision theory, not your utility function.
I’m mainly worried about the perverse incentives part.
I recognize that there’s some weird level-crossing going on here, where I’m doing something like mixing up the decision theory and the utility function. But it seems to me like that’s just a reflection of the weird muddy place our values come from?
You can think of humans a little like self-modifying AIs, but where the modification took place over evolutionary history. The utility function we eventually arrived at was (sort of) the result of a bargaining process between everyone, one which took some account of things like exploitability concerns.
In terms of decision theory, I often think in terms of a generalized NicerBot: extend everyone else the same cofrence-coefficient they extend to you, plus an epsilon (to ensure that two generalized NicerBots end up fully cooperating with each other). This is a pretty decent strategy for any game, generalizing from one of the best strategies for Prisoner’s Dilemma. (Of course there is no “best strategy” in an objective sense.)
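(A toy fixed-point sketch of that escalation, under my own minimal formalization with the cofrence coefficient capped at 1:)

```python
# Each round, each player matches the cofrence the other extends to them,
# plus epsilon; two generalized NicerBots ratchet each other up to full weight.
eps = 0.05
c_alice, c_bob = 0.0, 0.0
for _ in range(100):
    c_alice, c_bob = min(1.0, c_bob + eps), min(1.0, c_alice + eps)

print(c_alice, c_bob)  # 1.0 1.0: mutual full cooperation in the limit
```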
But a decision theory like that does mix levels between the decision theory and the utility function!
I feel like the solution of having cofrences not count the other person’s cofrences just doesn’t respect people’s preferences—when I care about the preferences of somebody else, that includes caring about the preferences of the people they care about.
I totally agree with this point; I just don’t know how to balance it against the other point.
A crux for me is the coalition metaphor for utilitarianism. I think of utilitarianism as sort of a natural endpoint of forming beneficial coalitions, where you’ve built a coalition of all life.
If we imagine forming a coalition incrementally, and imagine that the coalition simply averages utility functions with its new members, then there’s an incentive to join the coalition as late as you can, so that your preferences get the largest possible representation. (I know this isn’t the same problem we’re talking about, but I see it as analogous, and so a point in favor of worrying about this sort of thing.)
We can correct that by doing 1/n averaging: every time the coalition gains members, we make a fresh average of all member utility functions (using some utility-function normalization, of course), and everybody voluntarily self-modifies to have the new mixed utility function.
But the problem with this is, we end up punishing agents for self-modifying to care about us before joining. (This is more closely analogous to the problem we’re discussing.) If they’ve already self-modified to care about us more before joining, then their original values just get washed out even more when we re-average everyone.
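(The arithmetic behind both problems, with five members joining one at a time:)

```python
import numpy as np

n = 5
# Pairwise averaging on each join: the newcomer always gets weight 1/2,
# halving everyone who joined earlier, so joining late is rewarded.
w = np.array([1.0])
for _ in range(n - 1):
    w = np.append(w / 2, 0.5)
print(w)  # [0.0625 0.0625 0.125 0.25 0.5]

# 1/n re-averaging gives equal weight instead, but anyone who already
# self-modified to care about the others gets their original values
# washed out a second time.
print(np.full(n, 1 / n))  # [0.2 0.2 0.2 0.2 0.2]
```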
So really, the implicit assumption I’m making is that there’s an agent “before” altruism, who “chose” to add in everyone’s utility functions. I’m trying to set up the rules to be fair to that agent, in an effort to reward agents for making “the altruistic leap”.
First off, I’m not trying to illustrate the many-player game here. So imagine there’s just Alice and Bob. I agree that the many-player version is relevant, but I was just dealing with the complexities that arise from iteration.
Second, yeah, absolutely: strategies in iterated games can be any function of the history. But that’s a really complicated strategy space to try and draw. Essentially I’m showing you just a very high-level summary, focusing on frequency of cooperation as a salient feature.
The idea is that frequency is something each player can observe about the other. Alice can implement a Grim Trigger strategy to enforce any given frequency of cooperation from Bob. It needs to have some wiggle room, to allow chance fluctuations in frequency without pulling the Grim Trigger; but Alice can include wiggle room while enforcing a tight enough guarantee that Bob is forced to cooperate with the desired frequency in the limit, while Alice runs only a small risk of spuriously Grim-Triggering.
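(One way to cash out the wiggle room, with a tolerance I chose to shrink like sqrt(log t / t), so an honest Bob almost never triggers it while the enforced frequency tightens in the limit:)

```python
import numpy as np

target, rng = 0.8, np.random.default_rng(0)
coop, triggered = 0, False
for t in range(1, 100_001):
    coop += rng.random() < target           # Bob cooperating at the agreed frequency
    slack = np.sqrt(2 * np.log(t + 1) / t)  # shrinking tolerance for chance fluctuations
    if coop / t < target - slack:           # empirical frequency too low: Grim Trigger
        triggered = True
        break

print(triggered)  # False (with overwhelming probability): honest play survives
```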
The problem in your example is that you failed to identify a reasonable disagreement point.
Ahh, yeahh, that’s a good point.