Incorporating Justice Theory into Decision Theory

When someone wrongs us, how should we respond? We want to discourage this behavior, so that others find it in their interest to treat us well. And yet the goal should never be to “do something unpleasant to them” for its deterrent effect. I’m persuaded by Yudkowsky’s take (source contains spoilers for Project Lawful, but it’s here):

If at any point you’re calculating how to pessimize a utility function, you’re doing it wrong. If at any point you’re thinking about how much somebody might get hurt by something, for a purpose other than avoiding doing that, you’re doing it wrong.

In other words, when someone is wronged, we want to search over ways to repair the harm done to them and prevent similar harm from happening in the future, rather than searching over ways to harm the perpetrator in return. If we require that a person who harms another pay some or all of the costs involved in repairing that harm, that also helps to align their incentives and discourages people from inefficiently harming each other in the first place.

Restitution and Damages

Our legal systems have all sorts of tools for handling these situations, and I want to point to two of them: restitution and damages. Restitution covers cases where one party is enriched at another’s expense. Damages cover situations where one party causes a loss or injury for another. Ideally, we’d like to make the wronged party at least as well-off as if they hadn’t been wronged in the first place.

Sometimes, a wronged party can be made whole. If SpaceX drops a rocket on my car, there’s an amount of money they could pay me where I feel like my costs have been covered. If SpaceX drops a rocket on an irreplaceable work of art or important landmark, there’s no amount of money that can make the affected parties whole. That’s not to say they shouldn’t pay compensation and do their best to repair the harm anyway. But some losses are irreversible, like the loss of something irreplaceable. And some losses are reversible, like the financial loss of a replaceable car and the costs associated with replacing it.

In a previous post, we looked at a couple of example games, where if Alice treats “Bob Defecting while Alice Cooperates” as creating a debt between them, she can employ a policy which incentivizes Bob to Cooperate and receive a fair share of the socially optimal outcome. And if Bob employs the same policy, this stabilizes that outcome as part of a Nash equilibrium. Importantly, the penalty Bob experiences for not treating Alice according to her notion of fairness is limited rather than unlimited.
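The previous post’s games aren’t reproduced here, so as a minimal sketch, here’s what a debt-tracking policy could look like in a standard iterated Prisoner’s Dilemma. The payoff matrix and the exact debt accounting are illustrative assumptions, not taken from that post:

```python
# Sketch of a bounded, debt-based enforcement policy for Alice.
# Payoffs are the standard illustrative Prisoner's Dilemma values.
PAYOFFS = {  # (alice_move, bob_move) -> (alice_payoff, bob_payoff)
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def alice_policy(debt):
    """Cooperate while Bob owes nothing; otherwise Defect until the
    debt is repaid. The penalty is bounded by the outstanding debt,
    not unlimited punishment."""
    return "C" if debt <= 0 else "D"

def play(bob_moves):
    debt = 0.0
    total_alice = total_bob = 0.0
    for bob_move in bob_moves:
        alice_move = alice_policy(debt)
        a, b = PAYOFFS[(alice_move, bob_move)]
        total_alice += a
        total_bob += b
        if alice_move == "C" and bob_move == "D":
            # Bob profited at Alice's expense: record her shortfall
            # relative to mutual cooperation (3 - 0 = 3) as a debt.
            debt += 3 - a
        elif alice_move == "D" and bob_move == "C":
            # Enforcement round: Alice recovers 5 - 3 = 2 of the debt.
            debt -= a - 3
    return total_alice, total_bob, debt
```

A single Defection by Bob makes him strictly worse off over the following rounds than steady Cooperation would have, which is the incentive the policy is meant to create.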

We also looked at how Alice might enforce debts owed to Carol when interacting with Bob, and this can lend social legitimacy and help ensure these debts actually get paid. One function of governments is to create common knowledge around what actions lead to how much debt, and how this debt will be enforced. I claim that the economic concept of externalities is a useful lens for determining how much debt, if any, is created by an action.

Voluntarism and Counterfactual Negotiation

Suppose that you are in a position to gain $100,000 at the expense of $20,000 to someone else. Should you? It might be justified on utilitarian grounds, if you gain more utility than they lose. But it’s clearly not a Pareto improvement over doing nothing.

One major theme of Gaming the Future is that we should generally prefer voluntary interactions over involuntary interactions. A voluntary interaction is one where all parties involved could meaningfully “opt-out” if they wanted to. And so if the interaction takes place at all, it’s because all parties prefer it to happen. In other words, voluntary interactions lead to Pareto improvements.

I think one relevant question for deciding whether to profit at someone else’s expense is “what would it take for them to agree to that?” For example, the other person might reasonably ask that they be compensated for their $20,000 loss, and that the remaining $80,000 be split equally between both parties.
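The arithmetic of that counterfactual negotiation looks like this. The equal split of the surplus is the assumption here; other fair division rules would change the shares:

```python
# "Reverse the loss, split the surplus": the gainer first compensates
# the imposed loss, then the remaining surplus is divided equally.
def fair_split(gross_gain, imposed_loss):
    """Return (gainer's net, losing party's net) under this norm."""
    surplus = gross_gain - imposed_loss
    share = surplus / 2
    gainer_net = gross_gain - imposed_loss - share  # keeps half the surplus
    loser_net = share                               # loss repaid, plus half
    return gainer_net, loser_net

print(fair_split(100_000, 20_000))  # -> (40000.0, 40000.0)
```

Both parties end up $40,000 ahead of the status quo, which is why the interaction becomes a Pareto improvement rather than a transfer.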

Ideally, all parties would be able to actually negotiate this before the decision was made. This internalizes the negative externality by bringing all affected parties into the decision-making process. But suppose that it’s impractical to negotiate ahead of time. I would still be in favor of a social norm that “some reversible losses may be imposed involuntarily on others, so long as those losses are indeed reversed and the remaining surplus is split fairly.” This also internalizes the negative externality, and leads to fair Pareto improvements.

Importantly, we probably don’t want to internalize all negative externalities. If Alice doesn’t like Bob’s haircut, this doesn’t mean Bob should have to pay damages to Alice. If Alice dislikes Bob’s haircut more than Bob likes it, there is an opportunity for Alice to pay Bob to change his hair and split the resulting economic surplus fairly. But the concept of boundaries helps to define who gets to make which decisions unilaterally, without incurring debt to others, in the absence of a negotiated agreement.

Parfit’s Hitchhiker and Positive Externalities

Suppose that you are in a position to create a huge financial windfall for someone else, but it requires that you pay a noticeable-but-much-smaller cost yourself. Should you? The utilitarians are still saying yes, but this is also not a Pareto improvement over doing nothing. And worse, now your personal interests seem to be misaligned with the socially optimal action.

Arguably, you should just help them because you personally are better off if people generally pay small costs to bring others huge benefits, and their decisions are correlated with your decision. But also arguably, the beneficiaries should pay those costs and then some, to internalize the positive externality and align their benefactor’s local incentives with their own. If there were enough sensible decision theory in the water supply, everyone would find it intuitively obvious that you should pay Paul Ekman to drive you out of the desert. There are times when a person actively prefers not to be compensated for their help. (“That was a lovely meal grandma, what do I owe you?”) But especially when there are significant costs for doing the socially optimal thing, we should generally arrange for those costs to be paid and then some.
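To see the incentive alignment numerically: suppose helping costs the benefactor $5,000 and creates a $100,000 windfall (illustrative numbers, not from the text). Without repayment, the benefactor’s local incentive points away from the socially optimal action; with a “reimburse the cost and split the surplus” norm, it points toward it:

```python
# Benefactor's net payoff from helping, with and without repayment.
def benefactor_net(windfall, cost, repaid):
    """If the beneficiary repays, they reimburse the cost and split
    the remaining surplus equally (an illustrative fairness norm)."""
    if not repaid:
        return -cost              # out of pocket: locally worse than not helping
    return (windfall - cost) / 2  # cost reimbursed, plus half the surplus

print(benefactor_net(100_000, 5_000, repaid=False))  # -5000
print(benefactor_net(100_000, 5_000, repaid=True))   # 47500.0
```

The sign flip is the whole point: repayment turns “pay a cost to help a stranger” into a straightforwardly profitable action for the benefactor.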

And again, just because Alice likes Bob’s haircut, that doesn’t necessarily mean she owes him compensation. It seems fair for boundaries to work both ways. Alice can offer to pay Bob to choose the way she likes, but it’s his choice in the property-rights sense of ownership.

Insurance as a Substitute for Good Decision-Making

In a saner world, it might be intuitively obvious that “you should repay those that create windfalls for you, conditional on their prediction that you would repay them.” That world might have already done the hard work of creating the common knowledge that nearly everyone would reason this way, repay their benefactors, and split the remaining surplus fairly.

Until we build dath ilan, there are incremental steps we can take to internalize externalities and align individual incentives with our collective interests. In the United States, drivers are required to carry insurance which will cover some or all of the damages caused by their driving. We can expand this to a requirement that everyone carry liability insurance, to internalize negative externalities more generally. Similarly, in the United States, everyone is required to carry health insurance. We could expand this requirement to other types of help one could receive, to internalize more positive externalities. I literally carry “being airlifted out of the desert” insurance because I work with medical and fire teams for festivals out in the desert, and my normal health insurance doesn’t cover those sorts of evacuations.

What makes a decision into an externality is that it affects someone who wasn’t involved in making the decision. A first approach to internalizing externalities might be to literally bring affected parties into the decision-making process, or imagine what it would have taken for all parties to agree. Both of these fail when dealing with holdouts, who insist on unfairly high gains from the interaction. But we can still treat others according to our own notions of fairness, or a standardized consensus notion of fairness if this is even more generous. And that seems like a pretty good way to calculate the amount of debt incurred if someone receives worse treatment than fairness requires.
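One way to operationalize those last two sentences, as a sketch with placeholder payoff numbers:

```python
# Debt calculation from a fairness standard. Payoffs are placeholders.
def required_fair_payoff(own_notion, consensus_notion):
    """Treat others by your own notion of fairness, or the standardized
    consensus notion if that one is even more generous to them."""
    return max(own_notion, consensus_notion)

def debt_incurred(fair_payoff, actual_payoff):
    """Debt is the shortfall below fair treatment; treating someone
    better than fairness requires creates no debt (and no credit)."""
    return max(0.0, fair_payoff - actual_payoff)

# A holdout demanding more than either fairness standard assigns them
# incurs no debt on our part when refused; genuine shortfalls do.
print(debt_incurred(required_fair_payoff(3, 4), 0))  # -> 4.0
print(debt_incurred(3, 5))                           # -> 0.0
```

This makes the holdout problem tractable: the debt is capped by the fairness standard, not by whatever the affected party demands.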