In other words, when someone is wronged, we want to search over ways to repair the harm done to them and prevent similar harm from happening in the future, rather than searching over ways to harm the perpetrator in return.
I think this is very often true, but not actually universal. There are LOTS of cases where repairing the harm is impossible, and, more importantly, cases where we want to disincentivize the behavior MORE strongly than the harm from this one instance alone would warrant.
A difficult case for your theory is when a child skips school or verbally abuses a classmate. Actual punishment is (often) the right response. Likewise thefts that are rarely caught: if you want theft to be rare, you need to punish enough to cover all the statistical harms that the caught criminal is, on average, responsible for. And since they rarely have enough to actually make it right by repaying, you punish them by other means.
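The "punish enough for all the statistical harms" arithmetic can be sketched as a toy expected-value model (the numbers and the linear-utility assumption are mine, not the commenter's):

```python
# Toy deterrence model: a would-be thief is only deterred when the
# expected penalty at least matches the expected gain, so a low catch
# rate forces the penalty on the one caught thief to cover many thefts.
def minimum_penalty(harm_per_theft: float, catch_probability: float) -> float:
    """Smallest penalty making expected cost >= total statistical harm."""
    if not 0 < catch_probability <= 1:
        raise ValueError("catch_probability must be in (0, 1]")
    return harm_per_theft / catch_probability

# If only 1 theft in 10 is caught, the caught thief must bear roughly
# 10 thefts' worth of harm for deterrence to break even.
print(round(minimum_penalty(harm_per_theft=100.0, catch_probability=0.1)))
```

This is of course the crudest possible model; it ignores risk aversion and the diminishing marginal impact of very large penalties, which is part of why the commenter's "punish them by other means" point bites.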
The missing piece from this sequence is “power”. You’re picking games that have a symmetry or setup which makes your definition of “fair” enforceable. For games without these mechanisms, the rational outcomes don’t end up that pleasant. Except sometimes, with players who have extra-rational motives.
Add to this the fact that the human population varies WIDELY on these dimensions, and many DO seek retribution as the primary signal (for their behaviors, and to impose their preferences on others).
For games without these mechanisms, the rational outcomes don’t end up that pleasant. Except sometimes, with players who have extra-rational motives.
I think we agree that if a selfish agent needs to be forced to not treat others poorly, in the absence of such enforcement they will treat others poorly.
It also seems like in many cases, selfish agents have an incentive to create exactly those mechanisms ensuring good outcomes for everyone, because it leads to good outcomes for them in particular. A nation-state composed entirely of very selfish people would look a lot different from any modern country, but they face the same instrumental reasons to pool resources to enforce laws. The more inclined their populace is towards mistreatment in the absence of enforcement, the higher those enforcement costs need to be in order to achieve the same level of good treatment.
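A minimal sketch of that last claim, assuming a crude linear model in which monitoring cost scales with both the required detection probability and the share of the populace inclined to defect (all names and numbers here are hypothetical):

```python
def required_detection_prob(temptation: float, penalty: float) -> float:
    """Detection probability at which defection stops paying in expectation."""
    return min(1.0, temptation / penalty)

def enforcement_cost(population: int, defection_share: float,
                     temptation: float, penalty: float,
                     cost_per_unit: float = 100.0) -> float:
    """Toy enforcement budget: more would-be defectors means more
    monitoring is needed to reach the same level of good treatment."""
    p = required_detection_prob(temptation, penalty)
    return population * defection_share * p * cost_per_unit

# Doubling the share of the populace inclined to defect doubles the bill.
print(enforcement_cost(1000, 0.50, temptation=10, penalty=50))
print(enforcement_cost(1000, 0.25, temptation=10, penalty=50))
```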
I also think “fairness” is a Schelling point that even selfish agents can agree to coordinate around, in a way that they could never be aligned on “maximizing Zaire’s utility in particular.” They don’t need to value fairness directly to agree that “an equal split of resources is the only compromise we’re all going to agree on during this negotiation.”
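The "equal split is the only compromise we're all going to agree on" idea can be illustrated with a toy veto model (entirely my formalization; it assumes each selfish agent rejects any split giving them less than an equal share, since holding out for the symmetric outcome costs them nothing):

```python
def all_accept(split):
    """Each selfish agent vetoes any split that gives them less than 1/n:
    they know every other agent faces the same choice, so the symmetric
    (equal) split is the only allocation that survives every veto."""
    n = len(split)
    sums_to_one = abs(sum(split) - 1.0) < 1e-9
    return sums_to_one and all(s >= 1.0 / n - 1e-9 for s in split)

print(all_accept([1/3, 1/3, 1/3]))  # the symmetric compromise
print(all_accept([0.5, 0.3, 0.2]))  # someone gets less than 1/n and vetoes
```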
So I think my optimism comes from at least two places:
Even utterly selfish agents still have an incentive to create mechanisms enforcing good outcomes for everyone.
People have at least some altruism, and are willing to pay costs to prevent mistreatment of others in many cases.
Skipping school is a victimless action, and I'd expect that searching over ways to cause the person to learn would almost never select punishment as the response, because punishment doesn't produce an incentive to learn. School incentive design is an active area of political discussion in the world right now; e.g., I'm a fan of the Human Restoration Project for their commentary on and sketches of such things.
On the main topic, I might agree with you; not sure yet. It seems to me the punishment case should be generated by searching for how to prevent recurrence in any circumstance where there isn't another way to prevent recurrence, right?
how to prevent recurrence in any circumstance where there isn’t another way to prevent recurrence, right?
Not quite. How to minimize similar choices in future equilibrium, maybe. In many cases, how to maximize conformance and compliance to a set of norms, rather than just this specific case. In real humans (not made-up rationalist cooperators), it includes how to motivate people to behave compatibly with your worldview, even though they think differently enough from you that you can’t fully model them. Or don’t have the bandwidth to understand them well enough to convince them. Or don’t have the resources to satisfy their needs such that they’d be willing to comply.
but I don’t see how that precludes searching for alternatives to retribution first
I don’t mean to argue against searching for (and in fact using) alternatives. I merely mean to point out that there seem to be a lot of cases in society where we haven’t found effective alternatives to punishment. It’s simply incorrect for the OP to claim that this vision from fiction is fully applicable to the real world.
ah, I see—if it turns out OP was arguing for that, then I misunderstood something. the thing I understood OP to be saying is about the algorithm for how to generate responses—that it should not be retribution-seeking, but rather solution-seeking, and it should likely have a penalty for selecting retribution, but it also likely does need to be able to select retribution to work in reality, as you say. OP’s words, my italics:
In other words, when someone is wronged, we want to search over ways to repair the harm done to them and prevent similar harm from happening in the future, rather than searching over ways to harm the perpetrator in return.
implication I read: prevent similar harm is allowed to include paths that harm the perpetrator, but it’s searching over ?worldlines? based on those ?worldlines? preventing recurrence, rather than just because they harm the perpetrator.
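That reading can be made concrete as a toy scoring rule (entirely my formalization of the thread, with made-up options and weights): retributive responses stay in the search space, but harm to the perpetrator is penalized rather than rewarded, so punishment only wins when nothing else repairs or prevents.

```python
def choose_response(options, retribution_penalty=2.0):
    """Solution-seeking search: maximize repair + recurrence prevention,
    minus a penalty on harm inflicted on the perpetrator."""
    def score(opt):
        return (opt["repair"] + opt["prevention"]
                - retribution_penalty * opt["harm_to_perpetrator"])
    return max(options, key=score)

options = [
    {"name": "restitution", "repair": 0.8, "prevention": 0.3, "harm_to_perpetrator": 0.0},
    {"name": "punishment",  "repair": 0.0, "prevention": 0.9, "harm_to_perpetrator": 0.4},
    {"name": "do nothing",  "repair": 0.0, "prevention": 0.0, "harm_to_perpetrator": 0.0},
]
print(choose_response(options)["name"])      # prints "restitution"
print(choose_response(options[1:])["name"])  # repair unavailable: prints "punishment"
```

Note the key property: the same algorithm that prefers restitution when it exists still selects punishment when it is the only thing that prevents recurrence, which matches the "needs to be able to select retribution to work in reality" point above.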
If SpaceX drops a rocket on an irreplaceable work of art or important landmark, there’s no amount of money that can make the affected parties whole. That’s not to say they shouldn’t pay compensation and do their best to repair the harm anyway.