I’m not saying this is a problem with utility functions in general, and yes, thank you, I know what a utility function is. Rather, my claim is that the problem is with average utilitarianism and variants thereof, which is to say, that subset of utility functions which attempt to incorporate every other instantiated utility function as a non-negligible factor within themselves. The computational compromises necessary to apply such a system inevitably introduce more and more noise, and if someone decided to implement the resulting garbage-data-based policy proposals anyway, it would spiral off into pathology whenever a monster wandered in.
Tit-for-tat works. Division of labor according to comparative advantage works. Omnibenevolence looks good on paper.
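To be concrete about the first claim: tit-for-tat’s robustness is easy to check directly in an iterated prisoner’s dilemma. Here is a minimal sketch, with payoff numbers and strategy names that are my own illustrative assumptions rather than anything from the thread:

```python
# Standard prisoner's dilemma payoffs: (my move, their move) -> my payoff.
PAYOFFS = {
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

def tit_for_tat(history):
    """Cooperate first, then copy whatever the opponent did last round."""
    return "C" if not history else history[-1]

def always_defect(history):
    return "D"

def play(a, b, rounds=10):
    hist_a, hist_b = [], []  # what each player has seen the OTHER do
    score_a = score_b = 0
    for _ in range(rounds):
        move_a, move_b = a(hist_a), b(hist_b)
        score_a += PAYOFFS[(move_a, move_b)]
        score_b += PAYOFFS[(move_b, move_a)]
        hist_a.append(move_b)
        hist_b.append(move_a)
    return score_a, score_b

print(play(tit_for_tat, tit_for_tat))    # mutual cooperation: (30, 30)
print(play(tit_for_tat, always_defect))  # exploited only once: (9, 14)
```

Against a cooperator it locks in the cooperative payoff; against a pure defector it loses exactly one round and then stops being exploitable, which is the “works” being gestured at.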
Yes, I think this scenario does illustrate the point that simulations cannot be granted “moral weight” by default and still expect to win, on pain of being Dutch-booked.
It’s not about the fact that they’re simulations. This is just a hostage situation, with the complications that A) the encamped terrorist has a factory for producing additional hostages and B) the negotiator doesn’t have a SWAT team to send in. Under those circumstances, playing as the negotiator, you can meet the demands (or make a good-faith effort, and then provide evidence of insurmountable obstacles to full compliance), or you can devalue the hostages.
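The hostage-factory complication is what makes “meet the demands” pathological as a standing policy: if the negotiator pays a fixed ransom per hostage by default, and producing a hostage costs the adversary less than that ransom, the adversary profits on every cycle and the negotiator’s losses grow without bound. A toy sketch, with all numbers assumed for illustration:

```python
def pump(ransom, production_cost, rounds):
    """Adversary manufactures a hostage each round; negotiator always pays."""
    negotiator_paid = 0
    adversary_profit = 0
    for _ in range(rounds):
        adversary_profit -= production_cost  # spin up one more hostage
        negotiator_paid += ransom            # default policy: pay the ransom
        adversary_profit += ransom
    return negotiator_paid, adversary_profit

paid, profit = pump(ransom=10, production_cost=1, rounds=100)
print(paid, profit)  # 1000 900: both scale linearly with rounds
```

Whenever `production_cost < ransom` the pump runs forever, which is why the only stable alternatives are the two named above: drive the effective ransom down (good-faith partial compliance) or devalue the hostages to zero.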
I don’t think EY’s answer, precommitting to only accept positive trades, works here, since it makes the outcome of this scenario depend on who gets to precommit “first”, a notion that should not even make sense if the answer is to satisfy my intuition.
Pre-existing commitments are the terrain upon which a social conflict takes place. In the moment of conflict, it doesn’t matter so much when or how the land got there. Committing not to negotiate with terrorists is building a wall: it stops you being attacked from a particular direction, but also stops you riding out to rescue the hostages by the expedient path of paying for them. If the enemy commits to attacking along that angle anyway, well… then we get to find out whether you built a wall from interlocking blocks of solid adamant, or cheap plywood covered in adamant-colored paint. Or maybe just included the concealed sally-port of an ambiguous implicit exception. A truly solid wall will stop the attack from reaching its objective, regardless of how utterly committed the attacker may be (continuing the terrain metaphor, perhaps sending a fire or flood rather than infantry), but there are construction costs and opportunity costs.
Generally speaking, defense has primacy in social conflict. There’s almost always some way to shut down the communication channel, or just be more stubborn. People open up and negotiate anyway, even when stubbornness could have gotten them everything they wanted without being inconvenienced by the other side’s preferences, because the worst-case costs of losing a social conflict are generally less than the best-case costs of winning a physical conflict. That strategy breaks down in the face of an extremely clever but physically helpless foe, like an ambiguously-motivated AI in a box or Hannibal Lecter in a prison cell, which may be the source of the fascination in both cases.