I think you need a decision theory for this (plus a theory of counterfactuals, which is basically going to have to be a theory of logical counterfactuals if you want to prevent extortion from Omega, and uhh, good luck figuring that out). We compare to counterfactuals where the other agents aren’t destroying value for the sake of extortion, because agents with a good decision theory will refuse to give in in those cases. Now let’s imagine that Greg, for genuinely unrelated reasons, would cause the project to fail (say, he usually mows his lawn in the morning, and the project requires quiet at that time). If Greg chooses not to mow his lawn to help the project, I’d call that “participating in the coalition”, and he should get some value from doing so. The point, after all, is to incentivize people to contribute to the project while also being resistant to extortion.
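To make the Greg case concrete, here is a minimal sketch of an ordinary Shapley computation. Everything in it is an assumption for illustration: the players Alice and Bob, the payoff numbers, and the names `project_value` and `shapley_values` are all made up. The baseline for coalitions without Greg is that he mows as he normally would (no value destroyed for the sake of extortion), so the project only partly succeeds.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical players: only Greg comes from the comment above.
PLAYERS = ("Alice", "Bob", "Greg")


def project_value(coalition):
    """Hypothetical characteristic function; all numbers are made up.

    The project needs both Alice and Bob. Without Greg in the coalition
    he mows as usual (the non-extortion baseline) and the project is
    worth 30; with Greg staying quiet it is worth 90.
    """
    members = frozenset(coalition)
    if not {"Alice", "Bob"} <= members:
        return 0
    return 90 if "Greg" in members else 30


def shapley_values(players, value):
    """Average each player's marginal contribution over all join orders."""
    totals = {p: Fraction(0) for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        joined = []
        for p in order:
            before = value(joined)
            joined.append(p)
            totals[p] += value(joined) - before
    return {p: t / len(orderings) for p, t in totals.items()}


shares = shapley_values(PLAYERS, project_value)
for name, share in shares.items():
    print(name, share)
```

Under these made-up numbers Greg ends up with a positive share (20 of the 90) for not mowing, and Alice and Bob get 35 each. Note what the sketch does not do: it takes the baseline as given. Deciding which baseline is legitimate, i.e. whether Greg's mowing is really "for unrelated reasons", is exactly the part that needs the decision theory.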
Yeah, I don’t know how it should work properly when people factor in information about other people’s decision procedures. I guess Shapley values might be a Newton’s laws versus special relativity kind of deal, where they mostly work most of the time. Or it might be more like an applied design thing, where everything switches to completely different underlying logic if that gets you even a modest improvement. Idk.