I tried to formalize this, using A→B as a “poor man’s counterfactual”, standing in for “if Alice cooperates then so does Bob”. This has the odd behaviour of becoming “true” when Alice defects! You can see this as the counterfactual collapsing and becoming inconsistent, because its premise is violated. But this does mean we need to be careful about using these.
For technical reasons we upgrade to □A→B, which says “if Alice cooperates in a legible way, then Bob cooperates back”. Alice tries to prove this, and legibly cooperates if so.
This setup gives us “Alice legibly cooperates if she can prove that, if she legibly cooperates, Bob would cooperate back”. In symbols, □(□A→B)→A.
Now, is this okay? What about proving □A→⊥?
Well, actually you can’t ever prove that! Because of Lob’s theorem.
Outside the system we can definitely see cases where A is unprovable, e.g. because Bob always defects. But you can’t prove this inside the system. You can only prove things like “□kA→⊥” for finite proof lengths k.
I think this is best seen as a consequence of “with finite proof strength you can only deny proofs up to a limited size”.
So this construction works out, perhaps just because two different weirdnesses are canceling each other out. But in any case I think the underlying idea, “cooperate if choosing to do so leads to a good outcome”, is pretty trustworthy. It perhaps deserves to be cached out in better provability math.
I tried to formalize this, using A→B as a “poor man’s counterfactual”, standing in for “if Alice cooperates then so does Bob”. This has the odd behaviour of becoming “true” when Alice defects! You can see this as the counterfactual collapsing and becoming inconsistent, because its premise is violated. But this does mean we need to be careful about using these.
For technical reasons we upgrade to □A→B, which says “if Alice cooperates in a legible way, then Bob cooperates back”. Alice tries to prove this, and legibly cooperates if so.
This setup gives us “Alice legibly cooperates if she can prove that, if she legibly cooperates, Bob would cooperate back”. In symbols, □(□A→B)→A.
Now, is this okay? What about proving □A→⊥?
Well, actually you can’t ever prove that! Because of Lob’s theorem.
Outside the system we can definitely see cases where A is unprovable, e.g. because Bob always defects. But you can’t prove this inside the system. You can only prove things like “□kA→⊥” for finite proof lengths k.
I think this is best seen as a consequence of “with finite proof strength you can only deny proofs up to a limited size”.
So this construction works out, perhaps just because two different weirdnesses are canceling each other out. But in any case I think the underlying idea, “cooperate if choosing to do so leads to a good outcome”, is pretty trustworthy. It perhaps deserves to be cached out in better provability math.