In case this helps folks’ intuition, my generating idea was something like: “look at what my opponent is thinking, but assume that whenever they check if I cooperate, they think the answer is yes”. This is how we break the loop.
This results in a proof of the lemma like thus:
x↔□(□x→x) (given)
x↔□(□x→□(□x→x)) (unrolling one step)
□x→□(□x→x) is straightforward intuitively, and we can get there by applying box-distributivity to □(x→(□x→x))
(EDIT: This is basically the same proof as in the post, but less simple. Maybe the interesting part is the “unroll once then break the loop” intuition.)
(In fact, I think “assume they will think I cooperate” turns out to be too strong, and leads to unnecessary defection. I’m still working through the details.)
If I follow what you mean, we can derive:
□(□A)→□B→□(□B)→□A
So there’s a Löbian proof, in which the provability is self-fulfilling. But this isn’t sufficient to avoid this kind of proof.
(Aside on why I don’t like the Löbian method: I moreso want the agents to be doing “correct” counterfactual reasoning about how their actions affect their opponent, and to cooperate because they see that mutual cooperation is possible and then choose it. The Löbian proof style isn’t a good model of that, imo.)
In case this helps folks’ intuition, my generating idea was something like: “look at what my opponent is thinking, but assume that whenever they check if I cooperate, they think the answer is yes”. This is how we break the loop.
This results in a proof of the lemma like thus:
x↔□(□x→x) (given)
x↔□(□x→□(□x→x)) (unrolling one step)
□x→□(□x→x) is straightforward intuitively, and we can get there by applying box-distributivity to □(x→(□x→x))
(EDIT: This is basically the same proof as in the post, but less simple. Maybe the interesting part is the “unroll once then break the loop” intuition.)
Relatedly, with two agents A and B, we can have:
A↔□(□A→B) (given)
B↔□(□B→A) (given)
A↔□(□A→□(□B→A)) (substituting B into A)
□A→□(□B→A) is straightforward as before
(In fact, I think “assume they will think I cooperate” turns out to be too strong, and leads to unnecessary defection. I’m still working through the details.)
So, ideally you would like to assume only
□A→B
□B→A
and conclude A and B ?
If I follow what you mean, we can derive: □(□A)→□B→□(□B)→□A
So there’s a Löbian proof, in which the provability is self-fulfilling. But this isn’t sufficient to avoid this kind of proof.
(Aside on why I don’t like the Löbian method: I moreso want the agents to be doing “correct” counterfactual reasoning about how their actions affect their opponent, and to cooperate because they see that mutual cooperation is possible and then choose it. The Löbian proof style isn’t a good model of that, imo.)