This post presents five closely-related ways to achieve proof-based cooperation without using Löb’s theorem, and muses on legible cooperation in the real world.

(Edit: maybe they’re closer to just-use-Löb’s-theorem than I originally thought! See this comment. If these constructions somehow work better, I’m more confused than before about why.

New edit: I do think there’s something up about these constructions being better-founded than a Löb-based approach, specifically thanks to theseresults, though I don’t have a more full account yet for what’s up with that.)

I’m writing this as a follow-up to Andrew Critch’s recent post, to share more of my perspective on the subject.

We’re going to dive straight into the weeds. (I’m planning to also write a more accessible explainer post soon.)

The ideas

Idea #1: try to prove A→B

I claim the following are sufficient for robust cooperation:

A↔□(A→B)

B←□A

A tries to prove that A→B, and B tries to prove A. The reason this works is that B can prove that A→□A, i.e.A only cooperates in ways legible to B. (Proof sketch: A↔□X→□□X↔□A.)

The flaw in this approach is that we needed to know that A won’t cooperate for illegible reasons. Otherwise we can’t verify that B will cooperate whenever A does.

This indicates to me that “A→B” isn’t the right “counterfactual”. It shouldn’t matter if Acould cooperate for illegible reasons, if A is actually cooperating for a legible one.

Idea #2: try to prove □A→B

We can weaken the requirements with a simple change:

A←□(□A→B)

B←□A

Note that this form is close to the lemma discussed in Critch’s post.

In this case, the condition □A→B is trivial. And when the condition activates, it also ensures that □A is true, which discharges our assumption and ensures B is true.

I still have the sense that the condition for cooperation should talk about itself activating, not A. Because we want it to activate when that is sufficient for cooperaion. But I do have to admit that □A→B works for mostly the right reasons, comes with a simple proof, and is the cleanest two-agent construction I know.

Idea #3: factor out the loop-cutting gadget

We can factor the part that is trying to cut the loop out from A, like so:

A←□X

B←□A

X↔□(X→B); or alternatively

X↔□(□X→B)

This gives the loop-cutting logic a name, X. Now X can refer to itself, and roughly says “I’ll legibly activate if I can verify this will cause B to be true”.

Like with idea #2, we just need A to reveal a mechanism by which it can be compelled to cooperate. The mechanism X is designed to be self-contained and easy for B to verify.

Idea #4: everyone tries to prove □me→them

What about three people trying to cooperate? We can try applying lots of idea #2:

A←□(□A→B∧C)

B←□(□B→A∧C)

C←□(□C→A∧B)

And, this works! Proof sketch:

Under the assumption of □C:

A←□(□A→B∧C)←□(□A→B)

B←□(□B→A∧C)←□(□B→A)

A and B form a size-2 group, which cooperates by inductive hypothesis

□C→A∧B, since we proved A and B under the assumption

C and □C follow from (2)

A and B also follow, from (2) and (3)

The proof simplifies the group one person at a time, since each person is asking “what would happen if everyone else could tell I cooperate”. This lets us prove the whole thing by induction. It’s neat that it works, though it’s not the easiest thing to see.

Idea #5: the group agrees to a shared mechanism or leader

What if we factor out the choosing logic in a larger group? Here’s one way to do it:

A←□X

B←□X

C←□X

X↔□(□X→A∧B∧C)

This is the cleanest idea I know for handling the group case. The group members agree on some trusted leader or process X. They set things up so X activates legibly, verifies things in a way trusted by everyone, and only activates when it verifies this will cause cooperation.

We’ve now localized the choice-making in one place. Everyone can watch our process X verify that □X→A∧B∧C, which causes X to activate, and everyone cooperates.

Closing remarks on groups in the real world

Centralizing the choosing like in idea #5 make the logic simpler, but this sort of approach is prone to manipulation and other problems when the verification is not reliably done. This means I don’t unambiguously prefer idea #5 to idea #4, in which everyone is doing their own legwork.

One solid takeaway from idea #5 though is that if you can actually trust other people/processes to do some of the verification, this can really simplify things. For example, if you believe that your friend Bob’s verification is mostly trustworthy, then what Bob thinks can inform your read of the group.

Here’s a sketch for something that might work in practice:

Have a leader or a few leaders who are by-and-large legible and trustworthy to the group, in whom we entrust to do the loop-cutting / choice-making when the group is onboard.

Additionally, have everyone try to build trust with other group members. If this works well, your colleagues can help you know if the group is trustworthy or not. (And if this works poorly, then your group may not be meeting the legibility bar!)

If anyone discovers there are important places in which legibility is scarce, e.g. if the leaders aren’t legibly trustworthy to everyone, throw an exception and raise this to collective awareness.

Also, if anyone discovers that other members can’t do the proper verification handshake, because perhaps some important things are illegible to them, raise the alarm.

If the group is failing to know that it’s trustworthy, notice this. It’s perhaps necessary to dissolve the group and reorganize into new structures that can trust themselves.

## Some constructions for proof-based cooperation without Löb

This post presents five closely-related ways to achieve proof-based cooperation without using Löb’s theorem, and muses on legible cooperation in the real world.

(

Edit: maybe they’re closer to just-use-Löb’s-theorem than I originally thought! See this comment. If these constructions somehow work better, I’m more confused than before about why.New edit: I do think there’s something up about these constructions being better-founded than a Löb-based approach, specifically thanks to these results, though I don’t have a more full account yet for what’s up with that.)I’m writing this as a follow-up to Andrew Critch’s recent post, to share more of my perspective on the subject.

We’re going to dive straight into the weeds. (I’m planning to also write a more accessible explainer post soon.)

## The ideas

## Idea #1: try to prove A→B

I claim the following are sufficient for robust cooperation:

A↔□(A→B)

B←□A

A tries to prove that A→B, and B tries to prove A. The reason this works is that B can prove that A→□A, i.e.A only cooperates in ways legible to B. (Proof sketch: A↔□X→□□X↔□A.)

The flaw in this approach is that we

neededto know that A won’t cooperate for illegible reasons. Otherwise we can’t verify that B will cooperate whenever A does.This indicates to me that “A→B” isn’t the right “counterfactual”. It shouldn’t matter if A

couldcooperate for illegible reasons, if A isactually cooperatingfor a legible one.## Idea #2: try to prove □A→B

We can weaken the requirements with a simple change:

A←□(□A→B)

B←□A

Note that this form is close to the lemma discussed in Critch’s post.

In this case, the condition □A→B is trivial. And when the condition activates, it also ensures that □A is true, which discharges our assumption and ensures B is true.

I still have the sense that the condition for cooperation should talk about

itself activating, not A. Because we want it to activate whenthatis sufficient for cooperaion. But I do have to admit that □A→B works for mostly the right reasons, comes with a simple proof, and is the cleanest two-agent construction I know.## Idea #3: factor out the loop-cutting gadget

We can factor the part that is trying to cut the loop out from A, like so:

A←□X

B←□A

X↔□(X→B); or alternatively

X↔□(□X→B)

This gives the loop-cutting logic a name, X. Now X can refer to itself, and roughly says “I’ll legibly activate if I can verify this will cause B to be true”.

Like with idea #2, we just need A to reveal a mechanism by which it can be compelled to cooperate. The mechanism X is designed to be self-contained and easy for B to verify.

## Idea #4: everyone tries to prove □me→them

What about three people trying to cooperate? We can try applying lots of idea #2:

A←□(□A→B∧C)

B←□(□B→A∧C)

C←□(□C→A∧B)

And, this works! Proof sketch:

Under the assumption of □C:

A←□(□A→B∧C)←□(□A→B)

B←□(□B→A∧C)←□(□B→A)

A and B form a size-2 group, which cooperates by inductive hypothesis

□C→A∧B, since we proved A and B under the assumption

C and □C follow from (2)

A and B also follow, from (2) and (3)

The proof simplifies the group one person at a time, since each person is asking “what would happen if everyone else could tell I cooperate”. This lets us prove the whole thing by induction. It’s neat that it works, though it’s not the easiest thing to see.

## Idea #5: the group agrees to a shared mechanism or leader

What if we factor out the choosing logic in a larger group? Here’s one way to do it:

A←□X

B←□X

C←□X

X↔□(□X→A∧B∧C)

This is the cleanest idea I know for handling the group case. The group members agree on some trusted leader or process X. They set things up so X activates legibly, verifies things in a way trusted by everyone, and only activates when it verifies this will cause cooperation.

We’ve now localized the choice-making in one place. Everyone can watch our process X verify that □X→A∧B∧C, which causes X to activate, and everyone cooperates.

## Closing remarks on groups in the real world

Centralizing the choosing like in idea #5 make the logic simpler, but this sort of approach is prone to manipulation and other problems when the verification is not reliably done. This means I don’t unambiguously prefer idea #5 to idea #4, in which everyone is doing their own legwork.

One solid takeaway from idea #5 though is that if you can actually trust other people/processes to do some of the verification, this can really simplify things. For example, if you believe that your friend Bob’s verification is mostly trustworthy, then what Bob thinks can inform your read of the group.

Here’s a sketch for something that might work in practice:

Have a leader or a few leaders who are by-and-large legible and trustworthy to the group, in whom we entrust to do the loop-cutting / choice-making when the group is onboard.

Additionally, have everyone try to build trust with other group members. If this works well, your colleagues can help you know if the group is trustworthy or not. (And if this works poorly, then your group may not be meeting the legibility bar!)

If anyone discovers there are important places in which legibility is scarce, e.g. if the leaders aren’t legibly trustworthy to everyone, throw an exception and raise this to collective awareness.

Also, if anyone discovers that other members can’t do the proper verification handshake, because perhaps some important things are illegible to them, raise the alarm.

If the group is failing to know that it’s trustworthy, notice this. It’s perhaps necessary to dissolve the group and reorganize into new structures that

cantrust themselves.And generally don’t forget about “yes requires the possibility of no”.