If the agent’s reasoning is sensible only under certain settings of the default action clause

That was my first rewriting; the second is an example of a more general algorithm that would go something like this. If we assume that both probabilities and utilities are discrete, all of the form q/n for some q, and bounded above and below by N, then something like this (for EU the expected utility, and Actions the set of actions, and b some default action):

```
for q integer in N*n^2 to -N*n^2 (ordered from highest to lowest):
for a in Actions:
if A()=a ⊢ EU=q/n^2 then output a
else output b
```

Then the Löbian proof fails. The agent will fail to prove any of those “if” implications, until it proves “A()=”not cross” ⊢ EU=0″. Then it outputs “not cross”; the default action b is not relevant. Also not relevant, here, is the order in which a is sampled from “Actions”.

You are entirely correct; I don’t know why I was confused.

However, looking at the proof again, it seems there might be a potential hole. You use Löb’s theorem within an assumption sub-loop. This seems to assume that from “A→(□B→B)”, we can deduce “A→B”.

But this cannot be true in general! To see this, set A=□B→B. Then A→(□B→B), trivially; if, from that, we could deduce A→B, we would have (□B→B)→B for any B. But this statement, though it looks like Löb’s theorem, is one that we cannot deduce in general (see Eliezer’s “medium-hard problem” here).

Can this hole be patched?

(note that if A→(□AB→B), where □A is a PA proof that adds A as an extra axiom, then we can deduce A→B).