The following is a modified u′, after (I think) fixing multiple problems that Stuart Armstrong pointed out in the original solution (see here, here and here):
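$$
u'(h) =
\begin{cases}
u^*(h) & \text{if } f^{-1}(y) = 0 \\[4pt]
\dfrac{\alpha}{1 + [\text{number of time-steps in } h \text{ until a terminate action, or } \infty]} & \text{otherwise}
\end{cases}
$$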
where:
$$
u^*(h) =
\begin{cases}
0 & \text{if } h \text{ contains a terminate action} \\
u(h) & \text{otherwise}
\end{cases}
$$
and α is some constant with 0 < α ≪ 1.
Note: if x≠0, pressing the off-switch overrides the agent's next action, replacing it with terminate.
Additionally, if the agent has “excess computation capability” that it has no other use for, and therefore calculates f−1(y) no matter how small α is, it will (with probability of almost 1) terminate itself immediately after figuring out that x≠0. To resolve this, we can modify u to contain a “sink” for the “excess computation capability”, as I described in this comment.
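To make the definition of u′ concrete, here is a minimal Python sketch. Everything in it beyond the formula is an assumption for illustration: a history h is modelled as a list of action labels, `TERMINATE` is a hypothetical name for the terminate action, `x_is_zero` stands in for the condition f−1(y)=0 (the code does not compute f−1), and the time-step count is taken to be the number of actions before the first terminate action.

```python
import math
from typing import Callable, Sequence

TERMINATE = "terminate"  # hypothetical label for the terminate action


def steps_until_terminate(history: Sequence[str]) -> float:
    """Number of actions before the first terminate action, or infinity if none."""
    for t, action in enumerate(history):
        if action == TERMINATE:
            return t
    return math.inf


def u_star(history: Sequence[str], u: Callable[[Sequence[str]], float]) -> float:
    """u*(h): 0 if h contains a terminate action, otherwise the original u(h)."""
    return 0.0 if TERMINATE in history else u(history)


def u_prime(
    history: Sequence[str],
    u: Callable[[Sequence[str]], float],
    x_is_zero: bool,
    alpha: float = 1e-6,
) -> float:
    """u'(h): u*(h) when f^{-1}(y) = 0, otherwise alpha / (1 + steps until terminate)."""
    if x_is_zero:
        return u_star(history, u)
    # With no terminate action the denominator is infinite, so the value is 0.
    return alpha / (1 + steps_until_terminate(history))
```

Under these assumptions, in the x≠0 branch the value α/(1+t) is strictly decreasing in the termination time t and drops to 0 if the agent never terminates, which is why an agent that has figured out x≠0 prefers to terminate as early as possible.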