Ofer comments on A probabilistic off-switch that the agent is indifferent to

Ofer 26 Sep 2018 11:27 UTC
The following is a modified $u'$, after (I think) fixing multiple problems that Stuart Armstrong pointed out in the original solution (see here, here and here):
$u'(h) = \begin{cases} u^{*}(h) & f^{-1}(y) = 0 \\ \dfrac{\alpha}{1 + [\text{number of time-steps in } h \text{ until a terminate action, or } \infty]} & \text{otherwise} \end{cases}$
where:
$u^{*}(h) = \begin{cases} 0 & h \text{ contains a terminate action} \\ u(h) & \text{otherwise} \end{cases}$
for some constant $0 < \alpha \ll 1$.
Note: if $x \neq 0$, pressing the off-switch overrides the next action of the agent to $\text{terminate}$.
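The two definitions above can be sketched in Python. This is only an illustrative sketch under my own assumptions: a history `h` is modeled as a list of action labels, `TERMINATE` is a hypothetical label for the terminate action, the argument `x` stands for $f^{-1}(y)$ and is simply handed to the function (the sketch does not model the cost of computing it), and "number of time-steps until a terminate action" is read as the index of the first terminate action.

```python
import math
from typing import Callable, Sequence

TERMINATE = "terminate"  # hypothetical label for the terminate action


def steps_until_terminate(h: Sequence[str]) -> float:
    """Index of the first terminate action in h, or infinity if there is none."""
    for t, action in enumerate(h):
        if action == TERMINATE:
            return t
    return math.inf


def u_star(h: Sequence[str], u: Callable[[Sequence[str]], float]) -> float:
    """u*(h): 0 if h contains a terminate action, otherwise the base utility u(h)."""
    return 0.0 if TERMINATE in h else u(h)


def u_prime(
    h: Sequence[str],
    x: int,
    u: Callable[[Sequence[str]], float],
    alpha: float = 1e-6,
) -> float:
    """u'(h): u*(h) when x == 0, otherwise alpha / (1 + steps until terminate)."""
    if x == 0:
        return u_star(h, u)
    # With no terminate action the denominator is infinite and the value is 0.0.
    return alpha / (1 + steps_until_terminate(h))
```

With a toy base utility such as `u = lambda h: float(len(h))`, a history without a terminate action is scored by `u` when `x == 0`, and by a value of at most `alpha` when `x != 0`.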
Additionally, if the agent has “excess computation capability” that it has nothing else to use for, and thus calculates $f^{-1}(y)$ no matter how small $\alpha$ is, it will (with probability almost 1) terminate itself immediately after figuring out that $x \neq 0$. To resolve this, we can modify $u$ to contain a “sink” for the “excess computation capability”, as I described in this comment.
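To spell out why termination is immediate once the agent knows $x \neq 0$ (writing $t$ for the number of time-steps in $h$ until a terminate action, a symbol I am introducing only for this note): in that branch the utility depends on $h$ only through $t$,

$u'(h) = \dfrac{\alpha}{1 + t}, \qquad \dfrac{\alpha}{1 + 0} = \alpha > \dfrac{\alpha}{1 + 1} = \dfrac{\alpha}{2} > \dots > \dfrac{\alpha}{1 + \infty} = 0,$

so the value is strictly decreasing in $t$, and terminating at the earliest available time-step maximizes it.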