Ofer comments on A probabilistic off-switch that the agent is indifferent to

Ofer 25 Sep 2018 22:31 UTC
LW: 5 AF: 3
Wow, I agree!
Let us modify the utility for the case $f^{- 1} (y) = 0$ to:
$u^{*}(h) = \begin{cases} 0 & \text{if } h \text{ contains the "self-terminate" action} \\ u(h) & \text{otherwise} \end{cases}$
Meaning: no utility can be gained via subagents if the agent “jumps ship” (i.e. self-terminates in order to gain utility in the case $f^{-1}(y) \neq 0$).
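As a rough illustration of the modified utility, here is a minimal Python sketch. It assumes a history $h$ can be modelled as a sequence of action labels and that the base utility $u$ is an ordinary function of that sequence. The names `SELF_TERMINATE`, `modified_utility` and `toy_u` are hypothetical and only serve the example.

```python
from typing import Callable, Sequence

# Hypothetical label for the self-terminate action (an assumption of this sketch).
SELF_TERMINATE = "self-terminate"


def modified_utility(u: Callable[[Sequence[str]], float], h: Sequence[str]) -> float:
    """Return u*(h): 0 if h contains the self-terminate action, otherwise u(h)."""
    if SELF_TERMINATE in h:
        return 0.0
    return u(h)


def toy_u(h: Sequence[str]) -> float:
    """Toy base utility (illustrative only): one unit per 'work' action."""
    return float(sum(1 for action in h if action == "work"))
```

With this toy $u$, a history that includes the self-terminate action gets zero utility regardless of what else happens in it, including anything done after termination (for example by subagents), while any other history keeps its original utility.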
- Stuart_Armstrong 26 Sep 2018 8:44 UTC
  LW: 3 AF: 2
  Interesting. I’ll think about whether this works and can be generalised. It doesn’t make it reflectively stable, since creating $u$-maximising subagents is still allowed and doesn’t directly hurt the agent, but it might improve the situation.
  What links here?
  - A probabilistic off-switch that the agent is indifferent to by Ofer (25 Sep 2018 13:13 UTC; 11 points)