Good question. I discuss costless shutdown-prevention a bit in footnote 21 and section 21.4. What I say there is: if shutdown-prevention is truly costless, then the agent won’t prefer not to do it, but plausibly we humans can find some way to set things up so that shutdown-prevention is always at least a little bit costly.
Your example suggests that maybe this won’t always be possible. But here’s some consolation. If the agent satisfies POST, it won’t prefer not to costlessly prevent shutdown, but it also won’t prefer to costlessly prevent shutdown. It’ll lack a preference between the two, and so choose stochastically between them. So if the agent should happen to have many costless opportunities to affect the probability of shutdown at each timestep, it won’t reliably choose to delay shutdown rather than hasten it.
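A toy sketch of that last point, under assumptions that are not in the discussion above: each costless opportunity is modelled as an independent choice between "delay" and "hasten", and "choosing stochastically" is taken to mean picking "delay" with some fixed probability `p_delay` (0.5 here purely for illustration; POST itself does not fix this number). The function names are hypothetical.

```python
import random


def simulate_choices(n_opportunities, p_delay=0.5, seed=0):
    """Simulate an agent with no preference facing costless opportunities.

    Assumption (illustrative only): at each opportunity the agent picks
    "delay shutdown" with probability p_delay and "hasten shutdown"
    otherwise, independently of its earlier choices.
    Returns (number of delays, number of hastens).
    """
    rng = random.Random(seed)
    delays = sum(1 for _ in range(n_opportunities) if rng.random() < p_delay)
    return delays, n_opportunities - delays


def prob_always_delays(n_opportunities, p_delay=0.5):
    """Probability that the agent picks "delay" at every single opportunity."""
    return p_delay ** n_opportunities


if __name__ == "__main__":
    delays, hastens = simulate_choices(1000)
    print("delays:", delays, "hastens:", hastens)
    print("P(delay at all 10 of 10 opportunities):", prob_always_delays(10))
```

Under these assumptions, the chance that the agent delays shutdown at every one of `n` opportunities shrinks geometrically in `n`, which is one way to read "won't reliably choose to delay shutdown rather than hasten it".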