I think that stochastic choice does suffice for a lack of preference in the relevant sense. If the agent had a preference, it would reliably choose the option it preferred. And tabooing ‘preference’: I think stochastic choice between different-length trajectories makes it easier to train agents to satisfy Timestep Dominance, which is the property that keeps agents shutdownable. That’s because Timestep Dominance follows from stochastic choice between different-length trajectories together with a more general principle that we’ll train agents to satisfy because it’s a prerequisite for minimally sensible action under uncertainty. I discuss this in a little more detail in section 18.
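To make the connection concrete, here’s a minimal toy sketch (my own illustration, not from the paper): represent each option as a mapping from possible shutdown timesteps to the expected utility conditional on shutdown at that timestep. An option is timestep-dominated if some other option does at least as well conditional on every shutdown timestep and strictly better conditional on at least one; the agent discards dominated options and chooses stochastically among the rest. The representation, the names `timestep_dominates` and `choose`, and the uniform-random choice rule are all assumptions for illustration.

```python
import random

# Toy representation (an assumption, not from the original text):
# an option maps each possible shutdown timestep to the expected
# utility the agent gets conditional on shutdown at that timestep.
Option = dict[int, float]

def timestep_dominates(x: Option, y: Option) -> bool:
    """x timestep-dominates y iff x yields at least as much expected
    utility as y conditional on every shutdown timestep, and strictly
    more conditional on at least one."""
    timesteps = x.keys() | y.keys()
    at_least = all(x.get(t, 0.0) >= y.get(t, 0.0) for t in timesteps)
    strictly = any(x.get(t, 0.0) > y.get(t, 0.0) for t in timesteps)
    return at_least and strictly

def choose(options: list[Option]) -> Option:
    """Satisfy Timestep Dominance by discarding dominated options,
    then choose stochastically (here: uniformly) among the rest --
    e.g. among options whose trajectories differ in length."""
    undominated = [o for o in options
                   if not any(timestep_dominates(p, o) for p in options)]
    return random.choice(undominated)

# Example: a does at least as well as b conditional on every shutdown
# timestep and strictly better at timestep 2, so b is discarded; the
# agent then chooses stochastically between the undominated a and c.
a = {1: 1.0, 2: 2.0}
b = {1: 1.0, 2: 1.0}
c = {1: 2.0, 2: 0.0}
print(choose([a, b, c]))  # prints a or c, uniformly at random
```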
Yep, maybe that would’ve been a better idea!