If competent agents will always be choosing between same-length lotteries, then every competent agent can without loss of generality be assumed to satisfy Preferences Only Between Same-Length Trajectories (POST), right?
Not quite. ‘Competent agents will always be choosing between same-length lotteries’ is a claim about these agents’ credences, not their preferences. Specifically, the claim is that, in each situation, all available actions will entirely overlap with respect to the trajectory-lengths assigned positive probability. Competent agents will never find themselves in a situation where, for example, they assign positive probability to getting shut down in 1 timestep conditional on action A and zero probability to getting shut down in 1 timestep conditional on action B.
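The support-overlap claim about credences can be made concrete with a small sketch. Everything here is a hypothetical illustration: the actions, their probability assignments, and the helper `supports_overlap_entirely` are invented for the example, representing each available action as the agent's credence distribution over trajectory-lengths conditional on taking it.

```python
# Hypothetical representation: each action is a map from trajectory-length
# to the probability the agent assigns that length, conditional on the action.
action_A = {1: 0.3, 2: 0.7}   # positive probability on lengths 1 and 2
action_B = {1: 0.5, 2: 0.5}   # same support as A: lengths 1 and 2
action_C = {2: 1.0}           # different support: zero probability on length 1

def supports_overlap_entirely(actions):
    """True iff every action assigns positive probability to exactly the
    same set of trajectory-lengths (the claim about competent agents)."""
    supports = [{length for length, p in a.items() if p > 0} for a in actions]
    return all(s == supports[0] for s in supports)

print(supports_overlap_entirely([action_A, action_B]))  # True: a same-length-lottery choice
print(supports_overlap_entirely([action_A, action_C]))  # False: like the A/B shutdown case above
```

On this sketch, the claim is that a competent agent's choice situations always look like the A-versus-B case, never the A-versus-C case.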
That’s compatible with these competent agents violating POST by, for example, preferring some trajectory of length 2 to some trajectory of length 1.
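To make the compatibility point concrete, here is a hedged sketch: a hypothetical utility assignment that strictly ranks a length-2 trajectory over a length-1 trajectory, together with a simple check that such a ranking constitutes a POST violation. The trajectory names, utilities, and the `violates_post` helper are all assumptions for illustration, not part of the original argument.

```python
# Hypothetical utilities over individual trajectories, keyed by (label, length).
# Strictly preferring t2 (length 2) to t1 (length 1) violates POST, since POST
# permits preferences only between trajectories of the same length.
utility = {('t1', 1): 0.0, ('t2', 2): 1.0}

def violates_post(u):
    """True iff some pair of different-length trajectories is strictly ranked,
    i.e. the agent has a preference between different-length trajectories."""
    items = list(u.items())
    return any(len1 != len2 and v1 != v2
               for (_, len1), v1 in items
               for (_, len2), v2 in items)

print(violates_post(utility))  # True: a POST violation
```

An agent with these utilities violates POST, yet may still only ever face choices between lotteries with identical length-supports, which is the distinction the answer above draws between preferences and credences.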