Consider an agent navigating a tree MDP, with utility on the leaf nodes. At any internal node in the tree, ~most utility functions will have the agent retain options by going towards the branch with the most leaves. But all policies use up all available options—they navigate to a leaf with no more power.
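The claim about the leaf-rich branch can be checked with a quick Monte Carlo sketch. This is a hypothetical toy setup (the branch sizes 9 and 3 are arbitrary): leaf utilities are drawn i.i.d. uniform, and an optimal policy just walks to the argmax leaf, so its first move is into whichever branch contains that leaf.

```python
import random

# Toy tree MDP (hypothetical setup): a root with two irreversible
# branches, one with 9 leaves and one with 3. Optimal policies walk
# straight to the highest-utility leaf.
random.seed(0)
BIG, SMALL = 9, 3
N = 10_000
big_wins = 0
for _ in range(N):
    utilities = [random.random() for _ in range(BIG + SMALL)]
    # Argmax leaf index < BIG means the best leaf sits in the big branch.
    if max(range(BIG + SMALL), key=lambda i: utilities[i]) < BIG:
        big_wins += 1
# With i.i.d. leaf utilities the argmax lands in the option-rich branch
# with probability 9/12 = 0.75 -- yet every one of these optimal
# policies still ends at a leaf with no remaining options.
print(big_wins / N)  # ≈ 0.75
```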
I agree that we shouldn’t update too hard for other reasons. EG this post’s focus on optimal policies seems bad because reward is not the optimization target.
Fair enough, but in that example making irreversible decisions is unavoidable. What if we consider a modified tree such that one and only one branch is traversable in both directions, and utility can be anywhere?
I expect that the reversible branch is the most popular first move across the distribution of utility functions (but not necessarily that most utility functions strictly prefer it). That sounds like cause for optimism—‘optimal policies tend to avoid irreversible changes’.
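The popular-but-not-strictly-preferred distinction can also be sketched in code. Assumptions here are mine: three irreversible branches plus a reversible branch R with equal leaf counts, undiscounted utility, and "popularity" measured as the fraction of sampled utility functions for which a first move is consistent with some optimal policy. Entering R first never forecloses anything (the agent can back out), so it is always an optimal first move, while each irreversible branch is optimal only when it holds the argmax leaf.

```python
import random

# Hypothetical modified tree: three irreversible branches plus one
# reversible branch R, three leaves each. Entering R can be undone,
# so R-first is always consistent with some optimal policy.
LEAVES = {"B1": 3, "B2": 3, "B3": 3, "R": 3}
TOTAL = sum(LEAVES.values())

def optimal_first_moves(utilities):
    """Return the set of first moves consistent with some optimal policy.

    utilities lists the leaf utilities grouped by branch in the order
    B1, B2, B3, R. An irreversible first move is optimal iff that
    branch contains a maximizing leaf; R-first is always optimal."""
    best = max(utilities)
    moves = {"R"}
    i = 0
    for branch, n in LEAVES.items():
        if branch != "R" and best in utilities[i:i + n]:
            moves.add(branch)
        i += n
    return moves

random.seed(0)
counts = {b: 0 for b in LEAVES}
N = 10_000
for _ in range(N):
    u = [random.random() for _ in range(TOTAL)]
    for m in optimal_first_moves(u):
        counts[m] += 1
# counts["R"] == N: R-first is optimal for every utility function,
# while each irreversible branch is optimal only for ~3/12 = 25% of
# them -- R is the most popular first move, but a utility function
# with its argmax in B1 doesn't strictly prefer it.
```

So under uniform tie-breaking among optimal policies, R gets chosen most often even though no utility function *needs* it unless its best leaf lives there.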