There is an interesting addition to this, I think: if one goal of the utility function is to encourage exploration, then that goal paradoxically needs to be extremely robust against modification while the agent explores and possibly modifies all of its other goals. I could easily imagine an agent finding some mechanism for avoiding local maxima (i.e., exploration) important enough that it locks it in, so that the one thing it cannot stop doing is exploring well enough to avoid getting trapped and to keep looking for a global maximum.
This comment feels like it’s confusing strategies with goals? That is, I wouldn’t normally think of “exploration” as something that an agent had as a goal but as a strategy it uses to achieve its goals. And “let’s try out a different utility function for a bit” is unlikely to be a direction that a stable agent tries exploring in.
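For a concrete illustration of that strategy-vs-goal distinction (a hypothetical sketch, not anything from the original comments): in a standard epsilon-greedy bandit agent, the goal, maximizing reward, is fixed, and exploration lives entirely in the policy as the epsilon parameter. Turning exploration up or down never touches the reward function itself.

```python
import random

# Minimal sketch of exploration-as-strategy: the reward (goal) is fixed,
# and epsilon only controls how often the policy tries a random arm.

def run_bandit(arm_means, epsilon=0.1, steps=1000, seed=0):
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms          # pulls per arm
    estimates = [0.0] * n_arms     # running mean reward per arm
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:                 # explore: pick a random arm
            arm = rng.randrange(n_arms)
        else:                                      # exploit: pick the best estimate
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = rng.gauss(arm_means[arm], 1.0)    # noisy reward from the chosen arm
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total += reward
    return total, estimates

if __name__ == "__main__":
    total, estimates = run_bandit([0.2, 0.5, 0.9], epsilon=0.1)
    print(f"total reward: {total:.1f}, estimates: {[round(e, 2) for e in estimates]}")
```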