Let me substantiate my claim a bit with a random sample; I just pulled up a relative reachability blogpost. From the first paragraph (emphasis mine):
> An incorrect or incomplete specification of the objective can result in undesirable behavior like specification gaming or causing negative side effects. There are various ways to make the notion of a “side effect” more precise – I think of it as a *disruption of the agent’s environment* that is unnecessary for achieving its objective. For example, if a robot is carrying boxes and bumps into a vase in its path, breaking the vase is a side effect, because the robot could have easily gone around the vase. On the other hand, a cooking robot that’s making an omelette has to break some eggs, so breaking eggs is not a side effect.
But notice that we’re now talking about “disruption of the agent’s environment”. Relative reachability is indeed tackling the impact measure problem, so, using what we now understand, we might prefer to reframe this as:
> We think about “side effects” when they change our attainable utilities, so they’re really just a conceptual discretization of “things which negatively affect us”. We want the robot to prefer policies which avoid overly changing our attainable utilities. For example, if a robot is carrying boxes and bumps into a vase in its path, breaking the vase is a side effect, because it’s not that easy for us to repair the vase...
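To make the reframe concrete, here is a minimal Python sketch of an attainable-utility-change penalty. The function name `au_change_penalty`, the `attainable` lookup table, and the decrease-only (`max(0, ...)`) choice are all illustrative assumptions on my part, not the exact formulation from either post:

```python
def au_change_penalty(attainable, s_before, s_after):
    """Penalize an action that moves the world from s_before to s_after.

    `attainable[u][s]` is how much of auxiliary utility function u can
    still be attained starting from state s (e.g. a learned value
    function). These names and shapes are illustrative.
    """
    # Sum the drop in attainable utility across all auxiliary utilities.
    # max(0, ...) penalizes only losses, roughly analogous to how
    # relative reachability penalizes only lost reachability; an
    # absolute-value version would also penalize gains.
    return sum(
        max(0.0, attainable[u][s_before] - attainable[u][s_after])
        for u in attainable
    )


# Toy version of the vase example: breaking the vase destroys our
# ability to attain "vase intact", while box delivery is unaffected.
attainable = {
    "vase_intact": {"before": 1.0, "after": 0.0},  # can't un-break it
    "boxes_delivered": {"before": 1.0, "after": 1.0},
}
print(au_change_penalty(attainable, "before", "after"))  # 1.0
```

On this view, relative reachability drops out as a special case: take one auxiliary utility per state s′, equal to how easily s′ can be reached from the current state, and the penalty above becomes the lost reachability relative to what we had before.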