This:
Might agent b rewrite agent a’s brain to make agent a better satisfy agent b’s utility function? Most forms of wire-heading inherently limit the ability of agents to affect the future
and this
We have not proved that agent b does not try to affect agent a’s utility function (in fact, I expect in many cases agent b does try to influence agent a’s utility function).
appear to be in conflict. Are you trying to say that, depending on the circumstances, b may either try to influence a’s utility function or avoid doing so?