In that situation, how do we want an AI to act? There are a few options, but doing nothing seems like a good default. Constructing the value binding algorithm so that this is the resulting behaviour seems achievable, though perhaps not trivial.
(and I imagine that the kind of ‘my values bind to something, but in such a way that it’ll cause me to take very different actions than before’ I describe above is much harder to specify)
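To make the ‘do nothing by default’ behaviour concrete, here is a minimal sketch in Python, assuming a hypothetical `bind_values` step that returns a utility function when the agent’s values successfully bind to its world model and `None` when binding fails; every name here is illustrative, not a reference to any real system.

```python
from typing import Callable, Iterable, Optional

NOOP = "do_nothing"  # designated safe default action

def choose_action(
    world_model: dict,
    actions: Iterable[str],
    bind_values: Callable[[dict], Optional[Callable[[str], float]]],
) -> str:
    """Pick the highest-utility action under the bound values.

    If the values fail to bind to the world model at all, fall back
    to the no-op rather than optimising an unbound objective.
    """
    utility = bind_values(world_model)  # None signals binding failure
    if utility is None:
        return NOOP  # default: do nothing rather than guess
    return max(actions, key=utility)

# Toy usage: values that only bind when the world model still
# contains the concept they refer to.
def bind_values(world_model: dict) -> Optional[Callable[[str], float]]:
    if "diamond" not in world_model:
        return None  # concept missing: binding fails
    return lambda action: world_model["diamond"] if action == "guard" else 0.0

print(choose_action({"diamond": 1.0}, ["guard", "wander"], bind_values))  # guard
print(choose_action({}, ["guard", "wander"], bind_values))  # do_nothing
```

Falling back to `NOOP` on binding failure is what makes ‘do nothing’ the default here; the harder case in the parenthetical above would correspond to `bind_values` succeeding, but returning a utility function that ranks actions very differently from before.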