Instrumental value could also change due to information. For instance, if David and John learn that fewer vegetarians than we expected are looking to trade away sausage for mushroom, then that also updates our instrumental value for mushroom pizza.
In order for the argument to work in such situations, the contract/precommitment/self-modification will probably also need to allow for updating over time, e.g. by committing to a policy rather than a fixed set of preferences.
I think this point largely recontextualizes the argument.
The preference completion procedure described in this post seems more like augmenting the options’ terminal value with instrumental value, the latter being computed from beliefs about the probabilities of future opportunities for trade/exchange. As those beliefs change (which I think we should expect, or at least be open to, in most cases), the instrumental value changes, and so the completed terminal+instrumental preference ordering is updated along with it, leaving only the background skeleton of the terminal preference ordering untouched.
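A minimal sketch of this reading, under assumptions of my own rather than anything from the post: the option `deluxe`, the probabilities, and the rule "break terminal incomparability by believed chance of trading up" are all hypothetical. It only illustrates that the completed ordering between incomparable options can flip when beliefs change, while the terminal relations stay fixed.

```python
# Hypothetical terminal preferences as (worse, better) pairs.
# "mushroom" and "sausage" are terminally incomparable; both sit below "deluxe".
TERMINAL = {("mushroom", "deluxe"), ("sausage", "deluxe")}


def terminally_prefers(better, worse):
    """True if `better` is strictly above `worse` in the terminal partial order."""
    return (worse, better) in TERMINAL


def trade_up_chance(option, beliefs):
    """Believed probability that holding `option` can later be traded for
    something terminally better. Assumes the listed opportunities are
    mutually exclusive, so their probabilities can simply be summed."""
    return sum(
        p
        for (held, target), p in beliefs.items()
        if held == option and terminally_prefers(target, option)
    )


def completed_prefers(x, y, beliefs):
    """True if x ranks above y in the completed (terminal + instrumental) ordering."""
    if terminally_prefers(x, y):
        return True
    if terminally_prefers(y, x):
        return False
    # Terminally incomparable: fall back on instrumental value.
    return trade_up_chance(x, beliefs) > trade_up_chance(y, beliefs)


# Beliefs map (held option, trade target) to a probability of that trade arising.
beliefs_before = {("mushroom", "deluxe"): 0.6, ("sausage", "deluxe"): 0.2}
beliefs_after = {("mushroom", "deluxe"): 0.1, ("sausage", "deluxe"): 0.2}

print(completed_prefers("mushroom", "sausage", beliefs_before))
print(completed_prefers("mushroom", "sausage", beliefs_after))
```

The terminal comparisons (`deluxe` over either pizza) come out the same under both belief states; only the tie-break between the incomparable options moves.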
Since the completed ordering is instrumental to shifting probability mass up the poset skeleton, we get an entity that behaves like a myopic (timeslice?) expected utility maximizer but is not trying to guard/preserve its completed ordering from change. It would, however, try to preserve its terminal poset skeleton if given the opportunity, everything else being equal.
There might be some interesting complications if we allow the agent to take actions that can influence the probability of future trades, which would incentivize it to try to stabilize them in a state that maximizes the expected movement of the probability mass up the terminal preference poset. In that case, the case for the agent striving to complete itself into a stable EUMaximizer becomes stronger. (Although it is also the case that, in general, the ordering of probability distributions over a set induced by a partial ordering over this set is still a partial ordering, so it may not have a unique solution, in which case we’re back to square one.)
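To make the parenthetical concrete, here is a small sketch under one assumed formalization: a distribution P counts as at least as good as Q when P puts at least as much mass as Q on every upward-closed set of the poset (the usual stochastic dominance order). The four-element "diamond" poset and the distributions are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical diamond poset: bottom < a < top, bottom < b < top, a and b incomparable.
ELEMENTS = ["bottom", "a", "b", "top"]
BELOW = {("bottom", "a"), ("bottom", "b"), ("a", "top"), ("b", "top"), ("bottom", "top")}


def leq(x, y):
    """True if x is below or equal to y in the poset."""
    return x == y or (x, y) in BELOW


def up_sets():
    """All upward-closed subsets: if x is in the set, so is everything above x."""
    result = []
    for r in range(len(ELEMENTS) + 1):
        for subset in combinations(ELEMENTS, r):
            s = set(subset)
            if all(y in s for x in s for y in ELEMENTS if leq(x, y)):
                result.append(s)
    return result


UP_SETS = up_sets()


def dominates(p, q, tol=1e-12):
    """True if p puts at least as much mass as q on every up-set."""
    return all(
        sum(p.get(e, 0.0) for e in u) >= sum(q.get(e, 0.0) for e in u) - tol
        for u in UP_SETS
    )


p = {"bottom": 0.5, "a": 0.5}
q = {"bottom": 0.5, "b": 0.5}
r = {"a": 1.0}

print(dominates(r, p))                   # r moves mass from bottom up to a
print(dominates(p, q), dominates(q, p))  # neither dominates the other
```

Here `r` dominates `p`, but `p` and `q` are incomparable in either direction, so "move as much probability mass up the poset as possible" does not single out one best distribution.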