The first paragraph is my response to how you describe UDT in the post; I think the slightly different framing, where only the abstract algorithm is the agent, fits UDT better. It only makes the one decision of choosing the policy, but it doesn’t make commitments for itself, because it only exists for that single decision, influencing all the concrete situations where that decision (the policy) gets accessed/observed/computed (in part).
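To spell out what I mean by that single decision, here is a minimal sketch of the usual UDT-style policy selection, with the caveat that the notation (a policy π mapping observations O to actions A, a prior expectation E, and the logical-counterfactual condition “the abstract algorithm A outputs π”) is my own shorthand rather than anything from the post:

$$\pi^* \;=\; \operatorname*{arg\,max}_{\pi : O \to A} \; \mathbb{E}\big[\, U \;\big|\; A() = \pi \,\big]$$

The abstract algorithm makes this one choice, and every concrete instantiation of it then just looks up $\pi^*(o)$ for its observation $o$; there is no further step at which it could commit or re-decide anything.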
The second paragraph is the way I think about how to improve on UDT, but I don’t have a formulation of it that I like. Specifically, I don’t like making those past abstract agents an explicit part of a multi-step history with explicit stages, as in Logical Induction (or a variant that includes utilities). That seems too much of a kludge, and it doesn’t seem to have a prospect of describing coordination between different agents (with different preferences and priors).
Past stages should be able to take into account their influence on any computations at all that choose to listen to them, not just on things that were explicitly included as later stages, or as recipients of messages or causal observations, in some particular policy-formulation game. Influence on outcomes mediated purely through the choice of behavior that an abstract algorithm makes for itself also seems more in the spirit of UDT. The issue with UDT is that it tries to do too much in that single policy-choosing step, which it wants to be an algorithm but which mostly can’t be an actual algorithm, rather than working through smaller actual algorithms that form parts of a larger setting, interacting through choice of their own behavior and by observing each other’s behavior.
This sounds like how Scott formulated it, but as far as I know none of the actual (semi)formalizations look like this.