Updateless decisions are made by agents that know less, to an arbitrary degree. In UDT proper, there is no choice in how much an agent doesn’t know: you just pick the best policy from a position of maximal ignorance. It’s this policy that needs to respond to possible and counterfactual past/future observations, but the policy itself is no longer making decisions; the only decision was picking the policy.
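To make that concrete, here is a minimal sketch (not from the original comment; the function name udt_select_policy and its arguments are made up for illustration) of what “UDT proper” as described above amounts to: enumerate policies as maps from observations to actions, score each one from the prior alone, never update, and commit to the best-scoring policy.

```python
from itertools import product

def udt_select_policy(observations, actions, worlds, prior, utility):
    """Pick the policy with the best expected utility under the prior.

    observations: list of possible observations
    actions:      list of possible actions
    worlds:       list of world states
    prior:        dict world -> probability (never updated)
    utility:      function (world, policy) -> payoff, where policy is a
                  dict observation -> action; the environment is allowed to
                  depend on the whole policy (e.g. via a predictor)
    """
    # Enumerate every deterministic policy: one action per observation.
    policies = [dict(zip(observations, choice))
                for choice in product(actions, repeat=len(observations))]
    # Score each policy from the position of maximal ignorance (the prior).
    def score(policy):
        return sum(prior[w] * utility(w, policy) for w in worlds)
    return max(policies, key=score)
```

The only decision in the sketch happens at the final max; everything the chosen policy later “does” in response to observations is already fixed there.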
But in practice, knowing too little leads to an inability to actually compute (or even meaningfully “write down”) an optimal decision/policy, so it becomes necessary to forget less, and that leads to a decision about how much to forget. When you forget something, you turn into a different, more ignorant agent. So the choice to forget something in particular is a choice to turn into a particular other agent. More generally, you would interact with this other agent instead of turning into it. Knowing too little is also a problem when there is no clear abstraction of preference that survives the amnesia.
In this way, updateless decision making turns into acausal trade where you need to pick who to trade with. There is a change in perspective here: instead of making a decision personally, you choose whose decision to follow. The object-level decision itself is made by someone else, but you pick who to follow based on considerations other than the decision they make. This someone else could also be a moral principle, or common knowledge between yourself and another agent; the moral principle or common knowledge just needs to itself take the form of an agent. See also these comments.
UDT doesn’t really counter my claim that Newcomb-like problems are problems in which we can’t ignore the fact that our decision isn’t independent of the state of the world at the time we make it, even though in UDT we know less. To make this clear with the example of Newcomb’s problem: the policy we pick affects the prediction, which then affects the results of that policy when the decision is made. UDT isn’t ignoring the fact that our decision and the state of the world are tied together, even if it possibly represents this in a different fashion. The UDT algorithm takes this into account regardless of whether the UDT agent models it explicitly.
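As a hedged illustration of that dependence (the numbers are assumed for this example: a 99%-accurate predictor and the standard $1,000,000 / $1,000 boxes; none of this is from the original comment), here is the arithmetic when the prediction tracks the policy rather than the physical act:

```python
# Illustrative Newcomb's calculation with assumed numbers: the opaque box is
# filled based on a prediction of the agent's policy, so the payoff of the
# eventual decision already depends on which policy was picked.

ACCURACY = 0.99                      # assumed predictor accuracy
MILLION, THOUSAND = 1_000_000, 1_000

def expected_payoff(policy):
    """policy is 'one-box' or 'two-box'; the prediction follows the policy."""
    if policy == "one-box":
        # predicted correctly -> opaque box full; mispredicted -> empty
        return ACCURACY * MILLION + (1 - ACCURACY) * 0
    else:
        # predicted correctly -> opaque box empty; mispredicted -> full
        return ACCURACY * THOUSAND + (1 - ACCURACY) * (MILLION + THOUSAND)

print(expected_payoff("one-box"))    # roughly 990,000
print(expected_payoff("two-box"))    # roughly 11,000
```

The box contents depending on the policy is exactly the tie between the decision and the state of the world that the paragraph above says UDT takes into account.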
I’ll get to talking about UDT rather than TDT soon. I intend for my next post to be about Counterfactual Mugging and why this is such a confusing problem.
UDT still doesn’t forget enough. Variations on UDT that move towards acausal trade with arbitrary agents are more obviously needed because UDT forgets too much: that makes it impossible to compute in practice, and forgetting less poses the new issue of choosing a particular updateless-to-some-degree agent to coordinate with (or follow). But not forgetting enough can also be a problem.
In general, an external/updateless agent (whose suggested policy the original agent follows) can forget the original preference and pursue a different version of it that has undergone an ontological shift. It can likewise forget the world and its laws, as long as the original agent would still find it a good idea to follow its policy (in advance, based on the updateless agent’s nature, without looking at the policy). This updateless agent is shared among the counterfactual variants of the original agent that exist in the updateless agent’s ontology; it’s their chosen updateless core, the source of coherence in their actions.
How much do you think we should forget?