Furthermore, MUPI provides a new formalism that captures some of the core intuitions of functional decision theory (FDT) without resorting to its most problematic element: logical counterfactuals. FDT advises an agent to choose the action that would yield the best outcome if its decision-making function were to produce that output, thereby accounting for all instances of its own algorithm in the world. This enables FDT to coordinate and cooperate well with copies of itself. To do so, however, FDT must reason about what would have happened if its deterministic algorithm had produced a different output, a notion of logical counterfactuals that is not yet mathematically well-defined. MUPI achieves a similar outcome through a different mechanism: it treats universes, which contain the agent itself, as programs, while remaining epistemically uncertain about which universe it inhabits, including which policy it is itself running. As explained in Remark 3.14, from the agent’s internal perspective, it acts as if its choice of action decides which universe it inhabits, including which policy it is running. When it contemplates taking action a, it updates its beliefs w(λ | æ_{<t} a), effectively concentrating probability mass on universes compatible with taking action a. Because the agent’s beliefs about its own policy are coupled with its beliefs about the environment through structural similarities, this update lets the agent reason about how its choice of action relates to the behavior of other agents that are structurally similar to it. This “as if” decision-making process allows MUPI to manifest the sophisticated, similarity-aware behavior FDT aims for, but on the solid foundation of Bayesian inference rather than on yet-to-be-formalized logical counterfactuals.
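To make the "condition on your own action" step concrete, here is a minimal toy sketch in Python. It is not the paper's formalism: universes are reduced to pairs (my action, twin's action) in a one-shot Prisoner's Dilemma rather than programs, and the `similarity` parameter, the payoff numbers, and all function names are assumptions invented for illustration. It only shows how conditioning a coupled prior on a contemplated action shifts beliefs about a structurally similar agent.

```python
from itertools import product

ACTIONS = ("C", "D")
# My payoff in a standard Prisoner's Dilemma (illustrative numbers, not from the paper).
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def make_prior(similarity):
    """Prior over toy universes (my_action, twin_action).

    `similarity` (hypothetical parameter) is the prior probability that the
    twin's output matches mine; my own action is a priori 50/50.
    """
    prior = {}
    for mine, twin in product(ACTIONS, ACTIONS):
        match = similarity if mine == twin else 1.0 - similarity
        prior[(mine, twin)] = 0.5 * match
    return prior


def posterior_given_action(prior, action):
    """Keep only universes compatible with my taking `action`, then renormalize."""
    compatible = {u: p for u, p in prior.items() if u[0] == action}
    total = sum(compatible.values())
    return {u: p / total for u, p in compatible.items()}


def expected_utility(prior, action):
    """Expected payoff under the beliefs conditioned on the contemplated action."""
    posterior = posterior_given_action(prior, action)
    return sum(p * PAYOFF[u] for u, p in posterior.items())


def best_action(prior):
    """Action with the highest conditional expected utility."""
    return max(ACTIONS, key=lambda a: expected_utility(prior, a))


if __name__ == "__main__":
    for s in (0.9, 0.5):
        prior = make_prior(s)
        utilities = {a: round(expected_utility(prior, a), 3) for a in ACTIONS}
        print(s, utilities, best_action(prior))
```

With these payoffs the toy agent cooperates once `similarity` exceeds 5/7 and defects when its beliefs about the twin are uncorrelated with its own action. No counterfactual about a different algorithm output is evaluated at any point; the only operation is ordinary Bayesian conditioning.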
Could we then say that MUPI obtains acausal coordination from a causal decision theory? This has been suggested a few times in the history of Less Wrong.
Wake up babe, new decision theory just dropped!
Yes, it seems to be closer to UDT, but… updateful. So not that close to UDT. Really, it’s “just” a mathematically rigorous, embedded EDT.
Evidential decision theory allows acausal coordination.