There’s also MUPI now, which tries to sidestep logical counterfactuals:
FDT must reason about what would have happened if its deterministic algorithm had produced a different output, a notion of logical counterfactuals that is not yet mathematically well-defined. MUPI achieves a similar outcome through a different mechanism: the combination of treating universes, including itself, as programs, while having epistemic uncertainty about which universe it is inhabiting—including which policy it is itself running. As explained in Remark 3.14, from the agent’s internal perspective, it acts as if its choice of action decides which universe it inhabits, including which policy it is running. When it contemplates taking action a, it updates its beliefs w(λ | æ_{<t} a), effectively concentrating probability mass on universes compatible with taking action a. Because the agent’s beliefs about its own policy are coupled with its beliefs about the environment through structural similarities, this process allows the agent to reason about how its choice of action relates to the behavior of other agents that share those similarities. This “as if” decision-making process allows MUPI to manifest the sophisticated, similarity-aware behavior FDT aims for, but on the solid foundation of Bayesian inference rather than on yet-to-be-formalized logical counterfactuals.
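To make the "as if" mechanism concrete, here is a minimal toy sketch (all names and the payoff structure are my own hypothetical illustration, not from the paper): the agent holds a prior over candidate universes, each bundling a policy the agent might itself be running and the value that universe yields. Contemplating an action conditions the prior on universes whose policy would output that action, mimicking the update w(λ | æ_{<t} a), and the agent then compares the conditioned expected values.

```python
def as_if_values(universes, history):
    """Evaluate each action under the posterior over universes
    compatible with the agent taking that action.

    universes: list of (prior_weight, policy_fn, value_fn) triples.
    """
    actions = {"cooperate", "defect"}
    values = {}
    for a in actions:
        # Keep only universes whose policy outputs action a at this history.
        compatible = [(w, v) for (w, pol, v) in universes if pol(history) == a]
        total = sum(w for w, _ in compatible)
        if total == 0:
            continue  # no universe is compatible with taking a
        # Expected value under the renormalised posterior over universes.
        values[a] = sum(w * v(history, a) for w, v in compatible) / total
    return values

# Twin prisoner's dilemma: the agent is uncertain whether it is running the
# cooperating or the defecting policy; its twin runs the same program, so
# each universe's payoff is the symmetric outcome, (C,C)=3 or (D,D)=1.
universes = [
    (0.5, lambda h: "cooperate", lambda h, a: 3.0),  # both cooperate
    (0.5, lambda h: "defect",    lambda h, a: 1.0),  # both defect
]
print(as_if_values(universes, history=()))  # cooperating scores higher
```

Because the beliefs about the agent's own policy and about the twin are coupled (the same program appears in both), conditioning on "I cooperate" concentrates mass on the universe where the twin cooperates too, which is the similarity-aware behavior FDT aims for, recovered via ordinary conditioning.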
I’d love to see more engagement by MIRI folks as to whether this successfully formalizes a form of LDT or FDT.