UDT shows that decision theory is more puzzling than ever
I feel like MIRI perhaps mispositioned FDT (their variant of UDT) as a clear advancement in decision theory, whereas maybe they could have attracted more attention/interest from academic philosophy if the framing had instead been that the UDT line of thinking shows that decision theory is just more deeply puzzling than anyone had previously realized. Instead of one major open problem (Newcomb’s, or EDT vs CDT), now we have a whole bunch more. I’m really not sure at this point whether UDT is even on the right track, but it does seem clear that there are some thorny issues in decision theory that not many people were previously thinking about:
Indexical values are not reflectively consistent. UDT “solves” this problem by implicitly assuming (via the type signature of its utility function) that the agent doesn’t have indexical values. But humans seemingly do have indexical values, so what to do about that?
The commitment races problem extends into logical time, and it’s not clear how to make the most obvious idea of logical updatelessness work.
UDT says that what we normally think of as different approaches to anthropic reasoning are really different preferences, which seems to sidestep the problem. But is that actually right, and if so where are these preferences supposed to come from?
2TDT-1CDT: If there’s a population of mostly TDT/UDT agents and a few CDT agents (and nobody knows who the CDT agents are) and they’re randomly paired up to play one-shot PD, then the CDT agents do better than the TDT/UDT agents. What does this imply?
Game theory under the UDT line of thinking is generally more confusing than anything CDT agents have to deal with.
UDT assumes that the agent has access to its own source code and inputs as symbol strings, so it can potentially reason about logical correlations between its own decisions and other agents’ as well-defined mathematical problems. But humans don’t have this, so how are humans supposed to reason about such correlations?
Logical conditionals vs counterfactuals: how should these be defined, and do the definitions actually lead to reasonable decisions when plugged into logical decision theory?
These are just the major problems that I was trying to solve (or hoping for others to solve) before I mostly stopped working on decision theory and switched my attention to metaphilosophy. (It’s been a while so I’m not certain the list is complete.) As far as I know nobody has found definitive solutions to any of these problems yet, and most are wide open.
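To make the 2TDT-1CDT item above a bit more concrete, here is a minimal sketch of the expected payoffs. Everything specific in it is an assumption for illustration, not part of the original problem statement: the payoff values `T, R, P, S`, the 90% share of TDT/UDT agents, the large-population simplification that any opponent is a TDT/UDT agent with probability `p_tdt`, and the modeling choice that all TDT/UDT agents adopt one shared policy (cooperate only if that beats everyone defecting) while CDT agents always defect.

```python
# Hypothetical one-shot Prisoner's Dilemma payoffs, with T > R > P > S:
# T = defect against a cooperator, R = mutual cooperation,
# P = mutual defection, S = cooperate against a defector.
T, R, P, S = 5, 3, 1, 0


def expected_payoffs(p_tdt, tdt_cooperates):
    """Return (tdt_payoff, cdt_payoff) per game.

    Assumes a large population in which any opponent is a TDT/UDT agent
    with probability p_tdt, and CDT agents always defect.
    """
    if tdt_cooperates:
        # A TDT agent meets a cooperating TDT agent or a defecting CDT agent.
        tdt = p_tdt * R + (1 - p_tdt) * S
        # A CDT agent defects against a cooperating TDT agent or another CDT agent.
        cdt = p_tdt * T + (1 - p_tdt) * P
    else:
        # Everyone defects, so everyone gets the mutual-defection payoff.
        tdt = cdt = P
    return tdt, cdt


def tdt_policy(p_tdt):
    """Assumed shared TDT policy: cooperate iff that beats universal defection."""
    cooperate_value, _ = expected_payoffs(p_tdt, True)
    defect_value, _ = expected_payoffs(p_tdt, False)
    return cooperate_value > defect_value


if __name__ == "__main__":
    p = 0.9  # hypothetical share of TDT/UDT agents
    cooperate = tdt_policy(p)
    tdt, cdt = expected_payoffs(p, cooperate)
    print(f"TDT cooperates: {cooperate}, TDT payoff: {tdt:.2f}, CDT payoff: {cdt:.2f}")
```

Under these assumptions, whenever the TDT/UDT agents find it worthwhile to cooperate, the CDT agents come out ahead, since T > R and P > S. The sketch only illustrates the arithmetic; it says nothing about what the right response is.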