I suspect that it looks like some version of TDT / UDT, where TDT corresponds to something like trying to update on “being the kind of agent who outputs this action in this situation” and UDT corresponds to something more mysterious that I haven’t been able to find a good explanation of yet, but I haven’t thought about this much.
I can try to explain UDT a bit more if you say what you find mysterious about it. Or if you just want to think about it some more, keep in mind that UDT was designed to solve a bunch of problems at the same time, so if you see some feature of it that seems unmotivated, it might be trying to solve a problem that you haven’t focused on yet.
Another thing to keep in mind is that UDT is currently formulated mainly for AI rather than human use (whereas you seem to be thinking mostly in human terms). For example, it assumes that the agent has full “bit-level” access to its own source code, memories, and sensory data, which allows UDT to conceptualize a decision (the thing you’re deriving consequences from, or conditioning upon) as a logical fact about the input/output map implemented by a certain piece of code. It avoids human concepts like “being the kind of”, “agent”, or “situation”, which might be hard to fully specify and unambiguously translate into code. The downside is that it’s hard for humans (who do not have full introspective access to their own minds and do think in terms of high-level concepts) to apply UDT.
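To make the “logical fact about an input/output map” framing concrete, here is a minimal sketch in Python (the interface and names are my own illustration, not anything from the UDT posts): the agent scores each candidate output by supposing, as a mathematical fact, that its code maps the input it received to that output, and asking each world model what follows.

```python
# Minimal sketch (illustrative names): a UDT-style agent treats
# "program S maps input x to output a" as a logical fact and asks
# its world models what follows from that fact.

def udt_decide(x, possible_outputs, world_models, utility):
    """world_models: list of (prob, simulate) pairs, where simulate(io_map)
    returns the outcome of that world if S implements io_map.
    utility: maps an outcome to a real number."""
    def score(a):
        io_map = {x: a}  # the conjectured input/output map, as bare data
        return sum(p * utility(simulate(io_map))
                   for p, simulate in world_models)
    return max(possible_outputs, key=score)
```

Note that nothing in the sketch mentions “agents” or “situations”: the map is just data, which is what lets the same logical fact constrain predictors, copies, and simulations of S inside the world models.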
Even more than an explanation, I would appreciate an explanation on the LessWrong Wiki because there currently isn’t one! I’ve just reread the LW posts I could find about UDT and I guess I should let them stew for a while. I might also ask people at the current MIRI workshop for their thoughts in person.
Another thing to keep in mind is that UDT is currently formulated mainly for AI rather than human use (whereas you seem to be thinking mostly in human terms).
Only as an intuition pump; when it’s time to get down to brass tacks I’m much happier to talk about a well-specified program than a poorly-specified human.
I wrote a brief mathematical write-up of “bare bones” UDT1 and UDT1.1. It describes the versions that Wei Dai gave in his original posts, and doesn’t get into more advanced versions that invoke proof-length limits, try to “play chicken with the universe”, or otherwise develop how the “mathematical intuition module” is supposed to work.
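For what it’s worth, the difference between the two bare-bones versions fits in a few lines. Here is a sketch of the UDT1.1 step, under the (big) assumption that the mathematical intuition module is simply handed to us as an oracle that scores a whole candidate input/output map; the names are mine, not the write-up’s. UDT1 would fix the input actually received and optimize a single output; UDT1.1 searches over complete maps.

```python
from itertools import product

def udt1_1_decide(inputs, outputs, mim_expected_utility):
    """Sketch of UDT1.1: choose an entire input->output map.

    mim_expected_utility stands in for the mathematical intuition
    module: it scores a complete candidate map (dict input->output).
    """
    best_map, best_eu = None, float("-inf")
    for assignment in product(outputs, repeat=len(inputs)):
        candidate = dict(zip(inputs, assignment))  # one complete policy
        eu = mim_expected_utility(candidate)
        if eu > best_eu:
            best_map, best_eu = candidate, eu
    return best_map
```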
Without trying to make too much of the analogy, I think that I would describe TDT as “non-naive” CDT, and UDT as “non-naive” EDT.
In this write-up it really seems like all of the content is in how the mathematical intuition module works.

This is not much of an exaggeration. Still, UDT basically solves many toy problems where we get to declare what the output of the MIM is (“Omega tells you that …”).
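As a toy illustration of what “declaring the output of the MIM” means, here is one-shot Newcomb run through the udt1_1_decide sketch above, with Omega’s stipulation baked directly into the oracle (payoffs are the usual ones; everything else is illustrative):

```python
def newcomb_mim(candidate_map):
    # By stipulation ("Omega tells you that ..."), the prediction just
    # is whatever the candidate map outputs, so we can score it directly.
    action = candidate_map["newcomb"]
    box_b = 1_000_000 if action == "one-box" else 0  # filled iff predicted one-boxing
    box_a = 1_000 if action == "two-box" else 0      # two-boxers also take box A
    return box_a + box_b

print(udt1_1_decide(["newcomb"], ["one-box", "two-box"], newcomb_mim))
# {'newcomb': 'one-box'}
```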
Even more than an explanation, I would appreciate an explanation on the LessWrong Wiki because there currently isn’t one!
What kind of explanation are you looking for, though? The best explanation of UDT I can currently give, without some sort of additional information about where you find it confusing or how it should be improved, is in my first post about it, Towards a New Decision Theory.
Only as an intuition pump; when it’s time to get down to brass tacks I’m much happier to talk about a well-specified program than a poorly-specified human.
Ah, ok. Some people (such as Ilya Shpitser) do seem to be thinking mostly in terms of human application, so it seems a good idea to make the distinction explicit.
Are there any problems that (U|T)DT is designed to solve which are not one-shot problems? I apologize if this sounds like a stupid question, but I’m having some difficulty understanding all of the purported problems. Those I understand are one-shot problems like the Prisoner’s Dilemma and the Newcomb Problem. Is there anything like the Iterated Prisoner’s Dilemma for which (E|C)DT is inadequate but which (U|T)DT solves?