Wow, this is great work—congratulations! If it pans out, it bridges a really fundamental gap.
I’m still digesting the idea, and perhaps I’m jumping the gun here, but I’m trying to envision a UDT (or TDT) agent using the sense of subjective probability you define. It seems to me that an agent can get into trouble even if its subjective probability meets the coherence criterion. If that’s right, some additional criterion would have to be required. (Maybe that’s what you already intend? Or maybe the following is just muddled.)
Let’s try invoking a coherent P in the case of a simple decision problem for a UDT agent. First, define G <--> P(“G”) < 0.1. Then consider the 5&10 problem:
If the agent chooses A, payoff is 10 if ~G, 0 if G.
If the agent chooses B, payoff is 5.
And suppose the agent can prove the foregoing. Then unless I’m mistaken, there’s a coherent P with the following assignments:
P(G) = 0.1
P(Agent()=A) = 0
P(Agent()=B) = 1
P(G | Agent()=B) = P(G) = 0.1
And P assigns 1 to each of the following:
P(“Agent()=A”) < epsilon
P(“Agent()=B”) > 1-epsilon
P(“G & Agent()=B”) / P(“Agent()=B”) = 0.1 ± epsilon
P(“G & Agent()=A”) / P(“Agent()=A”) > 0.5
The last inequality is consistent with the agent indeed choosing B, because the postulated conditional probability of G makes the expected payoff given A less than the payoff given B.
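To make the arithmetic behind that last step concrete, here is a minimal sketch in Python. The specific probabilities in the loop are illustrative values I picked, not part of the setup above, and the tie-breaking rule (ties go to B) is likewise an assumption.

```python
# Expected payoff of each action, given a conditional probability of G.
# Only the payoffs (10 if ~G, 0 if G, and a flat 5 for B) come from the
# problem statement; everything else here is a hypothetical illustration.

PAYOFF_B = 5  # B pays 5 regardless of G


def expected_payoff_A(p_G_given_A: float) -> float:
    """A pays 10 if ~G and 0 if G."""
    return 10 * (1 - p_G_given_A) + 0 * p_G_given_A


def preferred_action(p_G_given_A: float) -> str:
    """Pick the action with the higher conditional expected payoff.

    Assumption: ties are broken in favour of B.
    """
    return "A" if expected_payoff_A(p_G_given_A) > PAYOFF_B else "B"


if __name__ == "__main__":
    # Illustrative values for P(G | Agent()=A).
    for p in (0.1, 0.5, 0.6, 0.9):
        print(p, expected_payoff_A(p), preferred_action(p))
```

Any value of P(G | Agent()=A) above 0.5 pushes the expected payoff of A below 5, so B comes out ahead, whereas the unconditional P(G) = 0.1 would have favoured A.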
Is that P actually incoherent for reasons I’m overlooking? If not, then we’d need something beyond coherence to tell us which P a UDT agent should use, correct?
I’ve also tried applying this theory to UDT, and have run into similar 5-and-10-ish problems (though I hadn’t considered making the reward depend on a statement like G; that’s a nice trick!). My tentative conclusion is that the reflection principle is too weak to have much bite when considering a version of UDT based on conditional expected utility. For all actions A that the agent doesn’t take, we have P(Agent() = A) = 0. We might still have P(“Agent() = A”) > 0 (but smaller than epsilon), but the reflection axioms do not need to hold conditional on Agent() = A: for X a reflection axiom, P can assign positive probability to, e.g., P(“X & Agent() = A”) / P(“Agent() = A”) < 0.9.
But it’s difficult to ask for more. In order to evaluate the expected utility conditional on choosing A, we need to coherently imagine a world in which the agent would choose A. If we also asked the probability distribution conditional on choosing A to satisfy the reflection axioms, then choosing A would not be optimal conditional on choosing A, which contradicts the agent choosing A. (We could have P(“Agent() = A”) = 0, but not if you have the agent playing chicken, i.e., play A if P(“Agent() = A”) = 0. If we have such a chicken-playing agent, we can coherently imagine a world in which it would play A, namely a world in which P(“Agent() = A”) = 0, but this is a world that assigns probability zero to itself. To make this formal, replace “world” by “complete theory”.)
I think applying this theory to UDT will need more insights. One thing to play with is a formalization of classical game theory:
Specify a decision problem by a function from (a finite set of) possible actions to utilities. This function is allowed to be written in the full formal language containing P(“.”).
Specify a universal agent which takes a decision problem D(.), evaluates the expected utility of every action (not in the UDT way of conditioning on Agent(D) = A, but by simply evaluating the expectation of D(A) under P(“.”)), and returns the action with the highest expected utility.
Specify a game by a payoff function, which is a function from pure strategy profiles (which assign a pure strategy to every player) to utilities for every player.
Given a game G(.), for every player, recursively define actions A_i := Agent(D_i) and decision problems D_i(a) := G_i(A_1, …, A_(i-1), a, A_(i+1), …, A_n), where G_i is the i’th component of G (i.e., the utility of player i).
Then, (A_1, …, A_n) will be a Nash equilibrium of the game G. I believe it’s also possible to show that for every Nash equilibrium, there is a P(.) satisfying reflection which makes the players play this NE, but I have yet to work carefully through the proof. (Of course we don’t want to become classical economists who believe in defection on the one-shot prisoner’s dilemma, but perhaps thinking about this a bit might help with finding an insight for making an interesting version of UDT work. It seems worth spending at least a bit of time on.)
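As a rough illustration of the fixed-point structure in that construction, here is a small Python sketch. It is heavily simplified and should not be read as the construction itself: it has no P(“.”) at all, handles pure strategies only, and just brute-forces the profiles in which every A_i is a maximizer of D_i(a) = G_i(A_1, …, a, …, A_n). The function names and the prisoner's dilemma payoff numbers are hypothetical choices of mine.

```python
from itertools import product


def is_nash(payoff, profile, actions):
    """True if no player can gain by unilaterally deviating.

    `payoff` maps a pure strategy profile (a tuple) to a tuple of
    utilities, one per player. For player i, d_i(a) plays the role of
    the decision problem D_i(a) with the other players held fixed.
    """
    for i, current in enumerate(profile):
        def d_i(a, i=i):
            return payoff(profile[:i] + (a,) + profile[i + 1:])[i]

        if any(d_i(a) > d_i(current) for a in actions):
            return False
    return True


def prisoners_dilemma(profile):
    """One-shot prisoner's dilemma with hypothetical payoff numbers."""
    table = {
        ("C", "C"): (2, 2),
        ("C", "D"): (0, 3),
        ("D", "C"): (3, 0),
        ("D", "D"): (1, 1),
    }
    return table[profile]


ACTIONS = ("C", "D")

# Brute-force search over all pure strategy profiles for two players.
equilibria = [
    p for p in product(ACTIONS, repeat=2)
    if is_nash(prisoners_dilemma, p, ACTIONS)
]

if __name__ == "__main__":
    print(equilibria)
```

With these payoffs the only profile that survives the check is mutual defection, which is the classical-economist answer mentioned above.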
It occurs to me that my references above to “coherence” should be replaced by “coherence & P(T)=1 & reflective consistency”. That is, there exists (if I understand correctly) a P that has all three properties, and that assigns the probabilities listed above. Therefore, those three properties would not suffice to characterize a suitable P for a UDT agent. (Not that anyone has claimed otherwise.)