The CDT and FDT agents have the same utility functions yet behave differently. Of course, if you gave them different, tailored utility functions you could get them to behave the same in any given case, but that doesn’t seem very sensible, imo.
What I mean is that you can think of “CDT agent with a certain utility function” and “FDT agent” as exactly the same concept. So when you say “I don’t think it is approximating FDT. I think it is just different values,” I reply that “different values” and “approximating FDT” are the exact same thing, at least when the “different values” in question are justice, trust, and honor, in my opinion.
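To make that concrete, here’s a minimal toy sketch of the payment step in a hitchhiker-style problem (all numbers and names are mine, purely illustrative): a CDT agent whose utility function charges it enough for dishonesty picks the same action FDT does.

```python
# Toy payment step: the agent has already been rescued and decides
# whether to honor its promise to pay. (Illustrative numbers only.)
PRICE = 1_000  # hypothetical promised payment, in dollars

def fdt_pays():
    # FDT pays: being the kind of agent that pays is what got it rescued.
    return True

def cdt_pays(honesty_value):
    # Plain CDT at this step only sees the causal cost of paying --
    # unless breaking its word costs it honesty_value in utility.
    return honesty_value > PRICE

# A CDT agent that values honesty more than the price acts exactly like FDT.
assert cdt_pays(honesty_value=PRICE + 1) == fdt_pays()
```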
So suppose both Derek and Will were running CDT, but they valued honesty at infinity dollars.
In that case, Derek could demand infinite dollars from Will and Will would pay it.
Well, Will only values his life at a million dollars, so he would rather die than pay more than that. I admit that when I wrote that bit I was mentally conflating “one million” and “infinity” to simplify the reasoning. Hopefully none of the other shortcuts I’m using break anything.
Intuition says the “infinity” here comes from Derek and Will’s infinitely accurate predictions. As in, if the predictions were less than perfectly accurate, you would need less than infinity dollars of honesty-value to make CDT act like FDT. Dunno if that’s true, and it doesn’t matter much either way, so, whatever.
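Backing up to the million-dollar cap, here’s the arithmetic as a sketch (the decision rule is my paraphrase of the setup): no honesty-value, however large, lets Derek extract more than Will’s life is worth, because at the point of agreeing Will compares paying-and-living against refusing-and-dying.

```python
# Assumed decision rule for Will when Derek names a price X: agree iff
# (value of life) - X >= (value of dying) = 0. Derek's demand is thus
# capped at Will's life-value of $1,000,000, not infinity, no matter
# how much Will values honesty once a promise has been made.
LIFE_VALUE = 1_000_000

def will_accepts(price):
    return LIFE_VALUE - price >= 0

assert will_accepts(LIFE_VALUE)          # pay $1M and break even
assert not will_accepts(LIFE_VALUE + 1)  # rather die than pay more
```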
[the rest of the reply]
I should have clarified more, oops. I was talking about a minor variation of the scenario where the “negotiation is not possible” restriction is lifted (while somehow keeping the information asymmetry). In that case, with no other changes, the problem is basically the same: Derek just says “btw, I swear on God almighty that I am not negotiating at all, since this way I get the best outcomes,” and then the rest of the scenario plays out the same (as long as we posit that Will’s memory of this exchange is magically erased, so FDT-Will doesn’t consider changing his behavior to get a better deal).
Meanwhile, if Derek’s $1,000,000 value on honesty is set to $0 but he uses FDT, then the exact same thing happens, absent any weird commitment-race dynamics with FDT-Will.
And if Derek has $0 honesty-value and is CDT, then when he says “btw, no negotiation,” FDT-Will can say “no, screw you: we’re negotiating, or I swear I will bury my head in the sand and die,” and Derek will say “oh ok, I can tell that you will keep your promise, never mind then, let’s negotiate.” FDT-Will then says “give me $0.99 and I’ll let you save my life,” and poor CDT-Derek will agree.
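For completeness, CDT-Derek’s causal comparison at that last step, under my assumption that he places no value on Will’s survival and has $0 honesty-value:

```python
# Once FDT-Will's commitment is credible, CDT-Derek compares outcomes:
payoff_if_accept = 0.99  # take Will's $0.99 offer and save him
payoff_if_refuse = 0.00  # Will follows through and dies; Derek gets nothing
assert payoff_if_accept > payoff_if_refuse  # so poor CDT-Derek agrees
```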
The FDT-Derek + FDT-Will case is probably important, but it scares me and I don’t know how to reason about it. Probably with geometric utility. In that case, if we add a rule saying Derek also gets $1,000,000 of utility from being alive, FDT-Derek pays 50 cents to FDT-Will to maximize the logarithm of utility: Derek ends up with $1,000,000 + $0.50 and Will ends up with $1,000,000 + $0.50, and since the total is fixed, making the two numbers equal maximizes their product, which is the best possible output for the function (we are ignoring the honesty-utility here).
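Here’s the arithmetic I’m leaning on, sketched out (the $1 of divisible surplus is my own assumption, picked to reproduce the 50-cent figure). Maximizing the sum of log-utilities is the same as maximizing the product, and for a fixed total the product peaks at the even split:

```python
# Work in integer cents so comparisons are exact: each agent has a
# $1,000,000 base, and they split 100 cents of surplus (assumed numbers).
BASE_CENTS = 100_000_000
SURPLUS_CENTS = 100

def utility_product(c):  # c = Will's share of the surplus, in cents
    return (BASE_CENTS + SURPLUS_CENTS - c) * (BASE_CENTS + c)

best = max(range(SURPLUS_CENTS + 1), key=utility_product)
print(best)  # 50: the even split maximizes the product of utilities
```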
I don’t have the time to give this full consideration, but on the whole I think you are correct if Will has the information asymmetry in both the negotiation phase and the payment phase, whereas I was implicitly assuming Will has full information in the negotiation phase and suddenly gains an information asymmetry in the payment phase (which doesn’t make much sense). So, yeah, I think I agree.