I agree, though if we’re defining rationality as a preference for better methods, I think we ought to further disambiguate between “a decision theory that will dissolve apparent conflicts between what we currently want our future selves to do and what those future selves actually want to do” and “practical strategies for aligning our future incentives with our current ones”.
Suppose someone tells you that they’ll offer you $100 tomorrow and $10,000 today if you make a good-faith effort to prevent yourself from accepting the $100 tomorrow. The best outcome would be to make a genuine attempt to disincentivize yourself from accepting the money tomorrow, but fail and accept the money anyway; however, you can’t actually try to make that happen without violating the terms of the deal.
If your effort to constrain your future self on day one does fail, I don’t think there’s a reasonable decision theory that would argue you should reject the money anyway. On day one, you’re being paid to temporarily adopt preferences misaligned with your preferences on day two. You can try to make that change in preferences permanent, or build an incentive structure to enforce that preference, or maybe even strike an acausal bargain with your day-two self, but if all of that fails, you ought to go ahead and accept the $100.
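For concreteness, here’s the outcome space of that deal (a toy enumeration; the labels and the way I’ve carved up the cases are mine, and I’m leaving the probability that a genuine effort fails abstract):

```python
# Toy enumeration of the deal's outcomes; the labels are illustrative.
# "effort" = you genuinely try to block your future self on day one;
# "binds"/"fails" = whether that attempt actually stops you on day two.

outcomes = {
    ("effort", "binds", "cannot accept $100"): 10_000,
    ("effort", "fails", "accept $100"):        10_000 + 100,  # best: $10,100
    ("effort", "fails", "reject $100"):        10_000,
    ("no effort", "-", "accept $100"):         100,
}

for outcome, dollars in sorted(outcomes.items(), key=lambda kv: -kv[1]):
    print(f"${dollars:>6,}  {outcome}")
```

The $10,100 row is exactly the one you can’t aim for: reaching it requires the day-one attempt to be genuine, which rules out planning for it to fail.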
I think coordination problems are a lot like that. They reward you for adopting preferences genuinely at odds with those you may have later on. And what’s rational according to one set of preferences will be irrational according to another.
If your effort to constrain your future self on day one does fail, I don’t think there’s a reasonable decision theory that would argue you should reject the money anyway
That’s one of the things motivating UDT. On day two, you still ask what global policy you should follow (one that, in particular, covers your actions in the past and in the counterfactuals relative to what you actually observe in the current situation). Then you see where and when you actually are, note what you actually observe, and enact whatever the best policy says to do in the current situation. You don’t constrain yourself on day one, but you still enact the global policy on day two.
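A minimal sketch of that, under an assumption of my own that isn’t in the comment above: treat the “good-faith effort” as being credited exactly when your actual day-two policy is to turn the $100 down, so the deal-maker is effectively reading your policy rather than a one-off day-one action:

```python
# Sketch of policy-level choice for the two-day deal; all names are illustrative.
# Assumption (mine): the day-one $10,000 is paid exactly when the agent's real
# day-two policy is to reject the $100.

from itertools import product

OBSERVATIONS = ["day_one", "day_two_offer_stands"]
ACTIONS = {
    "day_one": ["signal_effort", "no_effort"],
    "day_two_offer_stands": ["accept_100", "reject_100"],
}

def payoff(policy):
    """Total payout from following `policy` across the whole scenario."""
    total = 0
    # The effort only counts as genuine if the policy really would reject.
    if policy["day_one"] == "signal_effort" and policy["day_two_offer_stands"] == "reject_100":
        total += 10_000
    # The $100 is collected iff the policy accepts it on day two.
    if policy["day_two_offer_stands"] == "accept_100":
        total += 100
    return total

# Enumerate global policies (maps from observation to action) and pick the best.
policies = [dict(zip(OBSERVATIONS, choice))
            for choice in product(*(ACTIONS[o] for o in OBSERVATIONS))]
best = max(policies, key=payoff)

print(best, payoff(best))
# {'day_one': 'signal_effort', 'day_two_offer_stands': 'reject_100'} 10000
```

On day two the agent just looks up best["day_two_offer_stands"] and enacts it; nothing was locked in on day one. Under this particular formalization the winning policy rejects the $100, which is exactly where evaluating the whole policy diverges from evaluating day two in isolation.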
I think coordination problems are a lot like that. They reward you for adopting preferences genuinely at odds with those you may have later on.
Adopting preferences is a lot like enacting a policy, but when enacting a policy you don’t need to adopt preferences: a policy is something external, an algorithmic action (instead of choosing Cooperate, you choose to follow some algorithm that decides what to do, even if that algorithm gets no further input). Contracts in the usual sense act like that, and assurance contracts are an example where you are explicitly establishing coordination. You can judge an algorithmic action like you judge an explicit action, but there are more algorithmic actions than there are explicit actions, and algorithmic actions taken by you and your opponents can themselves reason about each other, which enables coordination.
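As a toy illustration of algorithmic actions reasoning about each other (the bots and the payoff numbers below are the standard one-shot Prisoner’s Dilemma setup, not anything from the discussion above):

```python
# One-shot Prisoner's Dilemma where, instead of submitting Cooperate/Defect
# directly, each side submits a program that can inspect the other side's
# program before acting. The names (clique_bot, run_match) are illustrative.

import inspect

def clique_bot(opponent_source: str) -> str:
    """Cooperate exactly when the opponent runs this same algorithm."""
    return "C" if opponent_source == inspect.getsource(clique_bot) else "D"

def defect_bot(opponent_source: str) -> str:
    """Ignore the opponent entirely and always defect."""
    return "D"

# (my move, their move) -> my payoff, with the usual PD ordering.
PAYOFFS = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def run_match(p1, p2):
    s1, s2 = inspect.getsource(p1), inspect.getsource(p2)
    m1, m2 = p1(s2), p2(s1)
    return PAYOFFS[(m1, m2)], PAYOFFS[(m2, m1)]

print(run_match(clique_bot, clique_bot))  # (3, 3): mutual cooperation
print(run_match(clique_bot, defect_bot))  # (1, 1): no exploitation
```

Submitting clique_bot rather than a bare move is an algorithmic action in the sense above: it can be judged by its payoff just like a plain Cooperate or Defect, but because the programs can read each other, mutual cooperation becomes reachable without either side changing what it ultimately wants.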