Does any process in which they ended up the way they did without considering your decision procedure count as #2? Like, suppose almost all the other agents it expects to encounter are CDT agents that do give in to extortion, and it thinks the risk of nuclear war with the occasional rock or UDT agent is worth it.
almost all the other agents it expects to encounter are CDT agents
Given this particular setup (each of you gets the other’s source code, and you decide simultaneously, with no means of verifying the counterparty’s choice until the outcomes happen), you shouldn’t self-modify into an extortionist, because CDT agents always defect: no amount of reasoning about source code can causally affect your decision, and D-D is a Nash equilibrium. CDT agents could expect, with high probability, to meet an extortionist in the future and self-modify into a weird Son-of-CDT agent that gives in to extortion, but for this setup to work in any non-trivial way you should be at least EDT-ish.
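To make the "D-D is a Nash equilibrium" claim concrete, here is a minimal sketch that brute-forces the equilibria of a one-shot Prisoner's Dilemma. The payoff numbers and the names `PAYOFFS`, `is_nash` and `nash_equilibria` are illustrative assumptions of mine, not something fixed by the setup; any payoffs with the usual PD ordering should give the same result.

```python
# Hypothetical Prisoner's Dilemma payoffs: (row player, column player).
# The specific numbers are assumptions chosen only to have the standard PD ordering.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}
ACTIONS = ("C", "D")


def is_nash(profile):
    """True if neither player gains by unilaterally switching actions."""
    a, b = profile
    u_row, u_col = PAYOFFS[profile]
    # Row player's best alternative, holding the column player's action fixed.
    row_ok = all(PAYOFFS[(alt, b)][0] <= u_row for alt in ACTIONS)
    # Column player's best alternative, holding the row player's action fixed.
    col_ok = all(PAYOFFS[(a, alt)][1] <= u_col for alt in ACTIONS)
    return row_ok and col_ok


def nash_equilibria():
    """List every pure-strategy profile that is a Nash equilibrium."""
    return [profile for profile in PAYOFFS if is_nash(profile)]


if __name__ == "__main__":
    print(nash_equilibria())
```

With these payoffs the only profile that survives the check is mutual defection, which is the outcome a purely causal reasoner ends up at.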
But yes, the general principle here is “evaluate how much the other player’s decision procedure is logically influenced by my decision procedure, calculate the expected value, and act accordingly”. The same holds when you are deciding about self-modification.
For example, if you think that modifying into an extortionist is a good policy, it can lead to a situation where everyone is an extortionist and everybody nukes each other.
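A toy version of that calculation, as a rough sketch under loudly stated assumptions: the opponent is one of three hypothetical types ("gives in", "refuses" like a rock, or "mirror", meaning its procedure is logically tied to mine and so adopts whatever policy I adopt), and the payoffs `WIN` and `WAR` are made-up numbers. None of these names or values come from the discussion itself.

```python
# Hypothetical payoffs, chosen only for illustration.
WIN = 5.0     # I extort and the other agent gives in
WAR = -100.0  # I extort and the other agent refuses or extorts back
PEACE = 0.0   # I refrain from extortion

POLICIES = ("extort", "refrain")


def expected_value(policy, p_mirror, p_gives_in):
    """Expected payoff of a policy against a mixed population.

    p_mirror:   probability the opponent's procedure mirrors mine
                (a stand-in for "logical influence").
    p_gives_in: probability the opponent gives in to extortion.
    The remaining probability mass is agents that always refuse.
    """
    p_refuses = 1.0 - p_mirror - p_gives_in
    if p_refuses < -1e-9:
        raise ValueError("probabilities must sum to at most 1")
    if policy == "extort":
        # A mirrored opponent extorts too, so that branch also ends in war.
        return p_gives_in * WIN + (p_mirror + p_refuses) * WAR
    # Simplifying assumption: refraining always yields the peaceful payoff.
    return PEACE


def best_policy(p_mirror, p_gives_in):
    """Pick the policy with the highest expected value."""
    return max(POLICIES, key=lambda p: expected_value(p, p_mirror, p_gives_in))


if __name__ == "__main__":
    # Almost everyone gives in, nobody mirrors me.
    print(best_policy(p_mirror=0.0, p_gives_in=0.99))
    # Everyone runs my decision procedure.
    print(best_policy(p_mirror=1.0, p_gives_in=0.0))
```

In this toy model, extortion looks worthwhile when nearly everyone gives in and nobody is correlated with you, and it stops looking worthwhile as soon as enough of the population reasons the way you do, which is the "everybody nukes each other" case.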