So as I understand it, your (and the MIRI/LW) frame is that:
A choice is not made, it is “discovered” (i.e. the choice the agent is determined to make is revealed to it after it runs a certain “choice-making” procedure). This process is internally indistinguishable from “actually choosing”, because we couldn’t know the result of the choice-making computation before doing it. However, an external system *could* know this, for example by simulating us (see the sketch after this list).
There are certain choices we should be more or less happy to discover, or “make” in this sense. We should be happier to have choice-making procedures that result in happy choices.
Correct decision theory specifies the best choice-making procedures.
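To make the first point concrete, here is a minimal sketch (hypothetical names throughout) of a deterministic choice-making procedure and an external predictor that learns the agent’s choice by simulating that same computation; the agent itself only learns the choice by running it.

```python
def decision_procedure(options, utility):
    """Deterministic choice-making procedure: the agent learns its own
    choice only by running this computation."""
    return max(options, key=utility)

def external_predictor(agent, options, utility):
    """An outside system can 'know the choice in advance' simply by
    simulating the same procedure the agent will run."""
    return agent(options, utility)

options = ["A", "B", "C"]
utility = {"A": 1, "B": 5, "C": 3}.get

# The prediction and the eventual choice necessarily agree, because they
# are outputs of the same computation.
assert external_predictor(decision_procedure, options, utility) == \
       decision_procedure(options, utility)
```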
I think the issue here is with moving down a level of abstraction to a new “map” in a way that makes the entire ontology of decision theory meaningless.
Yes, on some level, we are just atoms following the laws of physics. There are no “agents”, “identities”, or “decisions”. We can just talk about which configurations of atoms we prefer, and agree that we prefer the configuration of atoms where we get more money.
This is not the correct level for thinking about decision theory—we don’t think about any of our decisions that way. Decision theory is about determining the output of the specific choice-making procedure “consider all available options and pick the best one in the moment”. This is the only sense in which we appear to make choices—insofar as we make choices, those choices are over actions.
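As an illustration (not from the post; names and numbers are made up), “consider all available options and pick the best one in the moment” can be read as an argmax over actions by estimated expected utility:

```python
import random

def choose_in_the_moment(actions, sample_outcome, utility, n=10_000):
    """Pick the action with the highest estimated expected utility right now."""
    def expected_utility(action):
        return sum(utility(sample_outcome(action)) for _ in range(n)) / n
    return max(actions, key=expected_utility)

# Toy example: a safe action versus a risky one with higher expected value.
actions = ["safe", "risky"]
sample_outcome = lambda a: 10 if a == "safe" else random.choice([0, 30])
utility = lambda outcome: outcome

print(choose_in_the_moment(actions, sample_outcome, utility))  # almost always "risky"
```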
A choice is not made, it is “discovered” … choices we should be more or less happy to discover
The decision procedure is what’s making the choices. The diagonalization example was meant to illustrate that even an oracle predicts only at the pleasure of the decision procedure, and only the decision procedure gets to determine the choice. Nothing else gets to dictate to the decision procedure what the choice is; it’s not following a predetermined destiny, instead the destiny has no choice but to obey the decision procedure. Also, decision procedures are us: when making our own choices, we are the decision procedures, not some external additional things.
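A minimal sketch of the diagonalization point (hypothetical names): a procedure that, handed an announced prediction, does the opposite. No announced prediction can be correct, so a correct oracle can only “predict” by deferring to whatever the procedure actually outputs.

```python
def contrarian_procedure(announced_prediction):
    """Do the opposite of whatever the oracle announces."""
    return "two-box" if announced_prediction == "one-box" else "one-box"

for announced in ["one-box", "two-box"]:
    actual = contrarian_procedure(announced)
    print(f"announced={announced} actual={actual} correct={announced == actual}")
# announced=one-box actual=two-box correct=False
# announced=two-box actual=one-box correct=False
```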
Decision theory is about determining the output of the specific choice-making procedure “consider all available options and pick the best one in the moment”.
Sounds like a reasonable decision procedure. Except that most of the options inevitably don’t get chosen and don’t actually happen; that’s just how it is. You get to choose which one is actual, and you are free to do so as you wish, since nothing but you determines which one that is, and all the oracles and laws of physics and transistors have to comply with whatever you choose (because that’s just what it means to predict/instantiate/execute you correctly).
This is not the correct level for thinking about decision theory—we don’t think about any of our decisions that way. Decision theory is about determining the output of the specific choice-making procedure “consider all available options and pick the best one in the moment”.
Act only according to that maxim whereby you can at the same time will that it should become a universal law.
I don’t think this is incompatible with making the best decision in the moment. You just decide in the moment to go with the more sophisticated version of the categorical imperative, because that seems best?
If I didn’t reason like this, I would not vote, and I would have a harder time sticking to commitments.
I agree thinking about decisions in a way that is not purely greedy is complicated.
The categorical imperative has been popular for a long while.
I think Rationalists have stumbled into reasonable beliefs about good strategies for iterated games/situations where reputation matters and people learn about your actions. But you don’t need exotic decision theories for that.
I address this in the post:
...makes sense under two conditions:
Their cooperative actions directly cause desirable outcomes by making observers think they are trustworthy/cooperative.
Being deceptive is too costly, either because it’s literally difficult (requires too much planning/thought), or because it makes future deception impossible (e.g. because of reputation and repeated interactions).
Of course, whether or not we have some free will, we are not entirely free—some actions are outside of our capability. Being sufficiently good at deception may be one of these. Hence one might rationally decide to always be honest and cooperative—successfully pretending to be so only when others are watching might be literally impossible (and messing up once might be very costly).
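To illustrate the two conditions with made-up numbers (not from the post): defecting gives a one-off gain, but if the deception is detected, future partners stop cooperating. When detection is likely enough and enough rounds remain, always cooperating wins on purely causal grounds.

```python
def expected_value(strategy, rounds=10, gain_defect=3, gain_cooperate=2,
                   p_detect=0.5):
    """Expected total payoff of a fixed strategy under a reputation model."""
    total = 0.0
    reputation_intact = 1.0  # probability we are still trusted
    for _ in range(rounds):
        if strategy == "cooperate":
            total += gain_cooperate
        else:  # defect while pretending to cooperate
            total += reputation_intact * gain_defect
            reputation_intact *= (1 - p_detect)  # caught -> no more partners
    return total

print(expected_value("cooperate"))  # 20.0
print(expected_value("defect"))     # ~5.99
```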
How does your purely causal framing escape backward induction? Pure CDT agents defect in the iterated version of the prisoner’s dilemma too, since at the last time step you wouldn’t care about your reputation, and that reasoning unravels all the way back to the first round.
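A minimal sketch of that unravelling (standard PD payoffs, hypothetical function names): reasoning from the known last round backwards, the continuation value never depends on the current action, so defection dominates in every round.

```python
T, R, P, S = 5, 3, 1, 0  # temptation > reward > punishment > sucker

def best_response_value(my_action, opp_action, future_value):
    one_shot = {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}
    return one_shot[(my_action, opp_action)] + future_value

def unravel(rounds):
    plan = []
    future_value = 0  # value of the continuation game, same for both actions
    for _ in range(rounds):  # reason from the last round backwards
        # The opponent, reasoning the same way, defects; future_value does not
        # depend on my current action, so defection dominates this round too.
        defect = best_response_value("D", "D", future_value)
        coop = best_response_value("C", "D", future_value)
        plan.append("D" if defect >= coop else "C")
        future_value += P  # both defect, so each round adds the punishment payoff
    return plan[::-1]

print(unravel(5))  # ['D', 'D', 'D', 'D', 'D']
```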
In conclusion, if you find yourself freely choosing between options, it’s rational to take a dominating strategy, like two-boxing in Newcomb’s problem, or defecting in the one-shot prisoner’s dilemma. However, given the opportunity to actually pre-commit to decisions that get you better outcomes conditional on that pre-commitment, you should do so.
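For concreteness, a minimal sketch (standard Newcomb payoffs, hypothetical names) of the two cases this conclusion distinguishes: for fixed box contents two-boxing dominates, but a pre-commitment made before the predictor fills the boxes does better by one-boxing.

```python
OPAQUE_FULL, OPAQUE_EMPTY, TRANSPARENT = 1_000_000, 0, 1_000

def payoff(action, opaque):
    """Payoff given the action taken and the contents of the opaque box."""
    return opaque + (TRANSPARENT if action == "two-box" else 0)

# Dominance: whatever is already in the opaque box, two-boxing pays more.
for opaque in (OPAQUE_FULL, OPAQUE_EMPTY):
    assert payoff("two-box", opaque) > payoff("one-box", opaque)

# Pre-commitment: the predictor sees the commitment and fills accordingly.
def committed_payoff(commitment):
    opaque = OPAQUE_FULL if commitment == "one-box" else OPAQUE_EMPTY
    return payoff(commitment, opaque)

print(committed_payoff("one-box"))  # 1000000
print(committed_payoff("two-box"))  # 1000
```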
How do you tell if you are in a “pre-commitment” or in a defecting situation?