I think both of your points are addressed by the point MacAskill makes: perhaps in some cases it’s best to be the type of agent that follows functional decision theory. Sometimes rationality will be bad for you—if there’s a demon who tortures all rational people, for example. And as Schwarz points out, in the twin case, you’ll get less utility by following FDT—you don’t always want to be an FDTist.
I find your judgment about the blackmail case crazy! Yes, agents who give in to blackmail do worse on average. Yes, you want to be the kind of agent who never gives in to blackmail. But all of those are consistent with the obvious truth that refusing to give in, once you’re in that scenario, makes things worse for you and is clearly irrational.
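To spell out how those three claims can all be true at once, here is a minimal sketch with invented numbers (the $100 demand, the $10,000 exposure cost, and the targeting probabilities are all hypothetical, not taken from any of the linked posts):

```python
# Hypothetical setup: a blackmailer mostly targets agents who are
# disposed to pay. All numbers are invented for illustration.

PAY_COST = 100          # cost of giving in to the demand
EXPOSURE_COST = 10_000  # cost of refusing and being exposed

def average_utility(disposed_to_pay):
    """Lifetime average, given how often each type gets targeted."""
    p_blackmailed = 0.9 if disposed_to_pay else 0.001
    loss = PAY_COST if disposed_to_pay else EXPOSURE_COST
    return -p_blackmailed * loss

print(average_utility(True))   # -90.0: the paying type does worse on average
print(average_utility(False))  # -10.0: the refusing type does better

# But conditional on already being blackmailed, the comparison flips:
print(-PAY_COST)       # -100: pay
print(-EXPOSURE_COST)  # -10000: refuse, which is worse once you're in it
```

Both sides can accept this arithmetic; the dispute is whether “rational” tracks the comparison between types or the comparison between acts once you’re in the scenario.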
Sometimes rationality will be bad for you—if there’s a demon who tortures all rational people, for example
At some point this comes down to semantics. I think a reasonable question to answer is “what decision rule should be chosen by an engineer who wants to build an agent scoring the most utility across its lifetime?” (quoting from Schwarz). I’m not sure if the answer to this question is well described as rationality, but it seems like a good question to answer to me. (FDT is sort of an attempted answer to this question if you define “decision rule” somewhat narrowly.)
Suppose that I beat up all rational people so that they get less utility. This would not make rationality irrational. It would just mean that the world is bad for the rational. The question you’ve described might be a fine one, but it’s not what philosophers are arguing about in Newcomb’s problem. If Eliezer claims to have revolutionized decision theory, and then doesn’t even know enough about decision theory to know that he is answering a different question from the decision theorists, that is an utter embarrassment that significantly undermines his credibility.
And in that case, Newcomb’s problem becomes trivial. Of course if Newcomb’s problem comes up a lot, you should design agents that one-box—they get more average utility. The question is about what’s rational for the agent to do, not what’s rational for it to commit to or become, or what’s rational for its designers to do.
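For concreteness, here is the designer’s arithmetic with the standard Newcomb payoffs ($1,000 in the transparent box, $1,000,000 in the opaque box); the predictor accuracies below are arbitrary choices for illustration:

```python
# Standard Newcomb payoffs; the opaque box is filled iff the predictor
# expected one-boxing.
SMALL = 1_000      # transparent box
BIG = 1_000_000    # opaque box

def expected_value(one_box, p):
    """Average payoff for an agent the predictor reads correctly with prob p."""
    if one_box:
        return p * BIG               # box filled only when predicted correctly
    return SMALL + (1 - p) * BIG     # big box filled only on a mispredict

for p in (0.99, 0.9, 0.6):
    print(p, expected_value(True, p), expected_value(False, p))
# At p = 0.99: one-boxers average 990,000 and two-boxers 11,000. Yet for
# any fixed way the boxes are already filled, two-boxing pays exactly
# SMALL more, which is the dispute about the act itself.
```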
And as Schwarz points out, in the twin case, you’ll get less utility by following FDT—you don’t always want to be an FDTist.
I can’t seem to find this in the linked blog post. (I see discussion of the twin case, but not a case where you get less utility from precommitting to follow FDT at the start of time.)
I find your judgment about the blackmail case crazy!
What about the simulation case? Do you think CDT with non-indexical preferences is crazy here also?
More generally, do you find the idea of legible precommitment to be crazy?
Sorry, I said twin case, I meant the procreation case!
The simulation case seems relevantly like the normal twin case, which I’m not as sure about.
Legible precommitment is not crazy! Sometimes it is rational to agree in advance to do the irrational thing in some future case. If you have the ability to make it so that you won’t later change your mind, you should do that. But once you’re in that situation, it makes sense to defect.
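A toy version of that last point, with invented payoffs and an invented targeting rule (none of this is from the linked posts): a binding commitment changes which situations you end up in, while the ex-post calculation inside the bad situation is unchanged.

```python
# Invented toy game: a threatener only issues a threat if it expects
# the target to give in. All payoffs are made up for illustration.

PAY, RESIST = -100, -10_000   # target's payoffs once a threat is made

def outcome(can_bind):
    if can_bind:
        # A visibly binding refusal deters the threat entirely:
        return 0
    # Without a binding device the target re-optimizes once threatened,
    # the threatener predicts that, and the threat gets made:
    return max(PAY, RESIST)   # ex post, paying (-100) beats resisting

print(outcome(True))    # 0: set up the commitment if you can
print(outcome(False))   # -100: once threatened, giving in makes sense
```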
As far as I can tell, the procreation case isn’t defined well enough in Schwarz for me to engage with it. In particular, in what exact way are my father’s decisions and mine entangled? (Just saying the father follows FDT isn’t enough.) But I do think there is going to be a case basically like this where I bite the bullet. Notably, so does EDT.
Your father followed FDT and had the same reasons to procreate as you. He is relevantly like you.

That would mean that he believed he had a father with the same reasons, who believed he had a father with the same reasons, who believed he had a father with the same reasons...
I.e., this would require an infinite line of forefathers. (Or at least of hypothetical, believed-in forefathers.)
If there’s a break anywhere in the chain, that person would not have FDT reasons to reproduce, so neither would their son, and so on.

Which makes it disanalogous to any cases we encounter in real life. And it makes me more sympathetic to the FDT reasoning, since it’s a stranger case where I have less strong pre-existing intuitions.
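The unraveling argument above can be made mechanical. In this toy sketch (the chain length and break point are arbitrary choices), each generation has FDT reasons to procreate only if its father did, so a single break propagates down the whole line:

```python
# Toy model of the regress: generation k has FDT reasons to procreate
# only if generation k-1 (the father) had them.

def reasons_down_the_chain(n_generations, break_at=None):
    reasons = []
    has_reason = True  # pretend the top of the chain is anchored
    for k in range(n_generations):
        if k == break_at:
            has_reason = False      # this ancestor lacked FDT reasons...
        reasons.append(has_reason)  # ...so every descendant lacks them too
    return reasons

print(reasons_down_the_chain(6))     # all True: requires an unbroken
                                     # (ultimately infinite) line
print(reasons_down_the_chain(6, 2))  # [True, True, False, False, False, False]
```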
...which makes the Procreation case an unfair problem. It punishes FDT’ers specifically for following FDT. If we’re going to punish decision theories for their identity, no decision theory is safe. It’s pretty wild to me that @WolfgangSchwarz either didn’t notice this or doesn’t think it’s a problem.
A more fair version of Procreation would be what I have called Procreation*, where your father follows the same decision theory as you (be it FDT, CDT or whatever).
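To make the “punishes FDT’ers for their identity” point concrete, here is a sketch with invented utilities (the numbers are illustrative assumptions, and so is this compressed reading of the two cases): in Procreation the father is stipulated to follow FDT whatever you follow, so only an FDT son pays the misery cost; in Procreation* the father runs whatever theory you run.

```python
# Invented utilities: a miserable life with a child beats never
# existing, and a happy childfree life beats both.
MISERABLE_PARENT = 1   # you exist and procreate
HAPPY_CHILDFREE = 10   # you exist and decline to procreate
NEVER_BORN = 0         # your father didn't procreate

def procreation(your_theory):
    """Original case: your father followed FDT, and you exist, so he
    procreated. Only an FDT son treats his choice as entangled with
    his father's (on this reading of the case)."""
    if your_theory == "FDT":
        return MISERABLE_PARENT  # declining would "mean" never being born
    return HAPPY_CHILDFREE       # the father already procreated; decline

def procreation_star(your_theory):
    """The starred variant: your father ran the same theory you run."""
    if your_theory == "FDT":
        return MISERABLE_PARENT  # an FDT father procreates, so you exist
    return NEVER_BORN            # a CDT father declines; you were never born

for theory in ("FDT", "CDT"):
    print(theory, procreation(theory), procreation_star(theory))
# Procreation:  FDT -> 1, CDT -> 10 (only FDT pays the cost)
# Procreation*: FDT -> 1, CDT -> 0  (the same trade-off for every theory)
```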
Cool, so you maybe agree that CDT agents would want to self-modify into something like FDT agents (if they could). Then I suppose we might just disagree on the semantics of the word “rational”.
(Note that CDT agents don’t exactly self-modify into FDT agents, just something close.)