But what if the doctor is confident of keeping it a secret? Well, then causal decision theory would indeed tell her to harvest his organs, but TDT (and also UDT) would strongly advise her against it. Because if TDT endorsed the action, then other people would be able to deduce that TDT endorsed the action, and that (whether or not it had happened in any particular case) their lives would be in danger in any hospital run by a timeless decision theorist, and then we’d be in much the same boat. Therefore TDT calculates that the correct thing for TDT to output in order to maximize utility is “Don’t kill the traveler,” and thus the doctor doesn’t kill the traveler.
If you make things more inconvenient, no one will ever suspect that the doctor runs TDT, either.
That’s a good point, and why I like Gary Drescher’s handling of such dilemmas in Good and Real (ch. 7), which allows a kind of consequentialist but acausal (or extracausal) reasoning. His decision theory is the TDT-like “subjunctive reciprocity,” which says: “You should do what you self-interestedly wish all similarly-situated beings would do, because if you would regard it as optimal, so would they.” That is, as a matter of the structure of reality, beings that are similar to you have similar input/output relationships—so make sure you instantiate “good” I/O behavior.
The upshot is that subjunctive reciprocity can justify not harvesting in this case, and in similar cases where we can assume no negative causal fallout from the doctor’s decision. Why? Because if the doctor reasons that way (kill/harvest the patient “for the greater good”), it logically follows that other people in similar situations would do the same thing (and thereby expect and adapt for it), even though there would be (by supposition) no causal influence between the doctor’s decision and those other cases. As in Newcomb-like problems, the relevant effect is not one your decision causes; instead, your decision and the effect are common causal descendants of the same parent—the parent being “the extent to which humans instantiate a decision theory like this”.
Similarly, Drescher reasons e.g. that “I should not cheat business partners, even if I could secretly get away with it, because if I regarded this as optimal, it’s more likely that my business partner regards it as optimal to do to me.” More importantly, “If I make my decision depend on whether I’m likely to get caught, so will my partner’s decision, a dependency I do not wish existed.”
Consider a 1000-round run of the Prisoner’s Dilemma among 100 “similar” people, where they get randomly paired and play the game many times, but don’t learn anyone else’s decisions until it’s all over. In that case, if you always cooperate, you are more likely to find yourself with more cooperating partners, even though your decisions can’t possibly causally influence the others.
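The tournament above can be sketched as a toy simulation. Everything here is a stylized assumption for illustration: standard PD payoffs (3/3 for mutual cooperation, 1/1 for mutual defection, 5/0 for defecting on a cooperator), and a `similarity` parameter giving the probability that each other agent’s decision procedure outputs the same move as yours. There is deliberately no causal channel between decisions—each agent’s move is fixed before any pairing happens—yet the correlated agents still fare better by cooperating.

```python
import random

# Payoff to the row player in a one-shot Prisoner's Dilemma.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def tournament(my_choice, n_agents=100, n_rounds=1000, similarity=0.9, seed=0):
    """Average payoff to agent 0 when every other agent's decision
    procedure outputs agent 0's move with probability `similarity`
    (and the opposite move otherwise).  Moves are fixed before pairing,
    so no decision causally influences any other."""
    rng = random.Random(seed)
    other = "D" if my_choice == "C" else "C"
    total, plays = 0.0, 0
    for _ in range(n_rounds):
        # Each agent's move is determined by its (correlated) decision procedure.
        moves = [my_choice] + [
            my_choice if rng.random() < similarity else other
            for _ in range(n_agents - 1)
        ]
        # Random pairing each round.
        order = list(range(n_agents))
        rng.shuffle(order)
        for i in range(0, n_agents, 2):
            a, b = order[i], order[i + 1]
            if 0 in (a, b):
                me, them = (a, b) if a == 0 else (b, a)
                total += PAYOFF[(moves[me], moves[them])]
                plays += 1
    return total / plays

print(tournament("C"))  # near 0.9*3 + 0.1*0 = 2.7
print(tournament("D"))  # near 0.9*1 + 0.1*5 = 1.4
```

The point of the sketch: with `similarity` = 0.9, cooperating yields about 2.7 per game and defecting about 1.4, even though changing `my_choice` never causally touches any other agent—it only changes which correlated population you find yourself in.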
I discussed the merit of incorporating acausal consequences into your reasoning in Morality as Parfitian-Filtered Decision Theory.
As for whether it’s a good model of human behavior to assume that people even think about “Does the doctor follow TDT?”, look at it this way: people may not think about it in these exact terms, but then, why else are people convinced by arguments like, “What if everyone did that?” or “How would you like it if the roles were reversed?”, even when
everyone doesn’t “do that”,
you don’t causally influence whether people “do that”, and
the roles aren’t likely to be reversed.
It seems they’re relying on these “subjunctive acausal means-ends links” (SAMELs in my article) that cannot be justified in consequentialist-causal terms.
On the other hand, other people knowing that you’re a TDT agent is the reason TDT agents are able to cooperate. TDT on a large scale kind of depends on other people suspecting it.
Please note that Alicorn’s point is correct. If you are using this as a thought experiment, take “no one will ever suspect the doctor is a TDT agent” as an axiom. If you are looking at it to analyze decision theory, you do have to consider other factors.

Is there anything formalized, or even informalized, showing the boundaries of the circumstances wherein TDT agents should reveal their TDTness, and the subset of those where they should reveal their source code? I’m not aware of any; only of regions within that space where we’re pretty sure they should be open about it.
This least convenient world basically requires the doctor (or perhaps all doctors) to have a massive epistemic advantage over every other human being, such that the idea won’t even cross any patient’s mind.
In general, even if you’re that much smarter than 99.9% of the world, you need to take into account that other people in the 0.1% can communicate their suspicions to everyone else.
I don’t think that follows. The LCPW can allow that people could imagine others being TDT agents, but that the scenario does not provide sufficient evidence that the doctor is killing patients. Furthermore, if only 0.1% of the population can detect the doctor’s decision theory, it’s unlikely that their arguments will be comprehensible to the rest of the population, at least at a cost low enough to have the purported “people stop going to doctors” effect.
Yeah, but at that point you’re not really talking medical ethics, you’re playing against the rest of the world in a game of “God dicks you over in mysterious ways.”
If someone is known by their friends and family to be relatively aware when it comes to such issues, and warns said friends and family of this danger, they will not need to give a comprehensible argument.
Their statement is, in itself, evidence to those who trust them.
That that specific doctor runs TDT, perhaps; but it is implausible to the point of irrelevance that no one would ever suspect that any doctor anywhere runs on a TDT-esque thought process.
And people suspecting that any doctor might run such processes is sufficient harm.
I wonder what fraction of the world’s population has the necessary concepts in their heads to believe (or disbelieve) anything even slightly like “some doctors use TDT or something like it”. I’d have thought well below 0.1%.
I think they would probably frame a question with similar content as “how trustworthy are doctors?”
I don’t think the content of that question is similar at all.
The relevant thing to a TDT person is “how likely is it that there’s someone simulating my mind sufficiently accurately?”
“How trustworthy are doctors?” is a question that results in a simulation of a doctor’s mind. It seems to me that many people simulating that doctor’s mind will be capable of simulating it sufficiently accurately, even if they don’t understand (on a conscious level) all the necessary jargon to explain what they are doing.
I was aware of, and practising, timeless decision theory before ever stumbling across LessWrong, and, while I know this may just be the “typical mind fallacy”, I would be surprised if only 0.1% of people had similar thoughts.
Sure, I didn’t call it TDT, because that is a piece of jargon only present in this community, but the basic principle is certainly not unique or unknown, and I would expect that even many who don’t understand it would use it subconsciously.