I’m coming to realize just how much of this stuff derives from Eliezer’s insistence on reflective consistency of a decision theory. Given any decision theory, Eliezer will find an Omega to overthrow it.
But doesn’t a diagonal argument show that no decision theory can be reflectively consistent over all test data presented by a malicious Omega? Just as there is no enumeration of the reals, isn’t there a game which can make any specified rational agent regret its rationality? Omega holds all the cards. He can always make you regret your choice of decision theory.
Just as there is no enumeration of the reals, isn’t there a game which can make any specified rational agent regret its rationality? Omega holds all the cards. He can always make you regret your choice of decision theory.
No. We can ensure that no such problem exists if we assume that (1) only the output decisions are used, not any internals; and (2) every decision is made with access to the full problem statement.
I’m not entirely sure what “every decision is made with full access to the problem statement” means, but I can’t see how it can possibly get around the diagonalisation argument. Basically, Omega just says “I simulated your decision on problem A, on which your algorithm outputs something different from algorithm X, and give you a shiny black Ferrari iff you made the same decision as algorithm X.”
As cousin_it pointed out last time I brought this up, Caspian made this argument in response to the very first post on the Counterfactual Mugging. I’ve yet to see anyone point out a flaw in it as an existence proof.
As far as I can see the only premise needed for this diagonalisation to work is that your decision theory doesn’t agree with algorithm X on all possible decisions, so just make algorithm X “whatever happens, recite the Bible backwards 17 times”.
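The construction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `sensible_agent`, `algorithm_x` and `diagonal_omega` are hypothetical, a "decision theory" is reduced to a function from a problem name to a decision, and the Ferrari is encoded as a payoff of 1.

```python
from typing import Callable

# Assumption: a decision theory is modelled as a function from a problem
# name to a decision. Nothing about its internals is inspected.
Agent = Callable[[str], str]


def sensible_agent(problem: str) -> str:
    """Hypothetical agent that does something reasonable on problem A."""
    return "take the money" if problem == "A" else "do nothing"


def algorithm_x(problem: str) -> str:
    """Algorithm X from the comment: it ignores the problem entirely."""
    return "recite the Bible backwards 17 times"


def diagonal_omega(agent: Agent, reference: Agent, problem: str = "A") -> int:
    """Return 1 (the Ferrari) iff the agent's simulated decision on
    `problem` equals the reference algorithm's decision, else 0."""
    return 1 if agent(problem) == reference(problem) else 0


print(diagonal_omega(sensible_agent, algorithm_x))  # 0
print(diagonal_omega(algorithm_x, algorithm_x))     # 1
```

Note that in this sketch the agent functions receive only the name of problem A. The reward rule lives inside `diagonal_omega` and is never passed to them.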
I’m not entirely sure what “every decision is made with full access to the problem statement” means, but I can’t see how it can possibly get around the diagonalisation argument. Basically, Omega just says “I simulated your decision on problem A, on which your algorithm outputs something different from algorithm X, and give you a shiny black Ferrari iff you made the same decision as algorithm X.”
In that case, your answer to problem A is being used in a context other than problem A. That other context is the real problem statement, and you didn’t have it when you chose your answer to A, so it violates the assumption.
Yeah, that definitely violates the “every decision is made with full access to the problem statement” condition. The outcome depends on your decision on problem A, but when making your decision on problem A you have no knowledge that your decision will also be used for this purpose.
I don’t see how this is useful. Let’s take a concrete example: in decision problem A, Omega offers you the choice of $1,000,000 or being slapped in the face with a wet fish. Which would you like your decision theory to choose?
Now, No-mega can simulate you, say, 10 minutes before you find out who he is, and give you 3^^^3 utilons iff you chose the fish-slapping. So your algorithm has to include some sort of prior on the existence of “fish-slapping” No-megas.
My algorithm “always get slapped in the face with a wet fish where that’s an option” does better than any sensible algorithm on this particular problem, and I don’t see how this problem is noticeably less realistic than any others.
In other words, I guess I might be willing to believe that you can get around diagonalisation by imposing some stringent limits on what sort of all-powerful Omegas you allow (can anyone point me to a proof of that?) but I don’t see how it’s interesting.
Now, No-mega can simulate you, say, 10 minutes before you find out who he is, and give you 3^^^3 utilons iff you chose the fish-slapping. So your algorithm has to include some sort of prior on the existence of “fish-slapping” No-megas.
Actually, no, the probability of fish-slapping No-megas is part of the input given to the decision theory, not part of the decision theory itself. And since every decision theory problem statement comes with an implied claim that it contains all relevant information (a completely unavoidable simplifying assumption), this probability is set to zero.
Decision theory is not about determining what sorts of problems are plausible, it’s about getting from a fully-specified problem description to an optimal answer. Your diagonalization argument requires that the problem not be fully specified in the first place.
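To make the "probability is an input" point concrete, here is a minimal Python sketch. It assumes a plain expected-utility maximiser; the class `ProblemStatement`, its field names, the utility of -1 for the fish, and the stand-in bonus of 1e30 (3^^^3 is far too large to represent) are all hypothetical choices for illustration, not anything specified in the thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProblemStatement:
    """A fully specified problem: everything relevant is a field here."""
    p_nomega: float                    # probability a fish-rewarding No-mega exists
    nomega_bonus: float                # utilons paid by No-mega for choosing the fish
    money_utility: float = 1_000_000.0  # assumed utility of the $1,000,000
    fish_utility: float = -1.0          # assumed utility of the wet fish itself


def expected_utilities(problem: ProblemStatement) -> dict:
    """Expected utility of each option, using only the problem statement."""
    return {
        "money": problem.money_utility,
        "fish": problem.fish_utility + problem.p_nomega * problem.nomega_bonus,
    }


def decide(problem: ProblemStatement) -> str:
    """Pick the option with the highest expected utility."""
    utilities = expected_utilities(problem)
    return max(utilities, key=utilities.get)


# Problem A as stated: no No-mega is mentioned, so its probability is zero.
print(decide(ProblemStatement(p_nomega=0.0, nomega_bonus=1e30)))   # money
# A different problem statement that does mention No-mega.
print(decide(ProblemStatement(p_nomega=0.01, nomega_bonus=1e30)))  # fish
```

The `decide` function is identical in both calls. Only the problem statement changes, which is the sense in which the No-mega prior belongs to the input rather than to the decision theory.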
“I simulated your decision on problem A, on which your algorithm outputs something different from algorithm X, and give you a shiny black Ferrari iff you made the same decision as algorithm X”
This is a no-choice scenario. If you say that the Bible-reciter is the one that will “win” here, you are using the verb “to win” with a different meaning from the one used when we say that a particular agent “wins” by making the choice that leads to the best outcome.
But doesn’t a diagonal argument show that no decision theory can be reflectively consistent over all test data presented by a malicious Omega?
With the strong disclaimer that I have no background in decision theory beyond casually reading LW...
I don’t think so. The point of simulation (Omega) problems, to me, doesn’t seem to be to judo your intelligence against yourself; rather, it is to “throw your DT off the scent”, building weird connections between events (weird, but still vaguely possible, at least for AIs), that a particular DT isn’t capable of spotting and taking into account.
My human, real-life decision theory can be summarised as “look at as many possible end-result worlds as I can, and at what actions will bring them into being; evaluate how much I like each of them; then figure out which actions are most efficient at leading to the best worlds”. But that doesn’t exactly fly when you’re programming a computer: you need something that can be fully formalised. That is where those strange Omega scenarios are useful, because your code must get it right “on autopilot”; it cannot improvise a smarter approach on the spot. The formula is on paper, and if it can’t solve a given problem but another one can, there is room for improvement.
In short, DT problems are just clever software debugging.
I agreed with everything you said after “I don’t think so”. So I am left confused as to why you don’t think so.
You analogize DT problems to test data used to determine whether we should accept or reject a decision theory. I am claiming that our requirements (i.e. “reflective consistency”) are so unrealistic that we will always be able to find test data forcing us to reject. Why do you not think so?
Because I suspect that there are only so many functionally different types of connections between events (at the very least, I see no hint that there must be infinitely many) and once you’ve found them all you will have the possibility of writing a DT that can’t be led to corner itself into suboptimal outcomes due to blind spots.
at the very least, I see no hint that there must be infinite ones
Am I correct in interpreting this as “infinitely many of them”? If so, I am curious as to what you mean by “functionally different types of connections between events”. Could you provide an example of some “types of connections between events”? Functionally different ones to be sure.
Presumably, the relevance must be your belief that decision theories differ in just how many of these different kinds of connections they handle correctly. Could you illustrate this by pointing out how the decision theory of your choice handles some types of connections, and why you have confidence that it does so correctly?
Am I correct in interpreting this as “infinitely many of them”?
Oops, yes. Fixed.
If so, I am curious as to what you mean by “functionally different types of connections between events”. Could you provide an example of some “types of connections between events”? Functionally different ones to be sure.
CDT can ‘see’ the classical, everyday causal connections that are marked in formulas with the symbol “>” (and I’d have to spend several hours reading at least the Stanford Encyclopaedia before I could give you a confident definition of that), but it cannot ‘see’ the connection in Newcomb’s problem between the agent’s choice of boxes and the content of the opaque box (sometimes called ‘retrocausality’).
Presumably, the relevance must be your belief that decision theories differ in just how many of these different kinds of connections they handle correctly. Could you illustrate this by pointing out how the decision theory of your choice handles some types of connections, and why you have confidence that it does so correctly?
I don’t have a favourite formal decision theory, because I am not sufficiently familiar with the underlying math and with the literature of discriminating scenarios to pick a horse. If you’re talking about the human decision “theory” of mine I described above, it doesn’t explicitly do that; the key hand-waving passage is “figure out which actions are most efficient at leading to the best worlds”, meaning I’ll use whatever knowledge I currently possess to estimate how big the set of Everett branches where I do X and get A is, compared to the set of those where I do X and get B. (For example, six months ago I hadn’t heard of the concept of acausal connections and didn’t account for them at all while plotting the likelihoods of possible futures, whereas now I do, at least technically; in practice, I think that between human agents they are a negligible factor. For another example, suppose that some years from now I became convinced that the complexity of human minds, and the variability between different ones, were much greater than I previously thought; then, given the formulation of Newcomb’s problem where Omega isn’t explicitly defined as a perfect simulator and all we know is that it has had a 100% success rate so far, I would suitably increase my estimation of the chances of Omega screwing up and making two-boxing profitable.)
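The Newcomb example at the end of that parenthesis can be put in numbers. The sketch below is an assumption-laden illustration, not the commenter's own calculation: it uses the customary payoffs of $1,000,000 in the opaque box and $1,000 in the transparent one (neither figure appears in this thread), treats dollars as utility, and weights outcomes by the predictor's accuracy. The function names are hypothetical.

```python
def newcomb_expected_values(accuracy: float,
                            big: float = 1_000_000.0,
                            small: float = 1_000.0) -> tuple:
    """Return (one_box_ev, two_box_ev) given the predictor's accuracy.

    Assumption: `accuracy` is the probability that the prediction
    matches the agent's actual choice, whichever choice that is.
    """
    # One-boxing: the opaque box is full only if the prediction was right.
    one_box = accuracy * big
    # Two-boxing: a correct prediction leaves only the small box;
    # a wrong prediction leaves both boxes full.
    two_box = accuracy * small + (1.0 - accuracy) * (big + small)
    return one_box, two_box


def preferred(accuracy: float) -> str:
    """Name the option with the higher expected value at this accuracy."""
    one_box, two_box = newcomb_expected_values(accuracy)
    return "one-box" if one_box > two_box else "two-box"


def break_even_accuracy(big: float = 1_000_000.0, small: float = 1_000.0) -> float:
    """Accuracy at which both options have equal expected value."""
    return (big + small) / (2.0 * big)


print(preferred(1.0))   # one-box
print(preferred(0.5))   # two-box
```

Under these assumed payoffs the break-even accuracy is just above one half, so the estimate of Omega's error rate would have to rise a great deal before two-boxing came out ahead on this kind of calculation.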
CDT can ‘see’ the classical, everyday causal connections that are marked in formulas with the symbol “>” (and I’d have to spend several hours reading at least the Stanford Encyclopaedia before I could give you a confident definition of that), but it cannot ‘see’ the connection in Newcomb’s problem between the agent’s choice of boxes and the content of the opaque box (sometimes called ‘retrocausality’).
Ok, so if I understand you, there are only some finite number of valid kinds of connections between events and when we have all of them incorporated—when our decision theory can “see” each of them—we are then all done. We have the final, perfect decision theory (FPDT).
But what do you do when someone, call him Yuri Geller, comes along and points out that we left out one important kind of connection: the “superspooky” connection? He then provides some very impressive statistical evidence that this connection exists and sets up games in front of large (paying) audiences in which FPDT agents fail to WIN. Finally, he proclaims the need for SSPDT.
Or, if you don’t buy that, maybe you will prefer this one. Yuri Geller doesn’t really exist. He is a thought experiment. Still, the existence of even the possibility of superspooky connections proves that they really do exist and hence that we need to have SADT (Saint Anselm’s Decision Theory).
Ok, I’ve allowed my sarcasm to get the better of me. But the question remains—how are you ever going to know that you have covered all possible kinds of connections between events?
But the question remains—how are you ever going to know that you have covered all possible kinds of connections between events?
You can’t, I guess. Within an established mathematical model, it may be possible to prove that a list of possible configurations of event pairs {A, B} is exhaustive. But the model may always prove in need of expansion or refinement, whether because some element gets understood and modelled at a deeper level (e.g. the nature of ‘free’ will) or, more worryingly, because of paradigm shifts about physical reality (e.g. it turns out we can time travel).
Decision theories should usually be seen as normative, not descriptive. How “realistic” something is, is not very important, especially for thought experiments. Decision theory cashes out where you find a situation that can indeed be analyzed with it, and where you’ll secure a better outcome by following the theory’s advice. For example, noticing acausal control has advantages in many real-world situations (Parfit’s Hitchhiker variants). Eliezer’s TDT paper discusses this towards the end of Part I.
I believe you misinterpreted my “unrealistic requirements”. A better choice of words would have been “unachievably stringent requirements”. I wasn’t complaining that Omega and the like are unrealistic. At least not here.
The version I have of Eliezer’s TDT paper doesn’t have a “Part I”. It is dated “September 2010” and has 112 pages. Is there a better version available?
I don’t understand your other comments. Or, perhaps more accurately, I don’t understand what they were in response to.
Ok thanks.
“Part I” is chapters 1-9. (This concept is referred to in the paper itself.)