Gödel’s First Incompleteness Theorem and the Halting Problem both imply that it’s impossible to write a program that correctly deduces, in general, the output of arbitrary other programs. So we have to be prepared for our inference module to fail sometimes.
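To see why, here is the usual diagonalization in miniature; `predict_output` is a hypothetical oracle, not something from the post:

```python
# Toy diagonalization (all names hypothetical). Suppose some
# predict_output(source) always returned the value that running
# `source` would produce. This program would then defeat it:

def spiteful(predict_output, my_source):
    """Ask the predictor what we will output, then output the opposite."""
    predicted = predict_output(my_source)
    return "B" if predicted == "A" else "A"

# Whatever the predictor claims spiteful outputs, spiteful outputs the
# other thing, so no total, always-correct predictor can exist. The
# realistic fallback is a predictor allowed to answer "Unknown".
```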
And here you miss the very basic thing, the domain of applicability: limited computational power. Unlimited computational power introduces nasty infinities into things. “Naive decision theory” is not interested in games where you confuse yourself using infinities. The contest organizers have to be able to determine the output of any program in order to score it! The contest should not be run by a halting oracle; each program has to terminate within N CPU cycles. That’s how contests are done. Nobody who comes up with a naive decision theory really cares about contests run by a halting oracle.
Let’s make it a little more practical: finite time and space. You can run any other program, with a caveat: you may run out of time. You have heuristics that optimize code before running it (they don’t work on already-optimized code), and you know exactly how many cycles your optimizer has shaved off, so you know whether you have to abort or not (assuming there is no penalty for failing to terminate in time when the other program did not terminate in time either). You also recognize any instances of yourself, whose output you know equals your own output, and likewise any code that runs instances of you.
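A minimal sketch of that setup (every name and the cost model here are invented for illustration; programs are modeled as generators that yield their cost in abstract cycles, then return a move):

```python
TIMEOUT = object()  # sentinel for "ran out of budget"

def run_with_budget(program, budget):
    """Drive `program` until it returns a move or exceeds the budget."""
    spent = 0
    try:
        while True:
            spent += next(program)       # each yield is a chunk of work
            if spent > budget:
                return TIMEOUT
    except StopIteration as done:
        return done.value                # the program's move

def beats(move):
    return {"rock": "paper", "paper": "scissors", "scissors": "rock"}[move]

def my_move(opponent_factory, my_source, opponent_source, budget):
    if opponent_source == my_source:
        # An exact copy outputs whatever we output; simulating it would
        # recurse forever, so recognize it instead of running it.
        return "rock"
    result = run_with_budget(opponent_factory(), budget)
    if result is TIMEOUT:
        return "rock"                    # safe default: couldn't finish
    return beats(result)

def slow_rock():                         # an opponent costing 5 cycles
    yield 5
    return "rock"

print(my_move(slow_rock, "SRC_ME", "SRC_OTHER", budget=10))  # -> paper
```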
Then you proceed with this ‘prove outputs(X) = a’. That is something out of TDT or the like; NDT does not do this kind of stuff. You run in limited space and time; your optimizer does not work on your own source code (because the optimizer is part of your source code), and the only way you could prove that you output a is to run yourself and find yourself outputting a, which you can’t do: you would enter infinite recursion and run out of cycles. You can’t just drop in an ‘okay, what if you prove this’. What if you prove 1=2? You will end up proving all numbers equal, via 1+1=2+1 and so on.
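The regress in runnable miniature (schematic, not anyone’s actual agent):

```python
import sys
sys.setrecursionlimit(100)   # a small stand-in for the cycle budget

MY_SOURCE = "<this agent's own source>"   # schematic placeholder

def simulate(source):
    # We have no way to analyze our own (optimizer-containing) source,
    # so "simulating" ourselves just means calling ourselves again...
    return my_output()

def my_output():
    return simulate(MY_SOURCE)

try:
    my_output()
except RecursionError:
    print("ran out of cycles before proving anything about our output")
```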
On top of this, once you assume that you output xi, you can no longer assume that you are a utility optimizer.
It’s really easy to confuse oneself with abstractions and stop seeing the trees for the forest. You need to go step by step, with an actual proof, and show that the programs really will end up proving that they output something.
edit: ahh, and another thing: you don’t quite assume outputs(X) = xi in this way. That is, you don’t put your own source into the left side of the statement; you black-box yourself. You assume your output is a, try different values of a, and calculate the payoff for each, without assuming that a maximizes utility. And whenever you see a copy of yourself, you substitute a.
Also, your intuitions for how to do decision theory are an unformalized version of something like CDT or TDT, as you’ll see soon enough. [snark deleted by author]
I never assumed that the inference module was infinitary; in fact, I said that it should be designed to return either a correct answer or “Unknown”.
The very first example that was used for the self-fulfilling prophecy problem was that of a bounded proof-checker, which looks through all proofs shorter than length N. Finitary vs. infinitary is not the issue here.
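For concreteness, the bounded-checker pattern as a toy (here `verify` stands in for a real proof verifier):

```python
from itertools import product

def bounded_prover(statement, verify, alphabet="ab", max_len=10):
    """Toy model of a length-bounded proof search: try every candidate
    proof of up to max_len symbols. Soundness lives in `verify`; the
    search itself can only answer "Proved" or "Unknown", never wrongly.
    """
    for n in range(1, max_len + 1):
        for candidate in product(alphabet, repeat=n):
            if verify(statement, "".join(candidate)):
                return "Proved"
    return "Unknown"   # budget exhausted; asserts nothing either way

# Silly demo verifier: a "proof" of a statement is the statement itself.
print(bounded_prover("abba", lambda s, p: p == s))         # -> Proved
print(bounded_prover("abracadabra", lambda s, p: p == s))  # -> Unknown
```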
edit: okay, that’s just an argument over the definition of NDT.
Then don’t refer to the Halting Problem and Gödel’s incompleteness theorem. It is hard to object to very informal things; all I can say is ‘you didn’t prove this’, and that is a fully general counterargument to anything informal.
The other misinterpretation is this: NDT does not make the assumption outputs(X) = xi in that ‘transparent’ way, with the source code on the left side, taken as an axiom. It has a black box for itself, denoted by the symbol A, which it substitutes for its own source code whenever it spots that source code. It then substitutes the different xi for A in the resulting utility equations, solves for utility, and finds the maximum. It does not analyze itself.
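That recipe is short enough to write down (a sketch; the game below is my example, not from the thread):

```python
# The black-box recipe above: the agent never reasons about its own
# source; it replaces every copy of itself with the symbol A, plugs
# each candidate action in for A, and takes the argmax.

def ndt_choice(actions, utility_of):
    """utility_of(a): the game's utility equation with A := a, i.e.
    every occurrence of this agent replaced by the action a."""
    return max(actions, key=utility_of)

# Example: Prisoner's Dilemma against an exact copy. Both players are
# the same black box A, so both slots get the same substitution:
payoff = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}
print(ndt_choice(["C", "D"], lambda a: payoff[(a, a)]))   # -> C
```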
I do agree that for ONDT (orthonormal’s interpretation of NDT), which you present here, what you say is correct. I do not agree that your NDT is the NDT that applied mathematicians use. There are substantial differences.
If you were substituting in variables for the output of X rather than analyzing the round as a whole, then you’re not talking about Naive Decision Theory, you’re talking about something in the family of CDT and TDT.
TDT was made by Eliezer, right? There you go: a bright guy (maybe naive, but certainly not stupid) outside the field, making a theory.

I don’t understand your comment.

EDIT: Maybe you assumed that I was denigrating your intuitions on decision theory for departing from NDT? If so, that’s not the case: substituting for the output of X in various spots turns out to be a good way to avoid the problem of spurious counterfactuals. CDT does this in a very basic way, TDT in a better way.
re: EDIT

Ahh, I see.

What I am saying is that the decision theory which applied mathematicians actually follow operates under practical constraints (limited computing time), and this prevents the introduction of very fancy things like your outputs(X), simply because they are computationally heavy. The theory that always substitutes is what I originally described in the other thread, regarding Newcomb’s problem. Due to the substitution, that theory doesn’t ‘pollute’ its own proof checker with a proposition that may be untrue and may break the proof checker (or, in practice, make it very slow).
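Concretely, for Newcomb’s (a sketch with the standard payoffs; the framing as code is mine):

```python
# Newcomb's problem under the always-substitute rule. The prediction
# is one more occurrence of the black box A, so choosing a value for
# A fixes the prediction too:

def newcomb_utility(a):
    opaque_box = 1_000_000 if a == "one-box" else 0   # predictor = A
    if a == "one-box":
        return opaque_box
    return opaque_box + 1_000                          # grab both boxes

print(max(["one-box", "two-box"], key=newcomb_utility))  # -> one-box
```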
edit: That being said, those decision theories operate under notions of perfect accuracy and unlimited computing time; the applied mathematicians see this field as not very relevant to practical software (e.g. AI). A practical AI is best off substituting for X everywhere. A practical AI is never a guaranteed utility maximizer; there will always be problems it can’t solve correctly in the given time, and the important consideration is to solve as large a set of problems as possible in the limited time. That is why I have very little inclination to write some semi-formalization of what I use when I am writing code to decide on things. I would rather write specifications for building an AI that actually does it; the specifications only need to be formal enough for the programmer who implements them to understand the intent. (Alternatively, I can implement it myself, and then it becomes completely formalized, to the point that a computer can run it.)
It’s just an argument over word definitions, in any case. You can call what I propose TDT if you wish. I’m just pointing out that Eliezer, being a bright guy outside the field, is precisely the kind of person to make a naive but not stupid decision theory that matches what many people actually use in practice.

I edited my comment while you were replying to it. I don’t think we’re actually disagreeing on this particular point.
Agreed. Given access to the SOURCE of the opponent, you can just RUN the opponent and observe what they’ll do this round. Of course, given, say, 10 seconds to calculate between rounds, the wise opponent just ensures it will take exactly 10 seconds to run their code; that way you can calculate either your own move OR the opponent’s, but not both.
Then you get into the complexities of whether you have enough time at any given point to attempt to OPTIMIZE your opponent’s code (“Roll dice for 9.99 seconds then select rock” is very easy to optimize down to “select rock”), and at that point I think we’re sufficiently outside the simplified domain of the post :)
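Sketchy illustration of the easy case, with an invented toy instruction format:

```python
# Drop any computation whose result is never used by the final select.

def optimize(program):
    used = {arg for op, args in program if op == "select" for arg in args}
    return [(op, args) for op, args in program
            if not (op == "roll_dice" and args[0] not in used)]

prog = [("roll_dice", ("d", 9.99)),   # burn 9.99 seconds into d...
        ("select", ("rock",))]        # ...then never look at d
print(optimize(prog))                 # -> [('select', ('rock',))]
```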
So, basically, you’re right, but I don’t think it’s actually a relevant critique. I would enjoy seeing it addressed in the post, though, because it bugged me enough that I was going to post about it myself :)
Given access to the SOURCE of the opponent, you can just RUN the opponent and observe what they’ll do this round.
Main problem: if you’re playing against yourself, this creates an infinite loop of you simulating you simulating you simulating...
Fortunately, it’s often possible to prove things about other programs without running them.
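One trivial instance (my own toy static check, nowhere near a real prover):

```python
import ast

def provably_constant_move(source):
    """Cheap static proof: if the program is a single function whose
    whole body is `return <literal>`, we know its output without ever
    running it. Returns the move, or None for "Unknown"."""
    tree = ast.parse(source)
    if tree.body and isinstance(tree.body[0], ast.FunctionDef):
        body = tree.body[0].body
        if (len(body) == 1 and isinstance(body[0], ast.Return)
                and isinstance(body[0].value, ast.Constant)):
            return body[0].value.value
    return None   # can't tell statically; fall back to simulation etc.

print(provably_constant_move("def move():\n    return 'rock'"))  # rock
```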
Of course, given, say, 10 seconds to calculate between rounds, the wise opponent just ensures it will take exactly 10 seconds to run their code; that way you can calculate either your own move OR the opponent’s, but not both.
In a Prisoner’s Dilemma, TDT would defect if it couldn’t deduce anything in time about the other player. The only way to achieve mutual cooperation is to make it easy for the other player to deduce what you’re doing in that circumstance. (I’ll cover this in more depth in the TDT section.) This doesn’t mean “always cooperating”, nor does it mean giving up mixed strategies in zero-sum games.
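Very roughly, the shape is this (a deliberately crude sketch; the real thing uses bounded proof search, not a string test):

```python
def deduce_move(opponent_source, budget):
    """Stand-in for the inference module: 'C', 'D', or None (Unknown).
    This toy only "understands" opponents that visibly use this same
    deduce-then-cooperate strategy."""
    if budget <= 0:
        return None
    if "deduce_move" in opponent_source:
        return "C"              # such opponents cooperate with us
    return None

def my_move(opponent_source, budget):
    # Cooperate only if we can deduce in time that they cooperate;
    # an undecipherable or too-slow opponent gets defection.
    return "C" if deduce_move(opponent_source, budget) == "C" else "D"
```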