# A problem with Timeless Decision Theory (TDT)

According to Ingredients of Timeless Decision Theory, when you set up a factored causal graph for TDT, “You treat your choice as determining the result of the logical computation, and hence all instantiations of that computation, and all instantiations of other computations dependent on that logical computation”, where “the logical computation” refers to the TDT-prescribed argmax computation (call it C) that takes all your observations of the world (from which you can construct the factored causal graph) as input, and outputs an action in the present situation.

I asked Eliezer to clarify what it means for another logical computation D to be either the same as C, or “dependent on” C, for purposes of the TDT algorithm. Eliezer answered:

For D to depend on C means that if C has various logical outputs, we can infer new logical facts about D’s logical output in at least some cases, relative to our current state of non-omniscient logical knowledge. A nice form of this is when supposing that C has a given exact logical output (not yet known to be impossible) enables us to infer D’s exact logical output, and this is true for every possible logical output of C. Non-nice forms would be harder to handle in the decision theory but we might perhaps fall back on probability distributions over D.

I replied as follows (which Eliezer suggested I post here).

If that’s what TDT means by the logical dependency between Platonic computations, then TDT may have a serious flaw.

Consider the following version of the transparent-boxes scenario. The predictor has an infallible simulator D that predicts whether I one-box here [EDIT: if I see $1M]. The predictor also has a module E that computes whether the ith digit of pi is zero, for some ridiculously large value of i that the predictor randomly selects. I’ll be told the value of i, but the best I can do is assign an a priori probability of .1 that the specified digit is zero.

...reasoning under logical uncertainty using limited computing power… is another huge unsolved open problem of AI. Human mathematicians had this whole elaborate way of believing that the Taniyama Conjecture implied Fermat’s Last Theorem at a time when they didn’t know whether the Taniyama Conjecture was true or false; and we seem to treat this sort of implication in a rather different way than ‘2=1 implies FLT’, even though the material implication is equally valid.

(The rules of the transparent-boxes scenario, as specified in *Good and Real*, are: the predictor conducts a simulation that tentatively presumes there will be $1M in the large box, and then puts $1M in the box (for real) iff that simulation showed one-boxing. Thus, if the large box turns out to be *empty*, there is no requirement for that to be predictive of the agent’s choice under those circumstances. The present variant is the same, except that (D xor E) determines the $1M, instead of just D. Sorry, I should have said this to begin with, instead of assuming it as background knowledge.)
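To make the payoff rule concrete, here is a minimal sketch; the function names and the simulation interface are my own illustration, not part of the original problem statement:

```python
def predictor_fills_box(simulate_one_boxing, digit_of_pi_is_zero):
    """Return True iff the predictor puts $1M in the large box.

    simulate_one_boxing: the infallible simulator D -- would the agent
        one-box upon seeing $1M in the box?
    digit_of_pi_is_zero: module E -- is the i-th digit of pi zero?
    """
    D = simulate_one_boxing()
    E = digit_of_pi_is_zero()
    # Original scenario: D alone determines the $1M.
    # This variant: (D xor E) determines it.
    return D != E
```

For instance, a disposition to one-box (D true) yields $1M exactly when the selected digit is nonzero (E false); when the digit happens to be zero, the very same disposition leaves the box empty.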


And this was Eliezer’s reply:

This is an unfinished part of the theory that I’ve also thought about, though your example puts it very crisply (you might consider posting it to LW?)

My current thoughts on resolution tend to see two main avenues:

1) Construct a full-blown DAG of math and Platonic facts, an account of which mathematical facts make other mathematical facts true, so that we can compute mathematical counterfactuals.

2) Treat mathematical knowledge that we learn by genuinely mathematical reasoning differently from mathematical knowledge that we learn by physical observation. In this case we know (D xor E) not by mathematical reasoning, but by physically observing a box whose state we believe to be correlated with D xor E. This may justify constructing a causal DAG with a node descending from D and E, so a counterfactual setting of D won’t affect the setting of E.

Currently I’d say that (2) looks like the better avenue. Can you come up with an improper mathematical dependency where E is inferred from D, and shouldn’t be seen as counterfactually affected, based on mathematical reasoning only without postulating the observation of a physical variable that descends from both E and D?

Incidentally, note that an unsolvable problem that should stay unsolvable is as follows: I’m asked to pick red or green, and told “A simulation of you given this information as well picked the wrong color and got shot.” Whichever choice I make, I deduce that the other choice was better. The exact details here will depend on how I believe the simulator chose to tell me this, but ceteris paribus it’s an unsolvable problem.

Perhaps I’m misunderstanding you here, but D and E are Platonic computations. What does it mean to construct a causal DAG among Platonic computations? [EDIT: Ok, I may understand that a little better now; see my edit to my reply to (1).] Such a graph links together general mathematical facts, so the same issues arise as in (1), it seems to me: Do the links correspond to logical inference, or something else? What makes the graph acyclic? Is mathematical causality even coherent? And if you did have a module that can detect (presumably timeless) causal links among Platonic computations, then why not use that module directly to solve your decision problems?

Plus I’m not convinced that there’s a meaningful distinction between math knowledge that you gain by genuine math reasoning, and math knowledge that you gain by physical observation.

Let’s say, for instance, that I feed a particular conjecture to an automatic theorem prover, which tells me it’s true. Have I then learned that math fact by genuine mathematical reasoning (performed by the physical computer’s Platonic abstraction)? Or have I learned it by physical observation (of the physical computer’s output), and hence be barred from using that math fact for purposes of TDT’s logical-dependency-detection? Presumably the former, right? (Or else TDT will make even worse errors.)

But then suppose the predictor has simulated the universe sufficiently to establish that U (the universe’s algorithm, including physics and initial conditions) leads to there being $1M in the box in this situation. That’s a mathematical fact about U, obtained by (the simulator’s) mathematical reasoning. Let’s suppose that when the predictor briefs me, the briefing includes mention of this mathematical fact. So even if I keep my eyes closed and never physically see the $1M, I can rely instead on the corresponding mathematically derived fact.

(Or more straightforwardly, we can view the universe itself as a computer that’s performing mathematical reasoning about how U unfolds, in which case any physical observation is intrinsically obtained by mathematical reasoning.)

Logical uncertainty has always been more difficult to deal with than physical uncertainty; the problem with logical uncertainty is that if you analyze it enough, it *goes away*. I’ve never seen any really good treatment of logical uncertainty.

But if we depart from TDT for a moment, then it does seem clear that we need to have *causelike nodes* corresponding to logical uncertainty in a DAG which describes our probability distribution. There is no other way you can completely observe the state of a calculator sent to Mars and a calculator sent to Venus, and yet remain uncertain of their outcomes yet believe the outcomes are correlated. And if you talk about error-prone calculators, two of which say 17 and one of which says 18, and you deduce that the “Platonic answer” was probably in fact 17, you can see that logical uncertainty behaves in an even more causelike way than this.

So, going back to TDT, my hope is that there’s a neat set of rules for factoring our logical uncertainty in our causal beliefs, and that these same rules also resolve the sort of situation that you describe.

If you consider the notion of the correlated error-prone calculators, two returning 17 and one returning 18, then the most intuitive way to handle this would be to see a “Platonic answer” as its own causal node, and the calculators as error-prone descendants. I’m pretty sure this is how my brain is drawing the graph, but I’m not sure it’s the correct answer; it seems to me that a more principled answer would involve uncertainty about *which* mathematical fact affects each calculator—physically uncertain gates which determine which calculation affects each calculator.

For the (D xor E) problem, we know the behavior we *want* the TDT calculation to exhibit; we want (D xor E) to be a descendant node of D and E. If we view the physical observation of $1M as telling us the raw mathematical fact (D xor E), and then perform mathematical inference on D, we’ll find that we can affect E, which is not what we want. Conversely, if we view D as having a physical effect, and E as having a physical effect, and the node (D xor E) as a physical descendant of D and E, we’ll get the behavior we want. So the question is whether there’s any principled way of setting this up which will yield the second behavior rather than the first, and also, presumably, yield epistemically correct behavior when reasoning about calculators and so on.

That’s if we go down avenue (2). If we go down avenue (1), then we give primacy to our intuition that if-counterfactually you make a different decision, this logically controls the mathematical fact (D xor E) with E held constant, but does not logically control E with (D xor E) held constant. While this does sound intuitive in a sense, it isn’t quite nailed down—after all, D is ultimately just as constant as E and (D xor E), and to change any of them makes the model equally inconsistent.
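The contrast between the two behaviors can be sketched in a few lines. This is a toy model under my own boolean encoding, not TDT machinery:

```python
def infer_E_logically(D, F_observed):
    """Treat the observed $1M as the raw mathematical fact F = (D xor E)
    and solve for E.  Counterfactually flipping D then flips the inferred E
    as well -- the behavior we do NOT want."""
    return D != F_observed  # E = D xor F

def causal_counterfactual(D, E):
    """Treat E as a fixed parent and F as a physical descendant of D and E.
    Counterfactually flipping D changes F but leaves E alone -- the
    behavior we DO want."""
    F = D != E
    return E, F
```

With F observed true, the logical route infers E = not-D, so surgically flipping D “changes” a digit of pi; the causal route recomputes only F while E stays fixed.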

These sorts of issues are something I’m still thinking through, as I think I’ve mentioned, so let me think out loud for a bit.

In order to observe anything that you think has already been controlled by your decision—any physical thing in which a copy of D has already played a role—then (leaving aside the question of Omega’s strategy that simulated alternate versions of you to select a self-consistent problem, and whether this introduces conditional-strategy-dependence rather than just decision-dependence into the problem) there have to be *other physical facts* which combine with D to yield our observation. Some of these physical facts may themselves be affected by mathematical facts, like an implemented computation of E; but the point is that in order to have observed anything controlled by D, we already had to draw a *physical, causal* diagram in which other nodes *descended* from D.

So suppose we introduce the rule that in every case like this, we will have some physical node that is affected by D, and if we can observe info that depends on D in any way, we’ll view the other mathematical facts as combining with D’s physical node. This is a rule that tells us not to draw the diagram with a physical node being determined by the mathematical fact D xor E, but rather to have a physical node determined by D, and then a physical descendant D xor E. (Which in this particular problem should descend from a physical node E that descends from the mathematical fact E, because the mathematical fact E is correlated with our uncertainty about other things, and a factored causal graph should have no remaining correlated sources of background uncertainty; but if E didn’t correlate to anything else in particular, we could just have D descending to (D xor E) via the (xor with E) rule.)

When I evaluate this proposed solution for ad-hoc-ness, it does admittedly look a bit ad-hoc, but it solves at least *one* other problem than the one I started with, and which I didn’t think of until now. Suppose Omega tells me that I make the same decision in the Prisoner’s Dilemma as Agent X. *This does not necessarily imply that I should cooperate with Agent X.* X and I could have made the same decision for different (uncorrelated) reasons, and Omega could have simply found out by simulating the two of us that X and I gave the same response. In this case, presumably defecting; but if I cooperated, X wouldn’t do anything differently. X is just a piece of paper with “Defect” written on it.

If I draw a causal diagram of how I came to learn this correlation from Omega, and I follow the rule of always drawing a causal boundary around the mathematical fact D as soon as it physically affects something, then, given the way Omega simulated both of us to observe the correlation, I see that D and X separately physically affected the correlation-checker node.

On the other hand, if I can analyze the two pieces of code D and X and see that they return the same output, without yet knowing the output, then this knowledge was obtained in a way that doesn’t involve D producing an output, so I don’t have to draw a hard causal boundary around that output.

If this works, the underlying principle that makes it work is something along the lines of “for D to control X, the correlation between our uncertainty about D and X has to emerge in a way that doesn’t involve anyone already computing D”. Otherwise D has no free will (said firmly tongue-in-cheek). I am not sure that this principle has any more elegant expression than the rule, “whenever, in your physical model of the universe, D *finishes* computing, draw a physical/causal boundary around that finished computation and have other things physically/causally descend from it”. If this principle is violated then D ends up “correlated” to all sorts of other things we observe, like the price of fish and whether it’s raining outside, via the magic of xor.

When you use terms like “draw a hard causal boundary” I’m forced to imagine you’re actually drawing these things on the back of a cocktail napkin somewhere using some sorts of standard symbols. Are there such standards, and do you have such diagrams scanned in online somewhere?

ETA: A note for future readers: Eliezer below is referring to Judea Pearl (simply “Pearl” doesn’t convey much via google-searching, though I suppose “pearl causality” does at the moment)

Read Pearl. I *think* his online intros should give you a good idea of what the cocktail napkin looks like.

Hmm… Pearl uses a lot of diagrams but they all seem pretty ad-hoc. Just the sorts of arrows and dots and things that you’d use to represent any graph (in the mathematics sense). Should I infer from this description that the answer is, “No, there isn’t a standard”?

I was picturing something like a legend that would tell someone, “Use a dashed line for a causal boundary, and a red dotted line to represent a logical inference, and a pink squirrel to represent postmodernism”

Um… I’m not sure there’s much I can say to that beyond “Read Probabilistic Reasoning in Intelligent Systems, or Causality”.

Pearl’s system is not ad-hoc. It is very not ad-hoc. It has a metric fuckload of math backing up the simple rules. But Pearl’s system does not include logical uncertainty. I’m trying to put logical uncertainty into it, while obeying the rules. This is a work in progress.

I’d just like to register a general approval of specifying that one’s imaginary units are *metric*.

FWIW, Thomblake’s observation may be that while Pearl’s system is extremely rigorous, the diagrams used do not give an authoritative standard style for diagram drawing.

That’s correct—I was looking for a standard style for diagram drawing.

I’m rereading past discussions to find insights. This jumped out at me:

Do you still believe this?

Playing chicken with Omega may result in you becoming counterfactual.

Why is cooperation more likely to qualify as “playing chicken” than defection here?

I was referring to the example Eliezer gives with your opponent being a DefectBot, in which case cooperating makes Omega’s claim false, which may just mean that you’d make your branch of the thought experiment counterfactual, instead of convincing DefectBot to cooperate:

So? That doesn’t hurt my utility in reality. I would cooperate because that wins if agent X is correlated with me, and doesn’t lose otherwise.

Winning is about how the alternatives you choose between compare. By cooperating against a same-action DefectBot, you are choosing nonexistence over (D,D), which is not *obviously* a neutral choice.

I don’t think this is how it works. Particular counterfactual instances of you can’t influence whether they are counterfactual or exist in some stronger sense. They can only choose whether there are more real instances with identical experiences (and their choices can sometimes acausally influence what happens with real instances, which doesn’t seem to be the case here, since the real you will choose defect either way, as predicted by Omega). Hypothetical instances don’t lose anything by being in the branch that chooses the opposite of what the real you chooses, unless they value being identical to the real you, which IMO would be silly.

What can influence things like that? Whatever property of a situation can mark it as counterfactual (more precisely, given by a contradictory specification, or not following from a preceding construction, assumed-real past state for example), that property could as well be a decision made by an agent present in that situation. There is nothing special about agents or their decisions.

Why do you think something can influence it? Whether you choose to cooperate or defect, you can always ask both “what would happen if I cooperated?” and “what would happen if I defected?”. In as far as being counterfactual makes sense the alternative to being the answer to “what would happen if I cooperated?” is being the answer to “what would happen if I defected?”, even if you know that the real you defects.

Compare Omega telling you that your answer will be the same as the Nth digit of Pi. That doesn’t allow you to choose the Nth digit of Pi.

This becomes a (relatively) straightforward matter of working out where the (potentially counterfactual—depending what you choose) calculation is being performed to determine exactly what this ‘nonexistence’ means. Since this particular thought experiment doesn’t seem to specify any other broader context, I assert that cooperate *is* clearly the correct option. Any agent which doesn’t cooperate is broken.

Basically, if you ever find yourself in this situation then you don’t matter. It’s your job to play chicken with the universe and not exist so the actual you can win.

Agent X is a piece of paper with “Defect” written on it. I defect against it. Omega’s claim is true and does not imply that I should cooperate.

I don’t see this argument making sense. Omega’s claim reduces to negligible chances that a choice of Defection will be advantageous for me, because Omega’s claim makes it of negligible probability that either (D,C) or (C,D) will be realized. So I can only choose between the worlds of (C,C) and (D,D). Which means that the Cooperation world is advantageous, and that I *should* Cooperate.

In contrast, if Omega had claimed that we’d make the opposite decisions, then I’d only have to choose between the worlds of (D,C) or (C,D)—with the worlds of (C,C) and (D,D) now having negligible probability. In which case, I should, of course, Defect.

The reasons for the correlation between me and Agent X are irrelevant when the *fact* of their correlation is known.

Sorry, was this intended as part of the problem statement, like “Omega tells you that agent X is a DefectBot that will play the same as you”? If yes, then ok. But if we don’t know what agent X is, then I don’t understand why a DefectBot is a priori more probable than a CooperateBot. If they are equally probable, then it cancels out (*edit:* no it doesn’t, it actually makes cooperating a better choice, thx ArisKatsaris). And there’s also the case where X is a copy of you, where cooperating does help. So it seems to be a better choice overall.

There is also a case where X is an anticopy (performs the opposite action), which argues for defecting in the same manner. *Edit: This reply is wrong.*

No it doesn’t. If X is an anticopy, the situation can’t be real and your action doesn’t matter.

Why can’t it be real?

Because Omega has told you that X’s action is the same as yours.

OK.

I agree this sounds intuitive. As I mentioned earlier, though, nailing this down is tantamount to circling back and solving the full-blown problem of (decision-supporting) counterfactual reasoning: the problem of how to distinguish which facts to “hold fixed”, and which to “let vary” for consistency with a counterfactual antecedent.

In any event, is the idea to try to build a separate graph for math facts, and use that to analyze “logical dependency” among the Platonic nodes in the original graph, in order to carry out TDT’s modified “surgical alteration” of the original graph? Or would you try to build one big graph that encompasses physical and logical facts alike, and then use Pearl’s decision procedure without further modification?

Wait, isn’t it decision-computation C—rather than simulation D—whose “effect” (in the sense of logical consequence) on E we’re concerned about here? It’s the logical dependents of C that get surgically altered in the graph when C gets surgically altered, right? (I know C and D are logically equivalent, but you’re talking about inserting a physical node after D, not C, so I’m a bit confused.)

I’m having trouble following the gist of avenue (2) at the moment. Even with the node structure you suggest, we can still infer E from C and from the physical node that matches (D xor E)—unless the new rule prohibits relying on that physical node, which I guess is the idea. But what exactly is the prohibition? Are we forbidden to infer any mathematical fact from any physical indicator of that fact? Or is there something in particular about node (D xor E) that makes it forbidden? (It would be circular to cite the node’s dependence on C in the very sense of “dependence” that the new rule is helping us to compute.)

I definitely want one big graph if I can get it.

Sorry, yes, C.

No, but whenever we see a *physical* fact F that depends on a decision C/D we’re still in the process of making, plus Something Else (E), then we express our uncertainty in the form of a *causal* graph with directed arrows from C to D, D to F, and E to F. Thus when we compute a *counterfactual* on C, we find that F changes, but E does not.

Wait, F depends on decision computation C in what sense of “depends on”? It can’t quite be the originally defined sense (quoted from your email near the top of the OP), since that defines dependency between Platonic computations, not between a Platonic computation and a physical fact. Do you mean that D depends on C in the original sense, and F in turn depends on D (and on E) in a different sense?

Ok, but these arrows can’t be used to define the relevant sense of dependency above, since the relevant sense of dependency is what tells us we need to draw the arrows that way, if I understand correctly.

Sorry to keep being pedantic about the meaning of “depends”; I know you’re in thinking-out-loud mode here. But the theory gives wildly different answers depending (heh) on how that gets pinned down.

In my view, the chief form of “dependence” that needs to be discriminated is inferential dependence versus causal dependence. If earthquakes *cause* burglar alarms to go off, then we can *infer* an earthquake from a burglar alarm or *infer* a burglar alarm from an earthquake. Logical reasoning doesn’t have the kind of directionality that causation does—or at least, classical logical reasoning does not—there’s no preferred form between ~A->B, ~B->A, and A \/ B.

The link between the Platonic decision C and the physical decision D might be different from the link between the physical decision D and the physical observation F, but I don’t know of anything in the current theory that calls for treating them differently. They’re just directional causal links. On the other hand, if C mathematically implies a decision C-2 somewhere else, that’s a logical implication that ought to symmetrically run backward to ~C-2 → ~C, except of course that we’re presumably controlling/evaluating C rather than C-2.

Thinking out loud here, the view is that your mathematical uncertainty ought to be in one place, and your physical uncertainty should be built on top of your mathematical uncertainty. The mathematical uncertainty is a logical graph with symmetric inferences, the physical uncertainty is a directed acyclic graph. To form controlling counterfactuals, you update the mathematical uncertainty, including any logical inferences that take place in mathland, and watch it propagate downward into the physical uncertainty. When you’ve already observed facts that physically depend on mathematical decisions you control but you haven’t yet made and hence whose values you don’t know, then those observations stay in the causal, directed, acyclic world; when the counterfactual gets evaluated, they get updated in the Pearl, directional way, not the logical, symmetrical inferential way.

No, D was the Platonic simulator. That’s why the nature of the C->D dependency is crucial here.

Okay, then we have a logical link from C-platonic to D-platonic, and causal links descending from C-platonic to C-physical, E-platonic to E-physical, and D-platonic to D-physical to F-physical = D-physical xor E-physical. The idea being that when we counterfactualize on C-platonic, we update D-platonic and its descendants, but not E-platonic or its descendants.
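Under one reading of this graph (boolean nodes, with the simulator D computing the same function as C; the encoding is my assumption, not Eliezer’s notation), the propagation rule looks like:

```python
def propagate(C_platonic, E_platonic):
    """Propagate settings of the platonic nodes through the described graph."""
    D_platonic = C_platonic        # logical link: the simulator computes C's output
    C_physical = C_platonic        # causal descent from each platonic node
    D_physical = D_platonic
    E_physical = E_platonic
    F_physical = D_physical != E_physical   # F = D xor E
    return {"C": C_physical, "D": D_physical, "E": E_physical, "F": F_physical}

# Counterfactualizing on C-platonic: E-platonic is held fixed, so only
# D and F differ between the actual and counterfactual worlds.
actual = propagate(C_platonic=True, E_platonic=False)
counterfactual = propagate(C_platonic=False, E_platonic=False)
```

The point of the sketch is just that surgery on C reaches F only through the D chain, never by back-inference into E.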

I suppose that as written, this requires a rule, “for purposes of computing counterfactuals, keep in the causal graph rather than the logical knowledge base, any mathematical knowledge gained by observing a fact descended from your decision-output or any logical implications of your decision-output”. I could hope that this is a special case of something more elegant, but it would only be hope.

Ok. I think it would be very helpful to sketch, all in one place, what TDT2 (i.e., the envisioned avenue-2 version of TDT) looks like, taking care to pin down any needed sense of “dependency”. And similarly for TDT1, the avenue-1 version. (These suggestions may be premature, I realize.)

If X isn’t like us, we can’t “control” X by making a decision similar to what we would want X to output*. We shouldn’t go from being an agent that defects in the prisoner’s dilemma with Agent X when told we “make the same decision in the Prisoner’s Dilemma as Agent X” to being one that does not defect, just as we do not unilaterally switch from natural to precision bidding when in contract bridge a partner opens with two clubs (which signals a good hand under precision bidding, and not under natural bidding).

However, there do exist agents who should cooperate every time they hear they “make the same decision in the Prisoner’s Dilemma as Agent X”, those who have committed to cooperate in such cases. In some such cases, they are up against pieces of paper on which “cooperate” is written (too bad they didn’t have a more discriminating algorithm/clear Omega), in others, they are up against copies of themselves or other agents whose output depends on what Omega tells them. In any case, many agents should cooperate when they hear that.

Yes? No?

Why shouldn’t one be such an agent? Do we know ahead of time that we are likely to be up against pieces of paper with “cooperate” on them, and that Omega would unhelpfully tell us we “make the same decision in the Prisoner’s Dilemma as Agent X” in all such cases, though if we had a different strategy we could have gotten useful information and *defected* in that case?

*Other cases include us defecting to get X to cooperate, and others where X’s play depends on ours, but this is the natural case to use when considering whether Agent X’s action depends on ours: a not strategically incompetent Agent X that has a strategy at least as good as always defecting or cooperating and does not try to condition his cooperating on our defecting or the like.

“Makes true” means logically implies? Why would that graph be acyclic? [EDIT: Wait, maybe I see what you mean. If you take a pdf of your beliefs about various mathematical facts, and run Pearl’s algorithm, you should be able to construct an acyclic graph.]

Although I know of no worked-out theory that I find convincing, I believe that counterfactual inference (of the sort that’s appropriate to use in the decision computation) makes sense with regard to events in universes characterized by certain kinds of physical laws. But when you speak of mathematical counterfactuals more generally, it’s not clear to me that that’s even coherent.

Plus, if you did have a general math-counterfactual-solving module, why would you relegate it to the logical-dependency-finding subproblem in TDT, and then return to the original factored causal graph? Instead, why not cast the whole problem as a mathematical abstraction, and then directly ask your math-counterfactual-solving module whether, say, (Platonic) C’s one-boxing counterfactually entails (Platonic) $1M? (Then do the argmax over the respective math-counterfactual consequences of C’s candidate outputs.)

I’ve been reviewing some of this discussion, and noticed that Eliezer hasn’t answered the question in your last paragraph. Here is his answer to one of my questions, which is similar to yours. But I’m afraid I still don’t have a really good understanding of the answer. In other words, I’m still not really sure why we need all the extra machinery in TDT, when having a general math-counterfactual-solving module (what I called “mathematical intuition module”) seems both necessary and sufficient.

I wonder if you, or anyone else, understands this well enough to try to explain it. It might help me, and perhaps others, to understand Eliezer’s approach to see it explained in a couple of different ways.

This is basically the approach I took in (what I now call) UDT1.

For now, let me just reply to your incidental concluding point, because that’s brief.

I disagree that the red/green problem is unsolvable. I’d say the solution is that, with respect to the available information, both choices have equal (low) utility, so it’s simply a toss-up. A correct decision algorithm will just flip a coin or whatever.

Having done so, will a correct decision algorithm try to revise its choice in light of its (tentative) new knowledge of what its choice is? Only if it has nothing more productive to do with its remaining time.

Actually, one can do even better than that. As (I think) Eliezer implied, the key is Omega *saying* those words (about the simulated you getting it wrong).

Did the simulated version receive that message too? (If yes, and if we assume Omega is always truthful, this implies an infinite recursion of simulations… let us not go invoking infinite nested computations willy-nilly.) If there was only a single layer of simulation, then Omega either gave that statement as input to it or did not. If yes, Omega is untruthful, which throws pretty much all of the standard reasoning about Omega out the window, and we can simply take into account the possibility that Omega is blatantly lying.

If Omega is truthful, even to the simulations, then the simulation would *not* have received that prefix message. In which case you are in a different state than simulated you was. So all you have to do is make the decision opposite to what you would have done if you hadn’t heard that particular extra message. This may be guessed by simply one iteration of “I automatically want to guess color1… but wait, simulated me got it wrong, so I’ll guess color2 instead”, since the “actual” you has the knowledge that the previous version of you got it wrong.

If Omega lies to simulations and tells the truth to “actuals” (and can somehow simulate without the simulation being conscious, so there’s no ambiguity about which you are, yet still be accurate… (am skeptical but confused on that point)), then we have an issue. But then it would require Omega to take a risk: if, when telling the lie to the simulation, the simulation then gets it right, then what does Omega tell the “actual” you?

(“actual” in quotes because I honestly don’t know whether or not one could be modeled with sufficient accuracy, however indirectly, without the model being conscious. I’m actually kind of skeptical of the prospect of a perfectly accurate model not being conscious, although a model that can determine some properties/approximations of the person without being conscious is probably possible)

TL;DR: even without access to coinflips beyond Omega’s predictive power, one might be able to do better in the red/green problem simply by noting that the nature of the additional information Omega provided you opens up the possibility that Omega’s simulation of you was a bit different than the actual situation you are in.

Omega can use the following algorithm:

“Simulate telling the human that they got the answer wrong. If in this case they get the answer wrong, actually tell them that they get the answer wrong. Otherwise say nothing.”

This ought to make it relatively easy for Omega to truthfully put you in a “you’re screwed” situation a fair amount of the time. Albeit, if you know that this is Omega’s procedure, the rest of the time you should figure out what you would have done if Omega said “you’re wrong” and then do that.
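This procedure can be sketched as follows (my reconstruction with hypothetical names; it assumes a deterministic chooser, so the real run matches the simulation):

```python
# Sketch of the procedure described above. Omega first simulates telling
# the human they got it wrong; it actually delivers that message only
# when the simulation then guesses wrong, so the message stays truthful
# for a deterministic chooser.
def omega_procedure(human, correct_color):
    # Simulate telling the human that they got the answer wrong.
    simulated_guess = human("you got it wrong")
    if simulated_guess != correct_color:
        # The simulation guessed wrong, so Omega can truthfully say so:
        # a deterministic human will repeat the simulated (wrong) guess.
        return human("you got it wrong"), "told"
    # Otherwise Omega says nothing.
    return human(None), "silent"
```

For example, a chooser who always answers “red” is truthfully told “you’re wrong” whenever the correct color is green, and then indeed guesses wrong; when the correct color is red, Omega stays silent and the chooser happens to guess right, matching the observation above about what you should do in the silent case.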

This kind of thinking is, I think, outside the domain of current TDT, because it involves strategies that depend on actions you would have taken in counterfactual branches. I think it may even be outside the domain of current UDT for the same reason.

I don’t see why this is outside of UDT’s domain. It seems straightforward to model and solve the decision problem in UDT1. Here’s the world program:

Assuming a preference to maximize the occurrence of outcome=”live” averaged over P(“green”) and P(“red”), UDT1 would conclude that the optimal S returns a constant, either “green” or “red”, and do that.

BTW, do you find this “world program” style analysis useful? I don’t want to over-do them and get people annoyed. (I refrained from doing this for the problem described in Gary’s post, since it doesn’t mention UDT at all, and therefore I’m assuming you want to find a TDT-only solution.)

Yes, I was focusing on a specific difficulty in TDT, But I certainly have no objection to bringing UDT into the thread too. (I myself haven’t yet gotten around to giving UDT the attention I think it deserves.)

The world program I would use to model this scenario is:

The else branch seems unreachable, given color = S(“you’re wrong”) and the usual assumptions about Omega.

I don’t understand what your nested if statements are modeling.

I was modeling what Eliezer wrote in the comment that I was responding to:

BTW, if you add a tab in front of each line of your program listing, it will get formatted correctly.

Ah, I see. Then it seems that you are really solving the problem of minimizing the probability that Omega presents this problem in the first place.

What about the scenario where Omega uses the strategy: Simulate telling the human that they got the answer wrong. Define the resulting answer as wrong, and the other as right.

This is what I modeled.

Thanks. Is there an easier way to get a tab into the comment input box than copy paste from an outside editor?

In that case it should be modeled like this:
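The listing itself appears to have been lost to the comment formatting; here is a plausible reconstruction (a sketch consistent with the surrounding discussion, not necessarily the original code):

```python
# Hypothetical reconstruction: Omega simulates S being told "you're
# wrong" and defines that simulated answer as the wrong one. The real
# run receives the identical message, so a deterministic S always dies.
def P(S):
    wrong_color = S("you're wrong")   # Omega's simulation fixes what counts as wrong
    choice = S("you're wrong")        # the actual run gets the same input
    return "live" if choice != wrong_color else "die"
```

For any deterministic S the two calls coincide, so the outcome is “die” regardless of strategy, which matches the observation elsewhere in this thread that the output is outcome=”die” for all S.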

Not that I’m aware of.

Are you guys talking about getting code to indent properly? You can do that by typing four spaces in front of each line. Each quadruple of spaces produces a further indentation.

http://daringfireball.net/projects/markdown/syntax#precode

Spaces? Think of the wasted negentropy! I say we make tab the official Less Wrong indentation symbol, and kick out anyone who disagrees. Who’s with me? :-)

Hm, I think the difference in our model programs indicates something that I don’t understand about UDT, like a wrong assumption that justified an optimization. But it seems they both produce the same result for P(S(“you’re wrong”)), which is outcome=”die” for all S.

Do you agree that this problem is, and should remain, unsolvable? (I understand “should remain unsolvable” to mean that any supposed solution must represent some sort of confusion about the problem.)

The input to P is supposed to contain the physical randomness in the problem, so P(S(“you’re wrong”)) doesn’t make sense to me. The idea is that both P(“green”) and P(“red”) get run, and we can think of them as different universes in a multiverse. Actually, in this case I should have written “def P():” since there is no random correct color.

I’m not quite sure what you mean here, but in general I suggest just translating the decision problem directly into a world program without trying to optimize it.

No, like I said, it seems pretty straightforward to solve in UDT. It’s just that even in the optimal solution you still die.

Ok, now I understood why you wrote your program the way you did.

By solve, I meant find a way to win. I think that after getting past different word use, we agree on the nature of the problem.

Fair enough.

I’m not sure the algorithm you describe here is necessarily outside current TDT though. The counterfactual still corresponds to an actual thing Omega simulated. It’d be more like this: Omega did not add the “you are wrong” prefix. Therefore, conditioning on the idea that Omega always tries simulating with that prefix and only states the prefix if I (or whoever Omega is offering the challenge to) was wrong in that simulation, the simulation in question then did not produce the wrong answer.

Therefore a sufficient property for a good answer (one with higher expected utility) is that it should have the same output as that simulation. Therefore determine what that output was...

ie, TDT shouldn’t have much more problem (in principle) with that than with being told that it needs to guess the Nth digit of pi. If possible, it would simply compute the Nth digit of pi. In this case, it has to simply compute the outcome of a certain different algorithm which happens to be equivalent to its own decision algorithm when faced with a certain situation. I don’t think this would be inherently outside of current TDT as I understand it. I may be completely wrong on this, though; that’s just the way it seems to me.

As far as stuff like the problem in the OP, I suspect though that the Right Way for dealing with things analogous to counterfactual mugging (and extended to the problem in the OP) and such amounts to a very general precommitment… Or a retroactive precommitment.

My thinking here is rather fuzzy. I do suspect, though, that the Right Way probably looks something like the TDT, in advance, making a very general precommitment to be the sort of being that tends to have high expected utility when faced with counterfactual muggers and whatnot… (Or retroactively deciding to be the sort of being that effectively has the logical implication of being mathematically “precommitted” to be such.)

By “unsolvable” I mean that you’re screwed over in final outcomes, not that TDT fails to have an output.

The interesting part of the problem is that, whatever you decide, you deduce facts about the background such that you know that what you are doing is the wrong thing. However, if you do anything differently, you would have to make a different deduction about the background facts, and again know that what you were doing was the wrong thing. Since we don’t believe that our decision is capable of affecting the background facts, the background facts ought to be a fixed constant, and we should be able to alter our decision without affecting the background facts… however, as soon as we do so, our inference about the unalterable background facts changes. It’s not 100% clear how to square this with TDT.

This is like trying to decide whether this statement is true:

“You will decide that this statement is false.”

There is nothing paradoxical about this statement. It is either true or false. The only problem is that you can’t get it right.

Actually, there is an optimal solution to this dilemma. Rather than use any internal process to decide, using a truly random process gives a 50% chance of survival. If you base your decision on a quantum randomness source, in principle no simulation can predict your choice (or rather, a complete simulation would correctly predict you fail in 50% of possible worlds).

Knowing how to use randomness against an intelligent adversary is important.

Gary postulated an infallible simulator, which presumably includes your entire initial state and all pseudorandom algorithms you might run. Known quantum randomness methods can only amplify existing entropy, not manufacture it ab initio. So you have no recourse to coinflips.

EDIT: Oops! pengvado is right. I was thinking of the case discussed here, where the random bits are provided by some quantum black box.

Quantum coinflips work even if Omega can predict them. It’s like a branch-both-ways instruction. Just measure some quantum variable, then measure a noncommuting variable, and voila, you’ve been split into two or more branches that observe different results and thus can perform different strategies. Omega’s perfect predictor tells it that you will do both strategies, each with half of your original measure. There is no arrangement of atoms (encoding the right answer) that Omega can choose in advance that would make both of you wrong.

I agree, and for this reason whenever I make descriptions I make Omega’s response to quantum smart-asses and other randomisers explicit and negative.

If Omega wants to smack down the use of randomness, I can’t stop it. But there are a number of game theoretic situations where the optimal response is random play, and any decision theory that can’t respond correctly is broken.

Does putting the ‘quantum’ in a black box change anything?

Not sure I know which question you’re asking:

1. A black box RNG is still useless despite being based on a quantum mechanism, or

2. That a quantum device will necessarily manufacture random bits.

Counterexamples to 2 are pretty straightforward (quantum computers), so I’m assuming you mean 1. I’m operating at the edge of my knowledge here (as my original mistake shows), but I think the entire point of Pironio et al’s paper was that you can verify random bits obtained from an adversary, subject to the conditions:

Bell inequality violations are observable (i.e., it’s a quantum generator).

The adversary can’t predict your measurement strategy.

Am I misunderstanding something?

Oh ok. So it’s unsolvable in the same sense that “Choose red or green. Then I’ll shoot you.” is unsolvable. Sometimes choice really is futile. :) [EDIT: Oops, I probably misunderstood what you’re referring to by “screwed over”.]

Yes, assuming that you’re the sort of algorithm that can (without inconsistency) know its own choice here before the choice is executed.

If you’re the sort of algorithm that may revise its intended action in response to the updated deduction, and if you have enough time left to perform the updated deduction, then the (previously) intended action may not be reliable evidence of what you will actually do, so it fails to provide sound reason for the update in the first place.

If mathematical truths were drawn in a DAG, it’s unclear how counterfactuals would work. Since math is consistent, negating any true statement yields a contradiction, and then, by the principle of explosion, every statement becomes derivable. The counterfactual graph would therefore be completely uninformative.

Or, perhaps, it would just generate another system of math. But then you have to know the inferential relationship between that new math and the rest of the world.

I don’t see how logical entailment acts as functional causal dependence in Pearl’s account of causation. Can you explain?

Pearl’s account doesn’t include logical uncertainty at all so far as I know, but I made my case here

http://lesswrong.com/lw/15z/ingredients_of_timeless_decision_theory/

that Pearl’s account has to be modified to include logical uncertainty on purely epistemic grounds, never mind decision theory.

If this isn’t what you’re asking about then please further clarify the question?

The treatment of identical inputs to duplicate functions also arises in the handling of counterfactuals (since one duplicates the causal graph across the worlds of interest). The treatment I am familiar with is systematic merging of portions of the counterfactual graph that can be proved to be the same. I don’t really understand why this issue is about logic (rather than about duplication).

What was confusing me, however, was the remark that it is possible to create causal graphs of mathematical facts (presumably with entailment functioning as a causal relationship between facts). I really don’t see how this can be done. In particular the result is highly cyclic, infinite for most interesting theories, and it is not clear how to define interventions on such graphs in a satisfactory way.

I was going to suggest (2) myself, but then I realized that it seems to follow directly from your definition of “dependent on”, so you must have thought of it yourself:

I think this problem is based (at least in part) on an incoherence in the basic transparent box variant of Newcomb’s problem.

If the subject of the problem will two-box if he sees that the big box has the million dollars, but will one-box if he sees that the big box is empty, then there is no action Omega could take to satisfy the conditions of the problem.

In this variant that introduces the digit of pi, there is an unknown bit such that whatever strategy the subject takes, there is a value of that bit that allows Omega an action consistent with the conditions. However, that does not mean the bit actually has that value; it may in fact have the other value, and then the problem still is not coherent.

I suspect that there is still something this says about TDT, but I am not sure how to illustrate it with an example that does not also have the problem I have described.

Edit: As I was typing this, Eliezer posted his reply, including “an unsolvable problem that should stay unsolvable”, which is equivalent to the problem I have described.

The rules of the transparent-boxes problem (as specified in Good and Real) are: the predictor conducts a simulation that tentatively presumes there will be $1M in the large box, and then puts $1M in the box (for real) iff the simulation showed one-boxing. So the subject you describe gets an empty box and one-boxes, but that doesn’t violate the conditions of the problem, which do not require the empty box to be predictive of the subject’s choice.

I drew a causal graph of this scenario (with the clarification you just provided), and in order to see the problem with TDT you describe, I would have to follow a causation arrow backwards, like in Evidential Decision Theory, which I don’t think is how TDT handles counterfactuals.
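The Good and Real rules quoted above can be sketched as follows (a paraphrase with hypothetical names; S returns 1 for one-boxing, 0 for two-boxing):

```python
# Sketch of the Good and Real transparent-boxes rules: the predictor
# simulates the agent presuming the big box visibly contains $1M, and
# fills it (for real) iff that simulation one-boxes.
def transparent_boxes(S):
    one_boxes_if_full = (S("box contains $1M") == 1)
    box_full = one_boxes_if_full
    big = 1000000 if box_full else 0
    # The real agent now chooses, seeing the actual contents.
    choice = S("box contains $1M" if box_full else "box is empty")
    return big if choice == 1 else big + 1000
```

Under these rules a consistent one-boxer gets $1M, a consistent two-boxer gets $1000, and the subject described above (two-box on seeing $1M, one-box on seeing empty) gets an empty box, one-boxes, and walks away with $0, with no inconsistency on the predictor’s part.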

The backward link isn’t causal. It’s a logical/Platonic-dependency link, which is indeed how TDT handles counterfactuals (i.e., how it handles the propagation of “surgical alterations” to the decision node C).

My understanding of the link in question, is that the logical value of the digit of pi causes Omega to take the physical action of putting the money in the box.

See Eliezer’s second approach:

My original post addressed Eliezer’s original specification of TDT’s sense of “logical dependency”, as quoted in the post.

I don’t think his two proposals for revising TDT are pinned down enough yet to be able to tell what the revised TDTs would decide in any particular scenario. Or at least, my own understanding of the proposals isn’t pinned down enough yet. :)

Ah, I was working from different assumptions. That at least takes care of the basic clear box variant. I will have to think about the digit of pi variation again with this specification.

In this case the paradox lies within having made a false statement about Omega, not about TDT. In other words, it’s not a problem with the decision theory, but a problem with what we supposedly believe about Omega.

But yes, whenever you suppose that the agent can observe an effect of its decision before making that decision, there must be a consistent account of how Omega simulates possible versions of you that see different versions of your own decision, and on that basis selects at least one consistent version to show you. In general, I think, maximizing may require choosing among possible strategies for sets of conditional responses. And this indeed intersects with some of the open issues in TDT and UDT.

This is what I was alluding to by saying, “The exact details here will depend on how I believe the simulator chose to tell me this”.

Yes, that is what I meant.

In considering this problem, I was wondering if it had to do with the directions of arrows on the causal graph, or a distinction between the relationships directly represented in the graph and those that can be derived by reasoning about the graph, but this false statement about Omega is getting in my way of investigating this.

I’m not clear at all what the problem is, but it seems to be semantic. It’s disturbing that this post can get 17 upvotes with almost no (2?) comments actually referring to what you’re saying, indicating that no one else here really gets the point either.

It seems you have an issue with the word ‘dependent’ and the definition that Eliezer provided. Under that definition, E (the ith digit of pi) would be dependent on C (our decision to one- or two-box) if we two-boxed and got a million dollars, because then we would know that E = 0, and we would not have known this if we had not two-boxed. So we can infer E from C, thus dependency. By Eliezer’s definition, which seems to be a special information-theoretical definition, I see no problem with this conclusion. The problem only seems to arise if you then take the intuitive definition of the word ‘dependent’ as meaning ‘contingent upon,’ as in ‘Breaking the egg is contingent upon my dropping it.’ Your semantic complaint goes beyond Newcomb: by Eliezer’s definition of ‘dependent,’ the pH of water (E) is dependent upon our litmus testing it, since the result of the litmus test (C) allows us to infer the water’s actual pH. C lets us infer E, thus dependency.

Sorry, the above post omits some background information. If E “depends on” C in the particular sense defined, then the TDT algorithm mandates that when you “surgically alter” the output of C in the factored causal graph, you must correspondingly surgically alter the output of E in the graph.

So it’s not at all a matter of any intuitive connotation of “depends on”. Rather, “depends on”, in this context, is purely a technical term that designates a particular test that the TDT algorithm performs. And the algorithm’s prescribed use of that test culminates in the algorithm making the wrong decision in the case described above (namely, it tells me to two-box when I should one-box).

No, I still don’t get why adding in the ith-digit-of-pi clause changes Newcomb’s problem at all. If Omega says you’ll one-box and you two-box, then Omega was wrong, plain and simple. The ith digit of pi is an independent clause. I don’t see how one’s desire to make i=0 by two-boxing after already getting the million is any different than one wanting to make Omega wrong by two-boxing after getting the million. If you are the type of person who, after getting the million, thinks, “Gee, I want i=0! I’ll two-box!” then Omega wouldn’t have given you the million to begin with. After determining that he would not give you the million, he’d look at the ith digit of pi and either put the million in or not. Your two-boxing has nothing to do with i.

If D=false and E=true and there’s $1M in the box and I two-box, then (in the particular Newcomb’s variant described above) the predictor is not wrong. The predictor correctly computed that (D xor E) is true, and set up the box accordingly, as the rules of this particular variant prescribe.

Yes, but your two-boxing didn’t cause i=0; rather, the million was there because i=0. I’m saying that if (D or E) = true and you get a million dollars, and you two-box, then you haven’t caused E=0. E=0 before you two-boxed, or if it did not, then Omega was wrong and thought D = one-box, when in fact you are a two-boxer.

Everything you just said is true.*

Everything you just said is also consistent with everything I said in my original post.

*Except for one typo: you wrote (D or E) instead of (D xor E).

I’m in the same confused camp as Laura. This paragraph confuses me.

Why is it the wrong decision? If Omega can perfectly predict the TDT agent, and the TDT agent sees 1 million dollars, then it must be in a world where the ith digit of pi is 0. It is an unlikely world, to be sure.

Actually, you’re in a different camp than Laura: she agrees that it’s incorrect to two-box regardless of any preference you have about the specified digit of pi. :)

The easiest way to see why two-boxing is wrong is to imagine a large number of trials, with a different chooser, and a different value of i, for each trial. Suppose each chooser strongly prefers that their trial’s particular digit of pi be zero. The proportion of two-boxer simulations that end up with the digit equal to zero is no different than the proportion of one-boxer simulations that end up with the digit equal to zero (both are approximately .1). But the proportion of the one-boxer simulations that end up with an actual $1M is much higher (.9) than the proportion of two-boxer simulations that end up with an actual $1M (.1).
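These proportions can be checked with a quick Monte Carlo sketch (my construction, using the “$1M in the box iff (D xor E)” rule discussed in this thread; names are mine):

```python
import random

# Monte Carlo check of the many-choosers argument.
# E: the chooser's digit of pi is zero (a priori probability ~.1).
# D: the infallible simulator's prediction of one-boxing upon seeing $1M.
def trial(one_boxes, rng):
    E = rng.random() < 0.1
    box_full = one_boxes ^ E       # $1M present iff D xor E
    gets_million = box_full        # a full box is taken under either policy
    return E, gets_million

# Estimate the proportions over many trials for a fixed policy.
def proportions(one_boxes, n=100_000, seed=0):
    rng = random.Random(seed)
    results = [trial(one_boxes, rng) for _ in range(n)]
    p_digit_zero = sum(e for e, _ in results) / n
    p_million = sum(m for _, m in results) / n
    return p_digit_zero, p_million
```

Both policies see their digit come up zero about 10% of the time, but one-boxers end up with the actual $1M about 90% of the time versus about 10% for two-boxers, as stated above.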

But the proportion of two-boxers that saw $1M in the box that end up with their digit being 0 and with the $1M is even higher (1). I already saw the $1M, so, by two-boxing, aren’t I just choosing to be one of those who see their E module output True?

Not if a counterfactual consequence of two-boxing is that the large box (probably) would be empty (even though in fact it is not empty, as you can already see).

That’s the same question that comes up in the original transparent-boxes problem, of course. We probably shouldn’t try to recap that whole debate in the middle of this thread. :)

Don’t worry; I don’t want to do that :). If I recall the original transparent-boxes problem correctly, I agree with you on what to do in that case.

Just to check my memory, in the original problem, there are two transparent boxes, A and B. You see that A contains $1M and B contains $1000. You know that B necessarily contains $1000, but A would have contained $1M iff it were the case that you will decide to take only A. Otherwise, A would have been empty. The conclusion (with which I agree) is that you should take only A. Is that right? (If I’m misremembering something crucial, is there a link to the full description online?) [ETA: I see that you added a description to your post. My recollection above seems to be consistent with your description.]

In the original problem, if we use the “many choosers” heuristic, there are no choosers who two-box and yet who get the $1M. Therefore, you cannot “choose to be” one of them. This is why two-boxing should have no appeal to you.

In contrast, in your new problem, there are two-boxers who get the $1M and who get their E module to output True. So you can “choose to be” one of them, no? And since they’re the biggest winners, that’s what you should do, isn’t it?

Have some Omega thought experiments been one shot, never to be repeated type deals, or is my memory incorrect?

Yes, I wasn’t thinking through what would happen when the ith digit wasn’t 0. You can’t switch to one-boxing in that case because you don’t know when that would be; or rather, when you see an empty box you are forced to do the same as when you see a full box, due to the way the game is set up.

Yes, and that’s the intent in this example as well. Still, it can be useful to look at the expected distribution of outcomes over a large enough number of trials that have the same structure, in order to infer the (counterfactual) probabilities that apply to a single trial.

Yes, they have. And most can be formulated as such as long as p(Omega is honest) is given as ‘high’ somewhere.

In UDT1, I would model this problem using the following world program. (For those not familiar with programming convention, 0=False, and 1=True.)

We then ask, what function S maximizes the expected payout at the end of P? When S sees “box is empty” clearly it should return 0. What should it do when it sees “box contains $1M”?

If it returns 0 (i.e. two-boxes), then

with probability .1, E=1, D^E=1, and payout = 1e9 + 1001000,

with probability .9, E=0, D^E=0, and payout = 1000

If it returns 1 (i.e. one-boxes), then

with probability .1, E=1, D^E=0, and payout = 1e9 + 1000,

with probability .9, E=0, D^E=1, and payout = 1000000

So returning 1 maximizes expected payout. If S=UDT1, then whenever it’s called, it performs the above computation to determine what the optimal S* is, then returns the same value that S* would return given that input.

The updateless part of the solution is that when determining the counterfactual dependencies that are necessary to find the optimal S*, UDT1 doesn’t look at its input, so that even when called with “box contains $1M”, it still doesn’t “know” that D^E=1, in which case E is clearly independent of what it returns.
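Since the world-program listing above was garbled by the comment formatting, here is a reconstruction consistent with the payout cases enumerated above (a sketch; the names P, S, D, E follow the discussion, but the code is my paraphrase, not the original):

```python
# E = 1 iff the ith digit of pi is zero (a priori probability .1).
# S maps an observation to 1 (one-box) or 0 (two-box). The box contains
# $1M iff D xor E, where D = S("box contains $1M") as computed by the
# infallible simulator. The E * 1e9 term models a strong preference
# that E output True.
def P(E, S):
    D = S("box contains $1M")
    box_full = (D ^ E) == 1
    C = S("box contains $1M" if box_full else "box is empty")
    money = (1000000 if box_full else 0) + (1000 if C == 0 else 0)
    return money + E * 10**9

def expected_payout(S):
    return 0.1 * P(1, S) + 0.9 * P(0, S)
```

Plugging in a strategy that two-boxes on seeing $1M versus one that one-boxes reproduces the four payout cases listed above, with one-boxing yielding the higher expectation.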

I can’t follow the payouts here. For example, `1001000 - C * 1000 + E * 1e9` seems to indicate that the payout could be over $2 million. How is that possible?

The “E * 1e9” (note that 1e9 is a billion) part is supposed to model “Thus, if I happen to have a strong enough preference that E output True”. Does that help?

Ah, thanks, that makes sense now!

That’s very elegant! But the trick here, it seems to me, lies in the rules for setting up the world program in the first place.

First, the world-program’s calling tree should match the structure of TDT’s graph, or at least match the graph’s (physically-)causal links. The physically-causal part of the structure tends to be uncontroversial, so (for present purposes) I’m ok with just stipulating the physical structure for a given problem.

But then there’s the choice to use the same variable S in multiple places in the code. That corresponds to a choice (in TDT) to splice in a logical-dependency link from the Platonic decision-computation node to other Platonic nodes. In both theories, we need to be precise about the criteria for this dependency. Otherwise, the sense of dependency you’re invoking might turn out to be wrong (it makes the theory prescribe incorrect decisions) or question-begging (it implicitly presupposes an answer to the key question that the theory itself is supposed to figure out for us, namely what things are or are not counterfactual consequences of the decision-computation).

So the question, in UDT1, is: under what circumstances do you represent two real-world computations as being tied together via the same variable in a world-program?

That’s perhaps straightforward if S is implemented by literally the same physical state in multiple places. But as you acknowledge, you might instead have distinct Si’s that diverge from one another for some inputs (though not for the actual input in this case). And the different instances need not have the same physical substrate, or even use the same algorithm, as long as they give the same answers when the relevant inputs are the same, for some mapping between the inputs and between the outputs of the two Si’s. So there’s quite a bit of latitude as to whether to construe two computations as “logically equivalent”.

So, for example, for the conventional transparent-boxes problem, what principle tells us to formulate the world program as you proposed, rather than having:

(along with a similar program P2 that uses constant S2, yielding a different output from Omega_Predict)?

This alternative formulation ends up telling us to two-box. In this formulation, if S and S1 (or S and S2) are in fact the same, they would (counterfactually) differ if a different answer (than the actual one) were output from S—which is precisely what a causalist asserts. (A similar issue arises when deciding what facts to model as “inputs” to S—thus forbidding S to “know” those facts for purposes of figuring out the counterfactual dependencies—and what facts to build instead into the structure of the world-program, or to just leave as implicit background knowledge.)

So my concern is that UDT1 may covertly beg the question by selecting, among the possible formulations of the world-program, a version that turns out to presuppose an answer to the very question that UDT1 is intended to figure out for us (namely, what counterfactually depends on the decision-computation). And although I agree that the formulation you’ve selected in this example is correct and the above alternative formulation isn’t, I think it remains to explain why.

(As with my comments about TDT, my remarks about UDT1 are under the blanket caveat that my grasp of the intended content of the theories is still tentative, so my criticisms may just reflect a misunderstanding on my part.)

First, to clear up a possible confusion, the S in my P is not supposed to be a variable. It’s a constant, more specifically a piece of code that implements UDT1 itself. (If I sometimes talk about it as if it’s a variable, that’s because I’m trying to informally describe what is going on inside the computation that UDT1 does.)

For the more general question of how do we know the structure of the world program, the idea is that for an actual AI, we would program it to care about all possible world programs (or more generally, mathematical structures, see example 3 in my UDT1 post, but also Nesov’s recent post for a critique). The implementation of UDT1 in the AI would then figure out which world programs it’s in by looking at its inputs (which would contain all of the AI’s memories and sensory data) and checking which world programs call it with those inputs.

For these sample problems, the assumption is that somehow Omega has previously provided us with enough evidence for us to trust its word on what the structure of the current problem is. So in the actual P, ‘S(i, “box contains $1M”)’ is really something like ‘S(memories, omegas_explanations_about_this_problem, i, “box contains $1M”)’ and these additional inputs allow S to conclude that it’s being invoked inside this P, and not some other world program.

(An additional subtlety here is that if we consider all possible world programs, there are bound to be some other world programs where S is being called with these exact same inputs, for example ones where S is being instantiated inside a Boltzmann brain, but presumably those worlds/regions have very low weights, meaning that the AI doesn’t care much about them.)

Let me know if that answers your questions/concerns. I didn’t answer you point by point because I’m not sure which questions/concerns remain after you see my general answers. Feel free to repeat anything you still want me to answer.

Then it should be S(P), because S can’t make any decisions without getting to read the problem description.

Note that since our agent is considering possible world-programs, these world-programs are in some sense already part of the agent’s program (and the agent is in turn part of some of these world-programs-inside-the-agent, which reflects recursive character of the definition of the agent-program). The agent is a much better top-level program to consider than all-possible-world-programs, which is even more of a simplification if these world-programs somehow “exist at the same time”. When the (prior) definition of the world is seen as already part of the agent, a lot of the ontological confusion goes away.

(along with a similar program P2 that uses constant S2, yielding a different output from Omega_Predict)?

This alternative formulation ends up telling us to two-box. In this formulation, if S and S1 (or S and S2) are in fact the same, they would (counterfactually) differ if a different answer (than the actual one) were output from S—which is precisely what a causalist asserts. (A similar issue arises when deciding what facts to model as “inputs” to S—thus forbidding S to “know” those facts for purposes of figuring out the counterfactual dependencies—and what facts to build instead into the structure of the world-program, or to just leave as implicit background knowledge.)

So my concern is that UDT1 may covertly beg the question by selecting, among the possible formulations of the world-program, a version that turns out to presuppose an answer to the very question that UDT1 is intended to figure out for us (namely, what counterfactually depends on the decision-computation). And although I agree that the formulation you’ve selected in this example is correct and the above alternative formulation isn’t, I think it remains to explain why.
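For concreteness, here is a minimal sketch of the contrast (my own reconstruction, not Wei Dai’s actual code; it drops the pi-preference payout term and uses the conventional transparent-boxes payoff values, with `E` fixed as a stand-in for the pi-digit fact):

```python
# Hypothetical reconstruction of the two world-program styles.  E stands
# in for the pi-digit fact (fixed false here), and the box contains $1M
# iff (prediction xor E) holds.

def payout(prediction, choice, E=False):
    box_full = (prediction == "one") != E          # D xor E
    base = 1_000_000 if box_full else 0
    return base if choice == "one" else base + 1_000

# P-style: Omega's prediction *is* a call to the agent's own program S,
# so counterfactually varying S varies the prediction too.
def P(S):
    return payout(prediction=S(), choice=S())

# P1-style: the prediction comes from a hardwired constant program S1
# (with the side stipulation S() = S1()); varying S leaves it fixed.
def P1(S, S1_output="one"):
    return payout(prediction=S1_output, choice=S())

print(P(lambda: "one"), P(lambda: "two"))    # 1000000 1000
print(P1(lambda: "one"), P1(lambda: "two"))  # 1000000 1001000
```

Under P, one-boxing is optimal; under P1, two-boxing dominates, which is the causalist prescription described above. The difference lies entirely in which counterfactuals the formulation licenses, not in the actual run.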

(As with my comments about TDT, my remarks about UDT1 are under the blanket caveat that my grasp of the intended content of the theories is still tentative, so my criticisms may just reflect a misunderstanding on my part.)

It seems to me that the world-program is part of the problem description, not the analysis. It’s equally tricky whether it’s given in English or in a computer program; Wei Dai just translated it faithfully, preserving the strange properties it had to begin with.

My concern is that there may be several world-programs that correspond faithfully to a given problem description, but that correspond to different analyses, yielding different decision prescriptions, as illustrated by the P1 example above. (Upon further consideration, I should probably modify P1 to include “S()=S1()” as an additional input to S and to Omega_Predict, duly reflecting that aspect of the problem description.)

If there are multiple translations, then either the translations are all mathematically equivalent, in the sense that they agree on the output for every combination of inputs, or the problem is underspecified. (This seems like it ought to be the definition for the word underspecified. It’s also worth noting that all game-theory problems are underspecified in this sense, since they contain an opponent you know little about.)

Now, if two world programs were mathematically equivalent but a decision theory gave them different answers, then that would be a serious problem with the decision theory. And this does, in fact, happen with some decision theories; in particular, it happens to theories that work by trying to decompose the world program into parts, when those parts are related in a way that the decision theory doesn’t know how to handle. If you treat the world-program as an opaque object, though, then all mathematically equivalent formulations of it should give the same answer.

I assume (please correct me if I’m mistaken) that you’re referring to the payout-value as the output of the world program. In that case, a P-style program and a P1-style program can certainly give different outputs for some hypothetical outputs of S (for the given inputs). However, both programs’ payout-outputs will be the same for whatever turns out to be the actual output of S (for the given inputs).

P and P1 have the same causal structure. And they have the same output with regard to (whatever is) the actual output of S (for the given inputs). But P and P1 differ counterfactually as to what the payout-output would be if the output of S (for the given inputs) were different than whatever it actually is.

So I guess you could say that what’s unspecified are the counterfactual consequences of a hypothetical decision, given the (fully specified) physical structure of the scenario. But figuring out the counterfactual consequences of a decision is the main thing that the decision theory itself is supposed to do for us; that’s what the whole Newcomb/Prisoner controversy boils down to. So I think it’s the solution that’s underspecified here, not the problem itself. We need a theory that takes the physical structure of the scenario as input, and generates counterfactual consequences (of hypothetical decisions) as outputs.

PS: To make P and P1 fully comparable, drop the “E*1e9” terms in P, so that both programs model the conventional transparent-boxes problem without an extraneous pi-preference payout.

This conversation is a bit confused. Looking back, P and P1 aren’t the same at all; P1 corresponds to the case where Omega never asks you for any decision at all! If S must be equal to S1 and S1 is part of the world program, then S must be part of the world program, too, not chosen by the player. If choosing an S such that S!=S1 is allowed, then it corresponds to the case where Omega simulates someone else (not specified).

The root of the confusion seems to be that Wei Dai wrote “def P(i): …”, when he should have written “def P(S): …”, since S is what the player gets to control. I’m not sure where making i a parameter to P came from, since the English description of the problem had i as part of the world-program, not a parameter to it.

TDT is Timeless Decision Theory. It wouldn’t be bad to say that in the first paragraph somewhere.

EDIT: Excellent. Thanks.

Done.

Can you fix the font size issue too?

Hm, sorry, it’s displaying for me in the same size as the rest of the site, so I’m not sure what you’re seeing. I’ll strip the formatting and see if that helps.

For me, the text within “You treat your choice… probability distributions over D” and “If that’s what TDT… the specified digit is zero” shows up in 7.5-point font.

Better now?

That fixed it

Ugh. I removed the formatting, and now it displays for me with large vertical gaps between the paragraphs.

I suggest adding a link to this discussion to the TDT wiki entry.

So let’s say I’m confronted with this scenario, and I see $1M in the large box.

So let’s get the facts:

1) There is $1M in the large box and thus (D xor E)=true

2) I know that I am a one-boxing agent

3) Thus D=”one boxing”

4) Thus I know D/=E since the xor is true

5) I one-box and live happily with $1,000,000
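A trivial mechanical check of steps 1–4 (treating “one-boxing” as D = true):

```python
# Enumerate (D, E): keep pairs satisfying (D xor E) = true together with
# D = true ("I am a one-boxing agent").  Only E = false survives, i.e. D != E.
consistent = [(D, E) for D in (True, False) for E in (True, False)
              if (D != E) and D]
print(consistent)  # [(True, False)]
```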

When Omega simulates me with the same scenario and without lying, there is no problem.

Seems like much of the mindgames are hindered by simply precommitting to choices.

For the red-and-green just toss a coin (or whatever choice of randomness you have).

We could make an ad-hoc repair to TDT by saying that you’re not allowed to infer from a logical fact to another logical fact going via a physical (empirical) fact.

In this case, the mistake happened because we went from “My decision algorithm’s output” (Logical) to “Money in box” (Physical) to “Digits of Pi” (Logical), where the last step involved following an arrow on a causal graph backwards: The digits of Pi has a causal arrow going into the “money in box” node.

The TDT dependency inference could be implemented, for example, by first making all sufficiently simple logical inferences from “My decision algorithm’s output”, generating a limited set of logical nodes, and then tracking physical influences forward from there.

The key is that in the step where you infer logical consequences of the logical node for your decision algorithm, you should only be able to use mathematical proofs, not empirical evidence. Once you’ve done all you can with proofs (logical influence), then place all relevant derived logical facts in your causal graph, and use causal decision theory as usual.

This ad-hoc fix breaks as soon as Omega makes a slightly messier game, wherein you receive a physical clue as to a computation output, and this computation and your decision determine your reward.

Suppose that for any output of the computation there is a unique best decision, and that furthermore this set of (computation output, predicted decision) pairs is mapped to distinct physical clues. Then given the clue you can infer what decision to make and the logical computation, but this requires that you infer from a logical fact (the predictor of you) to the physical state to the clue to the logical fact of the computation.

Can you provide a concrete example? (because I think that a series of fix-example-fix … cases might get us to the right answer)

The game is to pick a box numbered from 0 to 2; there is a hidden logical computation E yielding another value 0 to 2. Omega has a perfect predictor D of you. You choose C.

The payout is 10^((E+C)mod 3), and there is a display showing the value of F = (E-D)mod 3.

If F = 0, then:

D = 0 implies E = 0 implies optimal play is C = 2; contradiction

D = 1 implies E = 1 implies optimal play is C = 1; no contradiction

D = 2 implies E = 2 implies optimal play is C = 0; contradiction

And similarly for F = 1, F = 2: play C = (F+1) mod 3 as the only stable solution (which nets you 100 per play)

If you’re not allowed to infer anything about E from F, then you’re faced with a random pick from winning 1, 10 or 100, and can’t do any better...
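The stability claim can be checked by brute force. A sketch under my reading of the game (the variable names and the consistency condition — D must equal what the strategy actually plays after seeing F — are my own assumptions):

```python
# Brute-force check: E is the hidden logical value (0..2), D is Omega's
# perfect prediction of your choice, the display shows F = (E - D) mod 3,
# and the payout is 10^((E + C) mod 3).

def payout(E, C):
    return 10 ** ((E + C) % 3)

def consistent_scenarios(strategy):
    # Yield (E, F) pairs where Omega's prediction D matches the choice
    # the strategy actually makes after seeing F.
    for E in range(3):
        for D in range(3):
            F = (E - D) % 3
            if strategy(F) == D:
                yield E, F

strategy = lambda F: (F + 1) % 3          # the claimed stable solution
results = [(E, F, payout(E, strategy(F)))
           for E, F in consistent_scenarios(strategy)]
print(results)
```

For each E there is exactly one consistent scenario, and each pays out 100, matching the claim above.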

I’m not sure this game is well defined. What value of F does the predictor D see? (That is, it’s predicting your choice after seeing what value of F?)

The same one that you’re currently seeing; for all values of E there is a value of F such that this is consistent, i.e. such that D has actually predicted you in the scenario you currently find yourself in.

The logical/physical distinction itself can be seen as ad-hoc: you can consider the whole set-up Q as a program that is known to you (R), because the rules of the game were explained, and also consider yourself (R) as a program known to Q. Then, Q can reason about R in interaction with various situations (that is, run, given R, but R is given as part of Q, so “given R” doesn’t say anything about Q), and R can do the same with Q (and with the R within that Q, etc.). Prisoner’s dilemma can also be represented in this way, even though nobody is pulling Omega in that case.

When R is considering “the past”, it in fact considers facts about Q, which is known to R, and so facts about the past can be treated as “logical facts”. Similarly, when these facts within Q reach R at present and interact with it, they are no more “physical facts” than anything else in this setting (these interactions with R “directed from the past” can be seen as what R predicts Q-within-R-within-Q-… to do with R-within-Q-within-R-...).

Does ADT solve this particular issue?

I’m trying to get a grip on what this post is about, but I don’t know enough of the literature about Newcomb’s Problem to be sure what is referred to here by “the transparent-boxes scenario”. Can someone who knows briefly recap the baseline scenario of which this is a version?


I have a question that is probably stupid and/or already discussed in the comments. But I don’t have time to read all the comments, so, if someone nonetheless would kindly explain why I’m confused, I would be grateful.

The OP writes

It seems to me that TDT should just bite the bullet here. By hypothesis, I really want E to output True. Let’s say that E represents the output True by flashing a green light. Note, the issue isn’t that I want the i-th digit of π to be 0. Rather, I’m just really, really keen on seeing that flashing green light.

So I pick both boxes to maximize my chance of seeing the green light flash. After all, if the light flashes, I leave the game overflowing with utility. But if I’d picked only one box, I would be guaranteeing that the light doesn’t flash. Why would I want to do that?

ETA: Gary Drescher gives an explanation here for why two-boxing is wrong. But I don’t understand his explanation.



Let:

M be ‘There is $1M in the big box’

When:

D(M) = true, D(!M) = true, E = true: Omega fails.

D(M) = true, D(!M) = true, E = false: Omega chooses M or !M. I get $1M or $0.

D(M) = true, D(!M) = false, E = true: Omega chooses M = false. I get $0.1M.

D(M) = true, D(!M) = false, E = false: Omega chooses M = true. I get $1M.

D(M) = false, D(!M) = false, E = true: Omega chooses either M or !M. I get either $1.1M or $0.1M depending on Omega’s whims.

D(M) = false, D(!M) = false, E = false: Omega has no option. I make Omega look like a fool.

So, depending on how ‘Omega is wrong’ is resolved I use either D(M) = M or D(M) = false.

If Omega is just infallible then when D(M) = false, !E just never happens and I get either $0.1M or $1.1M depending on Omega’s whims. Since I’m being a smart ass I probably get $0.1M. So I use D(M) = M and get expected payout of $0.91M.

If Omega resolves “I am wrong” to “I give maximum payout” then I choose D(M) = false and get $1.1M (or sometimes either $1.1 or $0.1).

If Omega resolves “I am wrong” to “I give minimum payout” then I once again get $0.1M when D(M) = false and E.

These are the conclusions of Wedrifid-Just-Works-It-Out Decision Theory. It should match TDT when TDT is formulated right (and I don’t make a mistake).

No, but it seems that way because I neglected in my OP to supply some key details of the transparent-boxes scenario. See my new edit at the end of the OP.

So, with those details, that resolves to “I get $0”. This makes D(M) = !M the unambiguous ‘correct’ decision function.

First thought: We can get out of this dilemma by noting that the output of C also causes the predictor to choose a suitable i, so that saying we cause the ith digit of pi to have a certain value is glossing over the fact that we actually caused the i[C]th digit of pi to have a certain value.

How’s that? Any i that is sufficiently large is suitable. It doesn’t depend on the output of C. It just needs to be beyond C’s ability to learn anything beyond the ignorance prior regarding the i-th digit of π.

I’ve finally figured out where my intuition on that was coming from (and I don’t think it saves TDT). Suppose for a moment you were omniscient except about the relative integrals Vk (1) over measures of the components of the wavefunction which had a predictor that chose an i such that pi[i] = k and would evolve into components with a you (2), where the predictor would present the boxes, question, etc. to you, but would not tell you its choice of i.

Here my ignorance prior on pi[x] for large values of x happens to be approximately equivalent to your ignorance prior over a certain ratio of integrals (relative “sum” of measures of relevant components). When you implement C = one-box, you choose that the relative sum of measures of you that gets $0, $1000, $1,000,000, and $1,001,000 is (3):

$0: 0

$1000: V0

$1000000: (1-V0)

$1001000: 0

whereas when you implement C = two-box, you get

$0: 0

$1000: (1-V0)

$1000000: 0

$1001000: V0

If your preferences over wavefunctions happen to include a convenient part that tries to maximize the expected integral of dollars you[k] gets times measure of you[k], you probably one-box here, just like me. And now for you it’s much more like you’re choosing to have the predictor pick a sweet i 9⁄10 of the time.

(1) by relative integral I mean instead of Wk, you know Vk = Wk/(W0+W1+...+W9)

(2) something is a you when it has the same preferences over solutions to the wavefunction as you and implements the same decision theory as you, whatever precisely that means

(3) this bit only works because the measure we’re using, the square of the modulus of the amplitude, is preserved under time-evolution

Some related questions and possible answers below.

I wonder if that sort of transform is in general useful? Changing your logical uncertainty into an equivalent uncertainty about measure. For the calculator problem you’d say you knew exactly the answer to all multiplication problems, you just didn’t know what the calculators had been programmed to calculate. So when you saw the answer 56,088 on your Mars calculator, you’d immediately know that your Venus calculator was flashing 56,088 as well (barring asteroids, etc). This information does not travel faster than light—if someone typed 123x456 on your Mars calculator while someone else typed 123x456 on your Venus calculator, you would not know that they were both flashing 56,088 - you’d have to wait until you learned that they both typed the same input. Or if you told someone to think of an input, then tell someone else who would go to Venus and type it in there, you’d still have to wait for them to get to Venus (which they can do at light speed, why not).

How about whether P=NP, then? No matter what, once you saw 56,088 on Mars you’d know the correct answer to “what’s on the Venus calculator?” But before you saw it, your estimate of the probability “56,088 is on the Venus calculator” would depend on how you transformed the problem. Maybe you knew they’d type 123x45?, so your probability was 1/10. Maybe you knew they’d type 123x???, so your probability was 1/1000. Maybe you had no idea, so you had a sort of a complete ignorance prior.

I think this transform comes down to choosing appropriate reference classes for your logical uncertainty.

Why would you or I have such a preference that cares about my ancestor’s time-evolved descendants rather than just my time-evolved descendants? My guess is that:

1) a human’s preferences are (fairly) stable under time-evolution, and

2) the only humans that survive are the ones that care about their descendants, and

3) humans that we see around us are the time-evolution of similar humans.

So e.g. I[now] care approximately about what I[5-minutes-ago] cared about, and I[5-minutes-ago] didn’t just care about me[now], he also cared about me[now-but-in-a-parallel-branch].

In the setup in question, D goes into an infinite loop (since in the general case it must call a copy of C, but because the box is transparent, C takes as input the output of D).

In Eliezer’s similar red/green problem, if the simulation is fully deterministic and the initial conditions are the same, then the simulator must be lying, because he must’ve told the same thing to the first instance, at a time when there had been no previous copy. (If those conditions do not hold, then the solution is to just flip a coin and take your 50-50 chance.)

Are these still problems when you change them to fix the inconsistencies?

No, because by stipulation here, D only simulates the hypothetical case in which the box contains $1M, which does not necessarily correspond to the output of D (see my earlier reply to JGWeissman: http://lesswrong.com/lw/1qo/a_problem_with_timeless_decision_theory_tdt/1kpk).