# Gary_Drescher comments on A problem with Timeless Decision Theory (TDT)

• That’s very el­e­gant! But the trick here, it seems to me, lies in the rules for set­ting up the world pro­gram in the first place.

First, the world-program’s calling tree should match the structure of TDT’s graph, or at least match the graph’s (physically-)causal links. The physically-causal part of the structure tends to be uncontroversial, so (for present purposes) I’m ok with just stipulating the physical structure for a given problem.

But then there’s the choice to use the same variable S in multiple places in the code. That corresponds to a choice (in TDT) to splice in a logical-dependency link from the Platonic decision-computation node to other Platonic nodes. In both theories, we need to be precise about the criteria for this dependency. Otherwise, the sense of dependency you’re invoking might turn out to be wrong (it makes the theory prescribe incorrect decisions) or question-begging (it implicitly presupposes an answer to the key question that the theory itself is supposed to figure out for us, namely what things are or are not counterfactual consequences of the decision-computation).

So the question, in UDT1, is: under what circumstances do you represent two real-world computations as being tied together via the same variable in a world-program?

That’s perhaps straightforward if S is implemented by literally the same physical state in multiple places. But as you acknowledge, you might instead have distinct Si’s that diverge from one another for some inputs (though not for the actual input in this case). And the different instances need not have the same physical substrate, or even use the same algorithm, as long as they give the same answers when the relevant inputs are the same, for some mapping between the inputs and between the outputs of the two Si’s. So there’s quite a bit of latitude as to whether to construe two computations as “logically equivalent”.
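The “same answers under some mapping” criterion can be made concrete with a toy check. This is my own illustration (the function name and example algorithms are hypothetical), and a finite spot-check is of course much weaker than the notion of logical equivalence actually needed:

```python
def agree_up_to_mapping(f, g, inputs, in_map=lambda x: x, out_map=lambda y: y):
    """Spot-check that f and g give the same answers on the relevant
    inputs, for some stated mapping between inputs and outputs."""
    return all(f(x) == out_map(g(in_map(x))) for x in inputs)

# Two distinct algorithms (different internal steps, same answers):
double_by_mult = lambda n: n * 2
double_by_add = lambda n: n + n

print(agree_up_to_mapping(double_by_mult, double_by_add, range(100)))  # True
```

The latitude in the quoted paragraph shows up here as the freedom to choose `inputs`, `in_map`, and `out_map`: generous enough choices can make many pairs of computations count as “the same”.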

So, for example, for the conventional transparent-boxes problem, what principle tells us to formulate the world program as you proposed, rather than having:

```
def P1(i):
    const S1;
    E = (Pi(i) == 0)
    D = Omega_Predict(S1, i, "box contains $1M")
    if D ^ E:
        C = S(i, "box contains $1M")
        payout = 1001000 - C * 1000
    else:
        C = S(i, "box is empty")
        payout = 1000 - C * 1000
```

(along with a similar program P2 that uses constant S2, yielding a different output from Omega_Predict)?

This alternative formulation ends up telling us to two-box. In this formulation, if S and S1 (or S and S2) are in fact the same, they would (counterfactually) differ if a different answer (than the actual one) were output from S, which is precisely what a causalist asserts. (A similar issue arises when deciding what facts to model as “inputs” to S, thus forbidding S to “know” those facts for purposes of figuring out the counterfactual dependencies, and what facts to build instead into the structure of the world-program, or to just leave as implicit background knowledge.)
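The divergence between the two formulations can be sketched numerically. This is my own simplified toy model, not the actual UDT1 machinery: I drop the pi-digit/XOR device and just compare a world where Omega’s prediction D is logically tied to the candidate output of S against a P1-style world where the prediction is a frozen constant S1 (here C = 1 means one-boxing):

```python
def payout(D, C):
    """D: Omega predicted one-boxing (big box filled). C: agent one-boxes."""
    if D:
        return 1001000 - C * 1000   # $1M box full; two-boxing adds $1000
    return 1000 - C * 1000          # $1M box empty; only the $1000 box matters

def best_same_S():
    # Same variable S everywhere: the prediction D co-varies with the
    # candidate output c being considered.
    return max((0, 1), key=lambda c: payout(c, c))

def best_constant_S1(s1):
    # P1-style: the prediction is the frozen constant S1, so varying c
    # counterfactually leaves D fixed -- the causalist's reading.
    return max((0, 1), key=lambda c: payout(s1, c))

print(best_same_S())        # 1: one-box
print(best_constant_S1(1))  # 0: two-box
print(best_constant_S1(0))  # 0: two-box
```

Under the tied formulation the best candidate is one-boxing ($1,000,000 vs. $1,000); under the frozen-constant formulation two-boxing dominates whatever S1 is, which is exactly the two-boxing prescription described above.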

So my concern is that UDT1 may covertly beg the question by selecting, among the possible formulations of the world-program, a version that turns out to presuppose an answer to the very question that UDT1 is intended to figure out for us (namely, what counterfactually depends on the decision-computation). And although I agree that the formulation you’ve selected in this example is correct and the above alternative formulation isn’t, I think it remains to explain why.

(As with my comments about TDT, my remarks about UDT1 are under the blanket caveat that my grasp of the intended content of the theories is still tentative, so my criticisms may just reflect a misunderstanding on my part.)

• First, to clear up a possible confusion, the S in my P is not supposed to be a variable. It’s a constant, more specifically a piece of code that implements UDT1 itself. (If I sometimes talk about it as if it’s a variable, that’s because I’m trying to informally describe what is going on inside the computation that UDT1 does.)

For the more general question of how we know the structure of the world program, the idea is that for an actual AI, we would program it to care about all possible world programs (or more generally, mathematical structures; see example 3 in my UDT1 post, but also Nesov’s recent post for a critique). The implementation of UDT1 in the AI would then figure out which world programs it’s in by looking at its inputs (which would contain all of the AI’s memories and sensory data) and checking which world programs call it with those inputs.
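The “check which world programs call it with those inputs” step might be sketched as follows. This is entirely my own toy illustration, with hypothetical names: a real implementation would reason about the world programs mathematically rather than run them with a probe.

```python
def worlds_containing_me(world_programs, my_inputs):
    """Return the world programs that invoke the agent with my_inputs."""
    containing = []
    for world in world_programs:
        seen = []
        # Probe agent: record every set of inputs it is called with,
        # and output a dummy 0 so the world program can keep running.
        world(lambda *inputs: seen.append(inputs) or 0)
        if tuple(my_inputs) in seen:
            containing.append(world)
    return containing

# Two toy worlds that call the agent with different memories/explanations:
def world_a(S):
    return S("memories_A", "omega_says_transparent_boxes")

def world_b(S):
    return S("memories_B", "omega_says_something_else")

found = worlds_containing_me(
    [world_a, world_b],
    ("memories_A", "omega_says_transparent_boxes"))
print([w.__name__ for w in found])  # ['world_a']
```

This also shows why the extra inputs (memories, Omega’s explanations) matter: without them, many more world programs would call the agent with indistinguishable arguments.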

For these sample problems, the assumption is that somehow Omega has previously provided us with enough evidence for us to trust its word on what the structure of the current problem is. So in the actual P, ‘S(i, “box contains $1M”)’ is really something like ‘S(memories, omegas_explanations_about_this_problem, i, “box contains $1M”)’, and these additional inputs allow S to conclude that it’s being invoked inside this P, and not some other world program.

(An additional subtlety here is that if we consider all possible world programs, there are bound to be some other world programs where S is being called with these exact same inputs, for example ones where S is being instantiated inside a Boltzmann brain, but presumably those worlds/regions have very low weights, meaning that the AI doesn’t care much about them.)

Let me know if that answers your questions/concerns. I didn’t answer you point by point because I’m not sure which questions/concerns remain after you see my general answers. Feel free to repeat anything you still want me to answer.

• > First, to clear up a possible confusion, the S in my P is not supposed to be a variable. It’s a constant, more specifically a piece of code that implements UDT1 itself. (If I sometimes talk about it as if it’s a variable, that’s because I’m trying to informally describe what is going on inside the computation that UDT1 does.)

Then it should be S(P), because S can’t make any decisions without getting to read the problem description.

• Note that since our agent is considering possible world-programs, these world-programs are in some sense already part of the agent’s program (and the agent is in turn part of some of these world-programs-inside-the-agent, which reflects the recursive character of the definition of the agent-program). The agent is a much better top-level program to consider than all-possible-world-programs, which is even more of a simplification if these world-programs somehow “exist at the same time”. When the (prior) definition of the world is seen as already part of the agent, a lot of the ontological confusion goes away.