Comparison of decision theories (with a focus on logical-counterfactual decision theories)

Introduction

Summary

This post is a comparison of various existing decision theories, with a focus on decision theories that use logical counterfactuals (a.k.a. the kind of decision theories most discussed on LessWrong). The post compares the decision theories along outermost iteration (action vs policy vs algorithm), updatelessness (updateless or updateful), and type of counterfactual used (causal, conditional, logical). It then explains the decision theories in more detail, in particular giving an expected utility formula for each. The post then gives examples of specific existing decision problems where the decision theories give different answers.

Value-added

There are some other comparisons of decision theories (see the “Other comparisons” section), but they either (1) don’t focus on logical-counterfactual decision theories; or (2) are outdated (written before the new functional/logical decision theory terminology came about).

To give a more personal motivation: after reading through a bunch of papers and posts about these decision theories, and feeling like I understood the basic ideas, I remained highly confused about basic things like “How is UDT different from FDT?”, “Why was TDT deprecated?”, and “If TDT performs worse than FDT, then what’s one decision problem where they give different outputs?” This post hopes to clarify these and other questions.

None of the decision theory material in this post is novel. I am still learning the basics myself, and I would appreciate any corrections (even about subtle/nitpicky stuff).

Audience

This post is intended for people who are similarly confused about the differences between TDT, UDT, FDT, and LDT. In terms of reader background assumed, it would be good to know the statements of some standard decision theory problems (Newcomb’s problem, smoking lesion, Parfit’s hitchhiker, transparent box Newcomb’s problem, counterfactual mugging (a.k.a. curious benefactor; see page 56, footnote 89)) and the “correct” answers to them, and to have enough background in math to understand the expected utility formulas.

If you don’t have the background, I would recommend reading chapters 5 and 6 of Gary Drescher’s Good and Real (which explain well the idea of subjunctive means–end relations), the FDT paper (which explains well how FDT’s action selection variant works, and how FDT differs from CDT and EDT), “Cheating Death in Damascus”, and “Toward Idealized Decision Theory” (which explains well the difference between policy selection and logical counterfactuals), and understanding what Wei Dai calls “decision theoretic thinking” (see comments: 1, 2, 3). I think a lot of (especially old) content on decision theory is confusingly written or unfriendly to beginners, and would recommend skipping around to find explanations that “click”.

Comparison dimensions

My main motivation is to try to distinguish between TDT, UDT, and FDT, so I focus on three dimensions for comparison that I think best display the differences between these decision theories.

Outermost iteration

All of the decision theories in this post iterate through some set of “options” (intentionally vague) at the outermost layer of execution to find the best “option”. However, the nature (type) of these “options” differs among the various theories. Most decision theories iterate through either actions or policies. When a decision theory iterates through actions (to find the best action), it is doing “action selection”, and the decision theory outputs a single action. When a decision theory iterates through policies (to find the best policy), it is doing “policy selection”, and outputs a single policy, which is an observation-to-action mapping. To get an action out of a decision theory that does policy selection (because what we really care about is knowing which action to take), we must call the policy on the actual observation.

Using the notation of the FDT paper, an action has type $\mathcal{A}$ while a policy has type $\mathcal{O} \to \mathcal{A}$, where $\mathcal{O}$ is the set of observations and $\mathcal{A}$ is the set of actions. So given a policy $\pi$ and observation $o$, we get the action by calling $\pi$ on $o$, i.e. $\pi(o)$.

From the expected utility formula of the decision theory, you can tell action vs policy selection by seeing what variable comes beneath the $\operatorname{argmax}$ operator (the $\operatorname{argmax}$ operator is what does the outermost iteration); if it is $a \in \mathcal{A}$ (or similar) then it is iterating over actions, and if it is $\pi \in \Pi$ (or similar), then it is iterating over policies.
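To make the two types concrete, here is a minimal Python sketch (mine, not from any of the sources); the utility numbers are made-up placeholders, and `eu_action`/`eu_policy` stand in for whichever expected utility formula a given decision theory uses:

```python
from itertools import product

ACTIONS = ["one-box", "two-box"]
OBSERVATIONS = ["full", "empty"]

# Toy, made-up utilities, just so the sketch runs.
def eu_action(action):
    return {"one-box": 1_000_000, "two-box": 1_000}[action]

def eu_policy(policy):
    # A policy is an observation -> action dict; score it with the toy EUs.
    return sum(eu_action(policy[obs]) for obs in OBSERVATIONS)

# Action selection: iterate over actions, output a single action.
best_action = max(ACTIONS, key=eu_action)

# Policy selection: iterate over all observation-to-action mappings,
# output a policy, then call it on the actual observation to get an action.
all_policies = [dict(zip(OBSERVATIONS, choice))
                for choice in product(ACTIONS, repeat=len(OBSERVATIONS))]
best_policy = max(all_policies, key=eu_policy)
action = best_policy["full"]  # pi(o): apply the policy to the observation
```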

One exception to the above is UDT2, which seems to iterate over algorithms.

Updatelessness

In some decision problems, the agent makes an observation, and has the choice of updating on this observation before acting. Two examples of this are: in counterfactual mugging (a.k.a. curious benefactor), where the agent makes the observation that the coin has come up tails; and in the transparent box Newcomb’s problem, where the agent sees whether the big box is full or empty.

If the decision algorithm updates on the observation, it is updateful (a.k.a. “not updateless”). If it doesn’t update on the observation, it is updateless.

This idea is similar to Rawls’s “veil of ignorance”: you must pick your moral principles, societal policies, etc., before you find out who you are in the world, or as if you don’t know who you are in the world.

How can you tell if a decision theory is updateless? In its expected utility formula, if it conditions on the observation, it is updateful. In this case the probability factor looks like $P(\cdot \mid o)$, where $o$ is the observation (sometimes the observation is called “sense data” and is denoted by $s$). If a decision theory is updateless, the conditioning on “$o$” is absent. Updatelessness only makes a difference in decision problems that have observations.
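As a worked example with the usual counterfactual mugging stakes (pay $100 on tails; on heads, Omega pays $10,000 iff it predicts you would pay on tails): an updateless agent evaluates the policy “pay” from before the coin flip, computing $0.5 \cdot 10{,}000 + 0.5 \cdot (-100) = 4{,}950 > 0$, and pays; an updateful agent conditions on the observed tails, computes $-100 < 0$ for paying, and refuses.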

There seem to be different meanings of “updateless” in use. In this post I will use the above meaning. (I will try to post a question on LessWrong soon about these different meanings.)

Type of counterfactual

In the course of reasoning about a decision problem, the agent can construct counterfactuals or hypotheticals like “if I do this, then that happens”. There are several different kinds of counterfactuals, and decision theories are divided among them.

The three types of counterfactuals that will concern us are: causal, conditional/evidential, and logical/subjunctive. The distinctions between these are explained clearly in the FDT paper, so I recommend reading that (I won’t explain them here).

In the expected utility formula, if the probability factor looks like $P(o_j \mid a)$ then it is evidential, and if it looks like $P(o_j \mid \operatorname{do}(a))$ then it is causal. I have seen the logical counterfactual written in many ways (a sketch contrasting the three types follows the list):

  • $\operatorname{true}(\ulcorner \underline{\mathrm{FDT}}(\underline{P}, \underline{x}) = a \urcorner)$, e.g. in the FDT paper, p. 14

  • $\operatorname{do}(\mathrm{FDT}(P, x) = a)$, e.g. in the FDT paper, p. 14

  • corner quotes with a boxed arrow, as in $P(\ulcorner \mathrm{A}(s) = a \urcorner \boxright o)$, e.g. in Hintze, p. 4

  • a $\operatorname{do}(\cdot)$ operator applied to the output of the agent’s algorithm, e.g. on Arbital
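Here is the promised sketch of how the three probability factors come apart, in illustrative Python (mine); the Newcomb setup and the 99% accuracy figure are assumptions chosen for the example:

```python
# Newcomb's problem with a 99%-accurate predictor. The outcome of interest
# is whether the opaque box is full; each function is one probability factor.
ACCURACY = 0.99

def p_full_evidential(action):
    # P(o | a): plain conditioning; the action is evidence about the
    # prediction, so one-boxers expect a full box.
    return ACCURACY if action == "one-box" else 1 - ACCURACY

def p_full_causal(action, prior_p_full=0.5):
    # P(o | do(a)): surgically setting the action leaves the already-made
    # prediction untouched, so the box keeps its prior probability.
    return prior_p_full

def p_full_logical(action):
    # P(o | "my algorithm outputs a"): intervening on the algorithm's output
    # also moves the predictor's model of that same algorithm, so on this
    # problem it matches the evidential factor (the two come apart on
    # problems like the smoking lesion).
    return p_full_evidential(action)
```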

Other dimensions that I ignore

There are many more dimensions along which decision theories differ, but I don’t understand these and they seem less relevant for comparing among the main logical-counterfactual decision theories, so I will just list them here but won’t go into them much later on in the post:

  • Reflective consistency (in particular dynamic consistency): I think this is about whether an agent would use precommitment mechanisms or self-modify to use a different decision theory. Can this be seen immediately from the expected utility formula? If not, it might be unlike the other three above. My current guess is that reflective consistency is a higher-level property that follows from the above three.

  • Emphasis on graphical models: FDT is formalized using graphical models (of the kind you can read about in Judea Pearl’s book Causality) while UDT isn’t.

  • Recent developments like using logical inductors.

  • Uncertainty about where your decision algorithm is: I think this is some combination of the three that I’m already covering. For previous discussions, see this section of Andrew Critch’s post, this comment by Wei Dai, and this post by Vladimir Slepnev.

  • Different versions of UDT (e.g. proof-based, modal).

Comparison table along the given dimensions

Given the comparison dimensions above, the decision theories can be summarized as follows:

| Decision theory | Outermost iteration | Updateless | Type of counterfactual |
| --- | --- | --- | --- |
| Updateless decision theory 1 (UDT1) | action | yes | logical |
| Updateless decision theory 1.1 (UDT1.1) | policy | yes | logical |
| Updateless decision theory 2 (UDT2) | algorithm | yes | logical |
| Functional decision theory, iterating over actions (FDT-action) | action | yes | logical |
| Functional decision theory, iterating over policies (FDT-policy) | policy | yes | logical |
| Logical decision theory (LDT) | unspecified | unspecified | logical |
| Timeless decision theory (TDT) | action | no | logical |
| Causal decision theory (CDT) | action | no | causal |
| Evidential decision theory (EDT, “naive EDT”) | action | no | conditional |

The general “shape” of the expected utility formulas will be:

$$\underset{\text{option}}{\operatorname{argmax}} \sum_{j} U(o_j) \cdot P(o_j \mid \text{counterfactual involving the option})$$

Or sometimes, for policy selection, the same expression wrapped so that the chosen policy is applied to the actual observation:

$$\left(\underset{\text{option}}{\operatorname{argmax}} \sum_{j} U(o_j) \cdot P(o_j \mid \text{counterfactual involving the option})\right)(\text{observation})$$

Here “option” ranges over actions, policies, or algorithms, and the conditioning bar stands in for whichever type of counterfactual the theory uses.
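This shape can be rendered generically in Python (a sketch of mine, not from the post); each decision theory plugs in its own option set and its own probability factor:

```python
def best_option(options, outcomes, prob, utility):
    """argmax over options of sum_j prob(o_j, option) * utility(o_j).

    `options` may be actions, policies, or algorithms, and `prob` hides
    the theory's choice of counterfactual (conditional, do(), or logical).
    """
    def expected_utility(option):
        return sum(prob(outcome, option) * utility(outcome)
                   for outcome in outcomes)
    return max(options, key=expected_utility)
```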

Explanations of each decision theory

This section elaborates on the comparison above by giving an expected utility formula for each decision theory and explaining why each cell in the table takes that particular value. I won’t define the notation very clearly, since I am mostly collecting the various notations that have been used (so that you can look at the linked sources for the details). My goals are to explain how to fill in the table above and to show how all the existing variants in notation are saying the same thing.

UDT1 and FDT (iterate over actions)

I will describe UDT1 and FDT’s action variant together, because I think they give the same decisions (if there’s a decision problem where they differ, I would like to know about it). The main differences between the two seem to be:

  1. The way they are formalized, where FDT uses graphical models and UDT1 uses some kind of non-graphical “mathematical intuition module”.

  2. The naming, where UDT1 emphasizes the “updateless” aspect and FDT emphasizes the logical counterfactual aspect.

  3. Some additional assumptions that UDT has that FDT doesn’t. Rob Bensinger says “accepting FDT doesn’t necessarily require a commitment to some of the philosophical ideas associated with updatelessness and logical prior probability that MIRI, Wei Dai, or other FDT proponents happen to accept” and also says UDT “built in some debatable assumptions (over and above what’s needed to show why TDT, CDT, and EDT don’t work)”. I’m not sure what these additional assumptions are, but my guess is it has to do with viewing the world as a program, Tegmark’s level IV multiverse, and things like that (I would be interested in hearing more about the exact assumptions).

In the original UDT post, the expected utility formula is written like this:

$$Y^* := \underset{Y}{\operatorname{argmax}} \sum_{\langle E_1, \ldots, E_n \rangle} P\big(\langle E_1, \ldots, E_n \rangle \mid \ulcorner S(X) = Y \urcorner\big) \cdot U(\langle E_1, \ldots, E_n \rangle)$$

Here $Y$ is an “output string” (which is basically an action). The sum is taken over all possible vectors $\langle E_1, \ldots, E_n \rangle$ of the execution histories. I prefer Tyrrell McAllister’s notation:

$$\mathrm{UDT1}(x) := \underset{y \in \mathcal{Y}}{\operatorname{argmax}} \sum_{\omega \in \Omega} U(\omega) \cdot P\big(\omega \mid \ulcorner S(x) = y \urcorner\big)$$

To explain the UDT1 row in the comparison table, note that:

  • The outermost iteration is $\operatorname{argmax}_{Y}$ (over output strings, a.k.a. actions), so it is doing action selection.

  • We don’t update on the observation. This isn’t really clear from the notation, since the formula still depends on the input string $X$. However, the original post clarifies this, saying “Bayesian updating is not done explicitly in this decision theory”.

  • The counterfactual is logical, because the probabilities $P\big(\langle E_1, \ldots, E_n \rangle \mid \ulcorner S(X) = Y \urcorner\big)$ are computed using the “mathematical intuition module”.

In the FDT paper (p. 14), the action selection variant of FDT is written as follows:

$$\mathrm{FDT}(P, x) := \underset{a \in \mathcal{A}}{\operatorname{argmax}} \sum_{j} U(o_j) \cdot P\big(o_j \mid \operatorname{true}(\ulcorner \underline{\mathrm{FDT}}(\underline{P}, \underline{x}) = a \urcorner)\big)$$

Again, note that we are doing action selection (“$\operatorname{argmax}_{a \in \mathcal{A}}$”), using logical counterfactuals (“$\operatorname{true}(\ulcorner \cdot \urcorner)$”), and being updateless (absence of “$\mid x$”).
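Running this action-selection formula on Newcomb’s problem with a perfect predictor (the payoffs are the standard ones; the code is my sketch, not from the paper):

```python
# Payoffs: $1M in the opaque box iff the predictor foresaw one-boxing;
# the transparent box always holds $1,000.
U = {("one-box", "full"): 1_000_000, ("one-box", "empty"): 0,
     ("two-box", "full"): 1_001_000, ("two-box", "empty"): 1_000}

def eu_fdt_action(action):
    # Under the logical counterfactual "my algorithm outputs `action`",
    # the prediction (and hence the box) covaries with the action, since
    # both depend on the same algorithm; the predictor is assumed perfect.
    box = "full" if action == "one-box" else "empty"
    return U[(action, box)]

best = max(["one-box", "two-box"], key=eu_fdt_action)
assert best == "one-box"  # EU $1,000,000 beats EU $1,000
```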

UDT1.1 and FDT (iterate over policies)

UDT1.1 is a decision theory introduced by Wei Dai’s post “Explicit Optimization of Global Strategy (Fixing a Bug in UDT1)”.

In Hintze (p. 4, 12) UDT1.1 is written as follows:

$$\underset{f \in \mathcal{A}^{\mathcal{S}}}{\operatorname{argmax}} \sum_{o \in \mathcal{O}} U(o) \cdot P\big(\ulcorner \mathrm{UDT} = f \urcorner \boxright o\big)$$

Here $f$ iterates over functions that map sense data ($s \in \mathcal{S}$) to actions ($a \in \mathcal{A}$), $U$ is the utility function, and $o \in \mathcal{O}$ are outcomes.

Using Tyrrell McAllister’s notation, UDT1.1 looks like:

$$\mathrm{UDT1.1}(x) := \left(\underset{f \in \mathcal{Y}^{\mathcal{X}}}{\operatorname{argmax}} \sum_{\omega \in \Omega} U(\omega) \cdot P\big(\omega \mid \ulcorner S = f \urcorner\big)\right)(x)$$

Using notation from the FDT paper plus a trick I saw on this Arbital page, we can write the policy selection variant of FDT as:

$$\mathrm{FDT}(P, x) := \left(\underset{\pi \in \Pi}{\operatorname{argmax}} \sum_{j} U(o_j) \cdot P\big(o_j \mid \operatorname{true}(\ulcorner \underline{\mathrm{FDT}} = \pi \urcorner)\big)\right)(x)$$

On the right-hand side, the large expression (the part inside and including the $\operatorname{argmax}$) returns a policy, so to get the action we call the policy on the observation $x$.

The important things to note are that UDT1.1 and the policy selection variant of FDT (a toy sketch follows this list):

  • Do policy selection, because the outermost iteration is over policies (“$\operatorname{argmax}_{\pi \in \Pi}$” or “$\operatorname{argmax}_{f}$” depending on the notation). Quotes about policy selection: The FDT paper (p. 11, footnote 7) says “In the authors’ preferred formalization of FDT, agents actually iterate over policies (mappings from observations to actions) rather than actions. This makes a difference in certain multi-agent dilemmas, but will not make a difference in this paper.” See also comments by Vladimir Slepnev (1, 2).

  • Use logical counterfactuals (denoted by the corner quotes and boxed arrow $\ulcorner \cdot \urcorner \boxright$, the mathematical intuition module, or the $\operatorname{true}(\cdot)$ operator).

  • Are updateless, because they don’t condition on the observation (note the absence of conditioning of the form “$\mid x$” or “$\mid s$”).
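Here is the promised sketch (mine) of the coordination problem from the UDT1.1 post, assuming, as a stand-in for the payoffs in that post, that each copy gets $10 when the two outputs differ and $0 otherwise:

```python
from itertools import product

INPUTS, OUTPUTS = (1, 2), ("A", "B")

def payoff(out1, out2):
    # Assumed payoff: each copy gets $10 iff the copies output different letters.
    return 10 if out1 != out2 else 0

# UDT1.1 / FDT-policy: argmax over whole input -> output mappings, so the
# copies can coordinate on outputting different letters.
policies = [dict(zip(INPUTS, outs)) for outs in product(OUTPUTS, repeat=2)]
best_policy = max(policies, key=lambda f: payoff(f[1], f[2]))
assert payoff(best_policy[1], best_policy[2]) == 10  # e.g. {1: "A", 2: "B"}

# UDT1 / FDT-action: a crude model of the failure mode. Each copy argmaxes
# over its own action only; the input cannot break the symmetry, so both
# copies run the identical computation, output the same letter, and miss
# the $10.
def udt1_choice(my_input):
    def eu(action):
        twin_action = action  # same algorithm, symmetric reasoning
        return payoff(action, twin_action)
    return max(OUTPUTS, key=eu)

assert udt1_choice(1) == udt1_choice(2) == "A"
```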

TDT

My understanding of TDT is mainly from Hintze. I am aware of the TDT paper and skimmed it a while back, but did not revisit it in the course of writing this post.

Using notation from Hintze (p. 4, 11), the expected utility formula for TDT can be written as follows:

$$\underset{a \in \mathcal{A}}{\operatorname{argmax}} \sum_{o \in \mathcal{O}} U(o) \cdot P\big(\ulcorner \mathrm{TDT}(s) = a \urcorner \boxright o \mid s\big)$$

Here, $s$ is a string of sense data (a.k.a. observation), $\mathcal{A}$ is the set of actions, $U$ is the utility function, $o \in \mathcal{O}$ are outcomes, and the corner quotes and boxed arrow denote a logical counterfactual (“if the TDT algorithm were to output $a$ given input $s$”).

If I were to rewrite the above using notation from the FDT paper, it would look like:

$$\mathrm{TDT}(P, x) := \underset{a \in \mathcal{A}}{\operatorname{argmax}} \sum_{j} U(o_j) \cdot P\big(o_j \mid x, \operatorname{true}(\ulcorner \underline{\mathrm{TDT}}(\underline{P}, \underline{x}) = a \urcorner)\big)$$

The things to note are:

  • The outermost iteration is over actions (“$\operatorname{argmax}_{a \in \mathcal{A}}$”), so TDT does action selection.

  • We condition on the sense data or observation $s$ (or $x$), so TDT is updateful. Quotes about TDT’s updatefulness: this post describes TDT as “a theory by MIRI senior researcher Eliezer Yudkowsky that made the mistake of conditioning on observations”. The Updateless decision theories page on Arbital calls TDT “updateful”. Hintze (p. 11): “TDT’s failure on the Curious Benefactor is straightforward. Upon seeing the coinflip has come up tails, it updates on the sensory data and realizes that it is in the causal branch where there is no possibility of getting a million.” (A worked version of this calculation follows the list.)

  • We use corner quotes and the boxed arrow, or the $\operatorname{true}(\cdot)$ operator, to denote a logical counterfactual.
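In numbers (my sketch, with the usual $100/$10,000 counterfactual mugging stakes):

```python
P_HEADS = 0.5  # fair coin

def eu_updateless(policy_pays):
    # Evaluate the policy from before the coin flip: on heads, Omega pays
    # $10,000 iff it predicts the agent pays on tails; on tails, pay $100.
    heads = 10_000 if policy_pays else 0
    tails = -100 if policy_pays else 0
    return P_HEADS * heads + (1 - P_HEADS) * tails

def eu_tdt_after_tails(pays):
    # TDT conditions on the observation "tails", so the heads branch gets
    # probability 0 and paying is a pure $100 loss.
    return -100 if pays else 0

assert eu_updateless(True) > eu_updateless(False)            # UDT pays: $4,950 > $0
assert eu_tdt_after_tails(True) < eu_tdt_after_tails(False)  # TDT refuses
```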

UDT2

I know very little about UDT2, but based on this comment by Wei Dai and this post by Vladimir Slepnev, it seems to iterate over algorithms rather than actions or policies, and I am assuming it didn’t abandon updatelessness and logical counterfactuals.

The following search queries might have more information:

LDT

LDT (logical decision theory) seems to be an umbrella decision theory that only requires the use of logical counterfactuals, leaving the iteration type and updatelessness unspecified. So my understanding is that UDT1, UDT1.1, UDT2, FDT, and TDT are all logical decision theories. See this Arbital page, which says:

“Logical decision theories” are really a family of recently proposed decision theories, none of which stands out as being clearly ahead of the others in all regards, but which are allegedly all better than causal decision theory.

The page also calls TDT a logical decision theory (listed under “non-general but useful logical decision theories”).

CDT

Using notation from the FDT paper (p. 13), we can write the expected utility formula for CDT as follows:

$$\mathrm{CDT}(P, x) := \underset{a \in \mathcal{A}}{\operatorname{argmax}} \sum_{j} U(o_j) \cdot P\big(o_j \mid x, \operatorname{do}(a)\big)$$

Things to note:

  • The outermost iteration is “$\operatorname{argmax}_{a \in \mathcal{A}}$”, so CDT does action selection.

  • We condition on $x$, so CDT is updateful.

  • The presence of $\operatorname{do}(a)$ means we use causal counterfactuals.

EDT

Using notation from the FDT paper (p. 12), we can write the expected utility formula for EDT as follows:

$$\mathrm{EDT}(P, x) := \underset{a \in \mathcal{A}}{\operatorname{argmax}} \sum_{j} U(o_j) \cdot P\big(o_j \mid x, a\big)$$

Things to note:

  • The outermost iteration is “$\operatorname{argmax}_{a \in \mathcal{A}}$”, so EDT does action selection.

  • We condition on $x$, so EDT is updateful.

  • We condition on the action $a$ (rather than intervening on it), so EDT uses conditional probability as its counterfactual.

There are various versions of EDT (e.g. versions that smoke on the smoking lesion problem). The EDT in this post is the “naive” version. I don’t understand the more sophisticated versions of EDT, but the keyword for learning more about them seems to be the tickle defense.
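To show the “naive” behavior concretely, here is a worked smoking lesion (my sketch; the probabilities and utilities are illustrative, not taken from any of the cited sources):

```python
# Smoking lesion: a lesion causes both a taste for smoking and cancer;
# smoking itself is harmless. Illustrative numbers.
P_LESION = 0.5
P_SMOKE_GIVEN_LESION, P_SMOKE_GIVEN_NONE = 0.9, 0.1
U_SMOKE, U_CANCER = 10, -100  # cancer occurs iff the lesion is present

def p_lesion_given(smokes):
    # Bayes: under conditioning, the agent's own choice is evidence
    # about the lesion.
    p_s_l = P_SMOKE_GIVEN_LESION if smokes else 1 - P_SMOKE_GIVEN_LESION
    p_s_n = P_SMOKE_GIVEN_NONE if smokes else 1 - P_SMOKE_GIVEN_NONE
    return p_s_l * P_LESION / (p_s_l * P_LESION + p_s_n * (1 - P_LESION))

def eu_edt(smokes):
    # Naive EDT: condition on the action.
    return (U_SMOKE if smokes else 0) + U_CANCER * p_lesion_given(smokes)

def eu_cdt(smokes):
    # CDT: do(smoke) cuts the causal link back to the lesion, which
    # therefore keeps its prior probability.
    return (U_SMOKE if smokes else 0) + U_CANCER * P_LESION

assert eu_edt(True) < eu_edt(False)  # naive EDT refrains (-80 vs -10)
assert eu_cdt(True) > eu_cdt(False)  # CDT smokes (-40 vs -50)
```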

Comparison on specific decision problems

If two decision theories are actually different, there should be some decision problem where they return different answers.

The FDT paper does a great job of distinguishing the logical-counterfactual decision theories from EDT and CDT. However, it doesn’t distinguish between different logical-counterfactual decision theories.

The following table shows the disagreements between decision theories. For each pair of decision theories specified by a row and a column, the decision problem named in the cell is one where the two theories return different answers. The diagonal is blank because each decision theory agrees with itself. The lower-left triangle is blank because it would just mirror the entries above the diagonal.

|  | UDT1.1/FDT-policy | UDT1/FDT-action | TDT | EDT | CDT |
| --- | --- | --- | --- | --- | --- |
| UDT1.1/FDT-policy |  | Number assignment problem described in the UDT1.1 post (both UDT1 copies output “A”; the UDT1.1 copies output “A” and “B”) | Counterfactual mugging (a.k.a. curious benefactor) (TDT refuses, UDT1.1 pays) | Parfit’s hitchhiker (EDT refuses, UDT1.1 pays) | Newcomb’s problem (CDT two-boxes, UDT1.1 one-boxes) |
| UDT1/FDT-action |  |  | Counterfactual mugging (a.k.a. curious benefactor) (TDT refuses, UDT1 pays) | Parfit’s hitchhiker (EDT refuses, UDT1 pays) | Newcomb’s problem (CDT two-boxes, UDT1 one-boxes) |
| TDT |  |  |  | Parfit’s hitchhiker (EDT refuses, TDT pays) | Newcomb’s problem (CDT two-boxes, TDT one-boxes) |
| EDT |  |  |  |  | Newcomb’s problem (CDT two-boxes, EDT one-boxes) |
| CDT |  |  |  |  |  |

Other comparisons

Here are some existing comparisons between decision theories that I found useful, along with reasons why I felt the current post was needed.