Forum Digest: Updateless Decision Theory

Summary: This is a quick expository recap, with links, of the posts on this forum on the topic of updateless decision theory, through 3/19/15. Read this if you want to learn more about UDT, or if you’re curious about what we’ve been working on lately!


Updateless decision theory (henceforth UDT) is a proposed algorithm for making good decisions in a world that can contain predictions and other echoes of the agent’s decision algorithm. It is motivated by a set of Newcomblike problems on which the classical decision theories (evidential decision theory and causal decision theory) have known failure modes. For more context and motivation for UDT, see the MIRI research document Toward Idealized Decision Theory.

Philosophically speaking, UDT looks over all possible strategies (maps from “agent observes X” to “agent does Y”), picks the global strategy that would be best for all copies of UDT in aggregate, and then performs the action that the strategy recommends for an agent in its position. (One explanation for choosing strategies rather than actions is that it defends against blackmail: if you are being blackmailed, then paying up appears to be the better action; but since the strategy of refusing to pay deters all savvy blackmailers, UDT knowably refuses to pay up and thus is rarely blackmailed. Or so the theory goes.)
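The strategy-selection step above can be sketched in a toy model. This is an illustrative sketch only, not MIRI’s formal algorithm: the observation and action names, the single-observation Newcomb setup, and the `utility` payoffs are all assumptions made for the example.

```python
from itertools import product

# Toy sketch of UDT's strategy-selection step (names are illustrative).
OBS = ["box_is_opaque"]          # only one possible observation here
ACTS = ["one_box", "two_box"]

def utility(strategy):
    # Toy Newcomb's problem: the predictor inspects the strategy itself,
    # so the opaque box holds $1M exactly when the strategy one-boxes.
    act = strategy["box_is_opaque"]
    million = 1_000_000 if act == "one_box" else 0
    thousand = 1_000 if act == "two_box" else 0
    return million + thousand

def udt_action(observation):
    # Enumerate every map from observations to actions, score each
    # strategy globally, then act as the best strategy prescribes.
    strategies = [dict(zip(OBS, acts))
                  for acts in product(ACTS, repeat=len(OBS))]
    best = max(strategies, key=utility)
    return best[observation]

print(udt_action("box_is_opaque"))  # one_box
```

Because the whole strategy is scored before any action is chosen, the agent one-boxes even though two-boxing would dominate once the boxes are fixed.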

This leads to some interesting ambiguities, most importantly the counterfactuals about “what would happen” if the deterministic UDT algorithm chose strategy B, when in fact it will end up choosing strategy A. These logical counterfactuals can be confronted straightforwardly in mathematical models by trying to formally prove that if UDT selects strategy S, then the universe is in a state of utility U. But then one faces the problem of spurious counterfactuals (“if the agent chose X then Y would happen”, which is only logically valid because we can prove that the agent does not choose X), which leads to further complications.
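The logical core of the spurious-counterfactual problem is just the vacuous truth of material implication, which a one-line check makes concrete (a minimal propositional illustration, not a model of the proof search itself):

```python
# Once "the agent does not choose X" is provable, the material
# conditional "if the agent chose X then Y" is true for *any* Y --
# including Y's that assign the universe absurd utilities.
def implies(a, b):
    return (not a) or b

# With a = False ("the agent chooses X" is refutable), the conditional
# holds regardless of what y claims.
for y in (True, False):
    assert implies(False, y)
print("any counterfactual about a provably un-taken action comes out true")
```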

The Gödel-Löb modal logic GL (also called provability logic) allows us to construct a good heuristic model of UDT in a mathematical universe, using Gödel numbering and quining to capture self-reference, and evaluating counterfactuals via unbounded proof searches (i.e. a halting oracle). This “modal UDT” has some interesting properties: for instance, it is provably optimal on decision problems that “only care what it does”, if given sufficiently powerful reasoning axioms. Moreover, the tools of Kripke semantics allow the actions of modal UDT and other modally defined agents to be formally verified in polynomial time!
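To give a flavor of the Kripke-semantics idea, here is a tiny evaluator for letterless GL formulas on a finite linear frame (worlds 0..n-1, where world i sees every j < i; such frames are transitive and conversely well-founded, as GL requires). This is a sketch of the evaluation idea only, with an assumed tuple encoding of formulas; it is not the verifier from the forum posts.

```python
# Evaluate a letterless modal formula at a world of a finite linear
# GL frame. Formulas are tuples: ("bot",), ("not", A), ("and", A, B),
# ("box", A) -- an illustrative encoding chosen for this example.
def ev(formula, world):
    kind = formula[0]
    if kind == "bot":
        return False
    if kind == "not":
        return not ev(formula[1], world)
    if kind == "and":
        return ev(formula[1], world) and ev(formula[2], world)
    if kind == "box":
        # Box A holds at w iff A holds at every world w sees (all j < w).
        return all(ev(formula[1], v) for v in range(world))
    raise ValueError(kind)

box_bot = ("box", ("bot",))
# Box-bottom ("inconsistency is provable") holds only at the bottom world.
print([ev(box_bot, w) for w in range(4)])  # [True, False, False, False]
```

Evaluation visits each subformula at each world at most once, which is the germ of the polynomial-time verification mentioned above.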

Lest we get carried away, modal UDT is neither a perfect instantiation of the philosophical UDT algorithm, nor is it optimal in problems involving universes that predict the agent’s decisions via provability. Even so, it is an incredibly fruitful object for further study in decision theory!

Without further ado, here is an overview of all the posts on this forum thus far that relate to updateless decision theory:

Expository material

New research on modal UDT

  • Using modal fixed points to formalize logical causality, Vladimir Slepnev. I believe this is actually the first time that someone constructed a UDT algorithm in provability logic!

  • “Evil” decision problems in provability logic, Benja Fallenstein. For every modal decision theory, you can set up a fair decision problem (i.e. one that simply maps actions to outcomes rather than depending on the inner details of how the agent deliberates) which punishes the agent, such that other decision theories can get a better outcome. This is an important limitation on any optimality results, but as we shall see, it is not the final word.

  • An optimality result for modal UDT, Benja Fallenstein. For each fixed fair decision problem, there is a set of axioms (in this case, simply the true axioms about the mapping from actions to outcomes) such that modal UDT equipped with those axioms will achieve the best possible outcome.

  • Improving the modal UDT optimality result, Benja Fallenstein. We can use more general axioms than the ones in the previous post, and still achieve the best possible outcome.

  • Obstacle to modal optimality when you’re being modalized, Patrick LaVictoire. The optimality result cannot carry over to even simple decision problems where the outcome depends on whether the agent provably takes a certain action.

  • On notation for modal UDT, Benja Fallenstein. This emphasizes that our use of modal agents and decision theories relies on fixed points of modal logic operators, and thus we should be careful about using functional notation.

  • An implementation of modal UDT, Benja Fallenstein. The polynomial-time algorithm for figuring out what a modal UDT agent does in a decision problem, now in a simple Haskell program!
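The “evil” fair decision problems described above can be sketched in a few lines. This is a schematic illustration, not the construction from the post: the decision theories here are stand-in functions, and the payoffs are invented for the example.

```python
# A toy "evil" fair decision problem: outcomes depend only on the
# action taken (so the problem is fair), but the problem is
# *constructed* to punish whichever action a fixed target decision
# theory happens to choose.
def make_evil_problem(target_theory, actions):
    punished = target_theory(actions)   # the action the target will take
    return {a: (0 if a == punished else 1) for a in actions}

target = lambda actions: actions[0]     # stand-in "decision theory"
rival = lambda actions: actions[1]      # any theory that acts differently

problem = make_evil_problem(target, ["a", "b"])
print(problem[target(["a", "b"])])      # 0: the target is punished...
print(problem[rival(["a", "b"])])       # 1: ...while a rival does better
```

Since the construction works for any fixed theory, no single modal decision theory can be optimal on every fair problem, which is why the optimality results above must fix the problem first and then choose axioms.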

New research on logical counterfactuals

  • Uniqueness of UDT for transparent universes, Tsvi Benson-Tilson. This is a different abstract model of UDT, using what we call “playing chicken with the universe”: before reasoning about consequences, take each possible action and attempt to prove you don’t take it; if you ever succeed, immediately take that action. This is more useful than it sounds, because it rules out certain spurious counterfactuals.

  • A different angle on UDT, Nate Soares. An exploration of spurious counterfactuals, with an example (where UDT is used to analyze another agent, rather than to take an action itself).

  • Why conditioning on “the agent takes action a” isn’t enough, Nate Soares. Continuing the exploration of spurious counterfactuals, with probabilistic agents.

  • Third-person counterfactuals, Benja Fallenstein. There is at least some set of assumptions (a “sufficiently informative” decision problem) for which modal UDT analyzes an algorithm (perhaps itself) adequately and avoids acting on spurious counterfactuals… or does it?

  • The odd counterfactuals of playing chicken, Benja Fallenstein. Nope, modal UDT is perfectly susceptible to spurious counterfactuals when analyzing other algorithms, even if the decision problem is sufficiently informative.
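The “playing chicken with the universe” step from the Benson-Tilson post above can be sketched schematically. The `provable` argument here is a stand-in stub for an unbounded proof search, and all names are illustrative assumptions:

```python
# Schematic sketch of the "chicken rule": before evaluating
# consequences, try to prove you *don't* take each action; if any such
# proof is found, take that action immediately.
def chicken_udt(actions, provable, evaluate):
    for a in actions:
        if provable(f"agent != {a}"):
            # Taking `a` falsifies any proof that we avoid `a`, so a
            # sound proof search can never rule an action out -- which
            # blocks one class of spurious counterfactuals.
            return a
    # Otherwise, fall back to ordinary consequence evaluation.
    return max(actions, key=evaluate)

# With a sound (hence never-succeeding) proof-search stub, the agent
# simply optimizes as usual.
act = chicken_udt(["a1", "a2"],
                  provable=lambda s: False,
                  evaluate=lambda a: {"a1": 1, "a2": 2}[a])
print(act)  # a2
```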

Other new research relevant to UDT
