A brief note on factoring out certain variables

Jessica Taylor and Chris Olah have a post on “Maximizing a quantity while ignoring effect through some channel”. I’ll briefly present a different way of doing this, and compare the two.

Essentially, the AI’s utility is given by a function $U$ of a variable $W$. The AI’s actions are a random variable $X$, but we want to ‘factor out’ another random variable $Z$.

If we have a probability distribution $P$ over actions then, given background evidence $e$, the standard way to maximise $U$ would be to maximise:

  • $\mathbb{E}(U \mid e) = \sum_{x,z,w} U(w)\, P(w \mid x, z, e)\, P(z \mid x, e)\, P(x \mid e)$.
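To make this concrete, here is a minimal numerical sketch in Python. The two-valued variables, the array names, and all the probabilities are illustrative assumptions of mine, not anything taken from either post.

```python
import numpy as np

# Toy discrete model (all numbers are illustrative): two actions x,
# two values of the factored-out variable z, two outcomes w.
P_x = np.array([0.5, 0.5])             # P(x | e): the distribution the agent optimises
P_z_given_x = np.array([[0.9, 0.1],    # P(z | x, e), rows indexed by x
                        [0.2, 0.8]])
P_w_given_xz = np.array([[[0.7, 0.3],  # P(w | x, z, e), indexed [x][z][w]
                          [0.4, 0.6]],
                         [[0.5, 0.5],
                          [0.1, 0.9]]])
U = np.array([0.0, 1.0])               # utility U(w)

def expected_utility(P_x):
    """Standard objective: E(U | e) = sum over x, z, w of
    U(w) P(w | x, z, e) P(z | x, e) P(x | e)."""
    return sum(U[w] * P_w_given_xz[x, z, w] * P_z_given_x[x, z] * P_x[x]
               for x in range(2) for z in range(2) for w in range(2))

print(expected_utility(P_x))
```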

The most obvious idea, to me, is to replace $P(z \mid x, e)$ with $P(z \mid e)$, making $Z$ artificially independent of $X$ and giving the expression:

  • $\sum_{x,z,w} U(w)\, P(w \mid x, z, e)\, P(z \mid e)\, P(x \mid e)$.

If $Z$ is dependent on $X$ (if it isn’t, then factoring it out is not interesting), then $P(z \mid e)$ needs some implicit probability distribution over $X$, one which is independent of $P(x \mid e)$. So, in essence, this approach relies on two distributions over the possible actions: one that the agent is optimising, and another that is left unoptimised. In terms of Bayes nets, this just seems to be cutting the link from $X$ to $Z$.
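Continuing the sketch above, the modified objective differs only in how $Z$ enters: its distribution is marginalised under a fixed reference distribution over actions (called `Q_x` below, my stand-in for the implicit distribution just mentioned) rather than under the optimised $P(x \mid e)$. This is exactly the Bayes-net cut: $z$ no longer responds to the $x$ being optimised.

```python
def factored_expected_utility(P_x, Q_x):
    """Modified objective: P(z | x, e) is replaced by P(z | e), computed
    under a fixed reference action distribution Q_x, so optimising P_x
    gains nothing by influencing z."""
    P_z = Q_x @ P_z_given_x            # P(z | e) = sum_x Q(x) P(z | x, e)
    return sum(U[w] * P_w_given_xz[x, z, w] * P_z[z] * P_x[x]
               for x in range(2) for z in range(2) for w in range(2))

# Same P_x as before, but z now follows the unoptimised reference distribution:
print(factored_expected_utility(P_x, Q_x=np.array([0.5, 0.5])))
```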

Jessica and Chris’s approach also relies on two distributions. But, as far as I understand their approach, the two distributions are taken to be the same; instead, it is assumed that $\mathbb{E}(U \mid e)$ cannot be improved by changes to the distribution of $X$ if one keeps the distribution of $Z$ constant. This has the feel of a kind of differential condition: the infinitesimal impact on $\mathbb{E}(U \mid e)$ of changes to $P(x \mid e)$, but not $P(z \mid e)$, is non-positive.
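For comparison, here is one way to operationalise that differential condition in the same toy setting; this is my reading of the condition, not their exact formalism. The idea is to freeze $P(z \mid e)$ at the value induced by the current action distribution, and check that no change to $P(x \mid e)$ alone can raise expected utility.

```python
def stationarity_gap(P_x):
    """Returns zero exactly when no infinitesimal change to P(x | e),
    with P(z | e) held fixed at its induced value, increases E(U | e)."""
    P_z = P_x @ P_z_given_x            # induced P(z | e), held constant
    # Payoff of each action with z frozen: sum over z, w of U(w) P(w|x,z,e) P(z|e)
    payoff = np.array([sum(U[w] * P_w_given_xz[x, z, w] * P_z[z]
                           for z in range(2) for w in range(2))
                       for x in range(2)])
    # With P(z | e) frozen the objective is linear in P(x | e), so no
    # improving direction exists iff P_x puts all its mass on argmax payoff.
    return payoff.max() - payoff @ P_x

print(stationarity_gap(P_x))           # zero means the condition holds
```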

I suspect my version might have some odd behaviour (defining the implicit distribution behind $P(z \mid e)$ does not seem necessarily natural), but I’m not sure of the transitive properties of the differential approach.