Yes, UDT means updateless decision theory. “The policy” is a placeholder for “whatever policy the agent ends up picking”, much like a variable in an equation. “The algorithm I wrote” is still unpublished: there were too many things wrong with it for me to be comfortable putting it up, and I can’t even show that it has any nice properties. Although now that you mention it, I probably should put it up so future posts about what’s wrong with it have a well-specified target to shoot holes in. >_>