Isn’t this the same as the “seamless transition for reward maximizers” technique described in section 5.1 of Stuart and Xavier’s 2017 paper on utility indifference methods? It is a good idea, of course, and if you independently invented it, kudos, but it seems like something that already exists.
I did explicitly disclaim novelty, and I did invent this independently. The paper you linked is closely related, and I would like to upvote it, since I think those results should also be better known. However, the problem I solve in this post is different from (and technically easier than!) the problems solved in that paper, including in section 5. The problem solved there asks for the optimal agent to act as if it were an infinite-horizon optimal agent for R1 (including whatever power-seeking would be instrumental for such an agent!) until the time bound causes it to switch to acting like the optimal agent for R2, and for all of that to be reflectively stable. Here, I am not asking the optimal agent to behave as if it has a longer time horizon than it really does.