Yeah, not a ton, I think for the obvious reason that real-world agents are complicated and hard to reason about.
Though search up “tiling agents” for some MIRI work in this vein.
Yes, thanks!
I am familiar with some MIRI work on this that focuses on the Löbian obstacle, e.g. this 2013 paper: Tiling Agents for Self-Modifying AI, and the Löbian Obstacle.
But I should look more closely at other parts of those MIRI papers; perhaps there is some material that actually establishes some invariants, at least for some simple, idealized examples of self-modification...