Some thoughts on the embedded agents part today, now that I’m inspired to have thoughts on it.
On unrealized implications, I don’t think this is exactly an embedded agent problem so much as a problem of limited computational abilities.
More specifically, I suspect it’s possible for an infinite agent to be both embedded within the structure of its universe and logically/computationally omniscient, but if we do impose a condition of finiteness, the unrealized implications problem comes back.
So in that sense, I think it’s not exactly a problem of being in the world, but rather of being finite.
But the finiteness condition is fine for now, so I’ll talk about other things.
A lot of embedded agency problems, IMO, are either created or significantly worsened by physical universality, which is semi-plausible for our universe. In particular, a big thing physical universality does for embedded agency is that you can no longer create a perfect isolator, because the environment can always reach into and alter an isolated area. This is why any reversible cellular automaton that allows for perfect walls cannot be physically universal.
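To make the perfect-walls point concrete, here is a minimal sketch of a toy reversible cellular automaton. The rule is invented purely for illustration and is not taken from any published construction: cells are paired into blocks of two, the pairing alternates each step, and a pair is swapped unless one of its cells is a wall. The names `WALL`, `step`, `run`, and `run_backward` are all hypothetical. Because walls never move and no block straddles one, the region between two walls evolves the same way no matter what the outside contains, which is exactly what physical universality forbids.

```python
# Toy 1D block cellular automaton, invented for illustration only.
WALL = "W"

def step(cells, t):
    """One update: pair cells into blocks of two (the pairing alternates with t)
    and swap each pair unless either cell is a wall."""
    n = len(cells)  # assumed even; boundaries are periodic
    out = list(cells)
    for i in range(t % 2, n, 2):
        j = (i + 1) % n
        if cells[i] != WALL and cells[j] != WALL:
            out[i], out[j] = cells[j], cells[i]
    return out

def run(cells, steps):
    for t in range(steps):
        cells = step(cells, t)
    return cells

def run_backward(cells, steps):
    # Each block update is its own inverse, so replaying the steps
    # in reverse order undoes a forward run (the rule is reversible).
    for t in reversed(range(steps)):
        cells = step(cells, t)
    return cells

inside = [1, 0, 0, 1]
# Same walled-off interior, two different exteriors.
world_a = [0, 0, 0, 0, WALL] + inside + [WALL, 0, 0, 0, 0]
world_b = [1, 1, 0, 1, WALL] + inside + [WALL, 1, 0, 1, 1]

after_a = run(world_a, 50)
after_b = run(world_b, 50)

# The interior (indices 5..8) ends up identical in both worlds.
print(after_a[5:9] == after_b[5:9])
# Running backward recovers the initial configuration.
print(run_backward(after_a, 50) == world_a)
```

In this toy rule nothing the exterior does can steer the interior, so no choice of environment can drive that region into an arbitrary target state. A physically universal rule has to rule out this kind of wall.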
This means that there’s no ground truth Cartesian boundary available that persists for all time, which breaks the abstraction of an agent separated from its environment, which in turn means reward corruption and self-modification can happen.
Thus, we have to replace that abstraction with a theory that can handle shifts in boundaries. Ideally, the boundary should either be arbitrarily shiftable or not exist at all, but this creates problems, since physical universality is far less studied than computational universality, and their interaction is not studied at all.
The I/O part is not about the lack of an I/O channel, but rather the lack of a channel that is invulnerable to hacking/modification, such that its inputs can be assumed to come only from a certain source.