Just to clarify, why are both A and Y treated as random variables? The way I understand it:
G is our goal, i.e. the string we want to get
O is whatever bits of the world we actually observe
A is the action we choose (depending on what we observe)
Y is what we get as an outcome of the action
Since O is called the “observation”, I’d assume it to be fully known. If Y still can’t be fully determined by A and O, then there has to be some random noise, which we could represent as yet another string of bits, W, standing for the part of the world that we can’t observe. As for A being probabilistic: is this an allowance for our algorithm being imperfect and making mistakes? I feel like we could introduce yet another random variable for that, independent of everything else, and make both A and Y fully deterministic functions of their inputs. Defined that way, the proposed theorem feels like it ought to be trivially true, though I haven’t given it nearly enough thought yet to call that an answer. I’ll give it a go later.
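To make the "push the randomness into a separate variable" idea concrete, here is a rough sketch, not a claim about how the post itself sets things up. It takes a made-up stochastic policy P(A | O) and rewrites it as a deterministic function of O and an independent uniform noise value u. The names `POLICY`, `act`, `empirical` and `u`, and the probabilities, are all hypothetical and only there for illustration. The same trick would presumably work for Y as a deterministic function of A, O and W.

```python
import random
from collections import Counter

# Hypothetical stochastic policy P(A = a | O = o); the numbers are made up.
POLICY = {
    0: {"left": 0.25, "right": 0.75},
    1: {"left": 0.5, "right": 0.5},
}

def act(o, u):
    """Deterministic action: the same (o, u) always yields the same action.

    u is a number in [0, 1) playing the role of the extra noise variable,
    assumed independent of everything else. We walk the cumulative
    distribution of P(A | O = o) and return the first action whose
    cumulative probability exceeds u.
    """
    cumulative = 0.0
    for action, p in POLICY[o].items():
        cumulative += p
        if u < cumulative:
            return action
    return action  # guard against floating-point rounding near u = 1

def empirical(o, n=100_000, seed=0):
    """Sample u uniformly and report how often each action comes out."""
    rng = random.Random(seed)
    counts = Counter(act(o, rng.random()) for _ in range(n))
    return {a: c / n for a, c in counts.items()}

if __name__ == "__main__":
    print(empirical(0))  # frequencies should be close to POLICY[0]
```

All the randomness now lives in u: once u is fixed, A is a plain function of O, yet averaging over u recovers the original P(A | O). Whether this reformulation actually makes the theorem trivial is a separate question that the sketch doesn't settle.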