Yeah, it seems that if Y isn’t uniquely determined by X, we can’t reversibly erase Y from X. Let’s say we flip two coins, X = the first, Y = the sum. Since S depends only on X and some randomness, knowing S is equivalent to knowing some distribution over X. If that distribution is non-informative about Y, it must be (1/2,1/2). But then we can’t reconstruct X from S and Y.
Nice example—I’m convinced that the bound isn’t always achievable. So the next questions are:
Is there a nice way to figure out how much information we can keep, in any particular problem, when Y is not observed?
Is there some standard way of constructing S which achieves optimal performance in general when Y is not observed?
My guess is that optimal performance would be achieved by throwing away all information contained in X about the distribution P[Y|X]. We always know that distribution just from observing X, and throwing away all info about that distribution should throw out all info about Y, and the minimal map interpretation suggests that that’s the least information we can throw out.
Yeah, it seems that if Y isn’t uniquely determined by X, we can’t reversibly erase Y from X. Let’s say we flip two coins, X = the first, Y = the sum. Since S depends only on X and some randomness, knowing S is equivalent to knowing some distribution over X. If that distribution is non-informative about Y, it must be (1/2,1/2). But then we can’t reconstruct X from S and Y.
Nice example—I’m convinced that the bound isn’t always achievable. So the next questions are:
Is there a nice way to figure out how much information we can keep, in any particular problem, when Y is not observed?
Is there some standard way of constructing S which achieves optimal performance in general when Y is not observed?
My guess is that optimal performance would be achieved by throwing away all information contained in X about the distribution P[Y|X]. We always know that distribution just from observing X, and throwing away all info about that distribution should throw out all info about Y, and the minimal map interpretation suggests that that’s the least information we can throw out.