Congrats!
Some interesting directions I think this opens up: Intuitively, given a set of variables X, we want natural latents to be approximately deterministic across a wide variety of (collections of) variables; and if a natural latent Y is approximately deterministic w.r.t. a subset of variables S⊆X, we want S to be as small as possible (e.g. strong redundancy is better than weak redundancy when the former is attainable).
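(Stated in entropy terms, as I'd phrase it, with ε the slack allowed by “approximately deterministic”:)

```latex
% My restatement, not notation from the post; eps is the approximation slack.
\begin{align*}
  \text{strong redundancy:} &\quad \forall i:\ H(\Lambda \mid X_i) \le \varepsilon
     && \text{($\Lambda$ recoverable from each variable alone)}\\
  \text{weak redundancy:}   &\quad \forall i:\ H(\Lambda \mid X_{\bar{i}}) \le \varepsilon
     && \text{($\Lambda$ recoverable from all variables except $X_i$)}
\end{align*}
```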
The redundancy lattice seems natural for representing this: given an element of the redundancy lattice α⊂P(X), we say Y is a redund over α if it’s approximately deterministic w.r.t. each subset in α. E.g. Λ is weakly redundant over X if it’s a redund over {{X̄i} | Xi∈X} (approximately a deterministic function of each X̄i, i.e. of all variables other than Xi), and strongly redundant if it’s a redund over {{Xi} | Xi∈X}. If Y is a redund over α⊂P(X), our intuitive desiderata for natural latents correspond to α containing more subsets (more redundancy) and each subset Ai∈α being small (less “synergy”). Combining this with the mediation condition can probably give us a notion of Pareto-optimality for natural latents.
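To make “redund over α” concrete, here’s a minimal empirical sketch (my own toy construction, not anything from the post): estimate H(Y | X_A) from samples for each subset A∈α, and call Y an approximate redund over α when all of those conditional entropies fall below a tolerance ε. The column layout and the copies-plus-noise example at the bottom are purely illustrative assumptions.

```python
from collections import Counter
from math import log2
import random

def cond_entropy(samples, y_idx, cond_idxs):
    """Estimate H(Y | X_A) in bits from joint samples (each sample is a tuple)."""
    n = len(samples)
    joint = Counter((tuple(s[i] for i in cond_idxs), s[y_idx]) for s in samples)
    marg = Counter(tuple(s[i] for i in cond_idxs) for s in samples)
    # H(Y | X_A) = -sum_{a,y} p(a,y) log2 p(y|a), with p(y|a) = count(a,y)/count(a)
    return -sum((c / n) * log2(c / marg[a]) for (a, _y), c in joint.items())

def is_redund(samples, y_idx, alpha, eps=0.01):
    """Y (column y_idx) is an approximate redund over lattice element alpha
    if it is approximately determined by *each* subset of variables in alpha."""
    return all(cond_entropy(samples, y_idx, A) <= eps for A in alpha)

# Toy joint distribution: Lambda is a fair bit, X1 = X2 = Lambda, X3 is independent noise.
random.seed(0)
samples = []
for _ in range(20000):
    lam, noise = random.randint(0, 1), random.randint(0, 1)
    samples.append((lam, lam, noise, lam))   # columns: (X1, X2, X3, Lambda)

weak   = [(1, 2), (0, 2), (0, 1)]   # each "all-but-one" collection X̄i
strong = [(0,), (1,), (2,)]         # each singleton {Xi}
print(is_redund(samples, 3, weak))    # True:  Lambda is recoverable from any two variables
print(is_redund(samples, 3, strong))  # False: X3 alone carries no information about Lambda
```

In this toy, Λ is a redund over the weak element but not the strong one, which is exactly the kind of distinction the Pareto story above would have to trade off against mediation.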
Another thing we could do: when we construct Pareto-optimal natural latents Y over X, we add them to the original set of variables to augment the redundancy lattice, so that new natural latents can be approximately deterministic functions of (collections of) existing natural latents. This naturally lets us represent the “hierarchical nature of abstractions”, where lower-level abstractions make it easier to compute higher-level ones.
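As a rough sketch of that augmentation loop (find_pareto_optimal_latents is a hypothetical placeholder for whatever search procedure one actually uses; it isn’t defined here or in the post):

```python
def grow_hierarchy(variables, find_pareto_optimal_latents, max_levels=5):
    """Iteratively search for natural latents over an expanding variable set.

    variables: the initial collection of observed variables.
    find_pareto_optimal_latents: placeholder search procedure; takes the current
        variable set and returns a (possibly empty) list of new latents.
    Returns one list of latents per level of the resulting hierarchy.
    """
    levels = []
    for _ in range(max_levels):
        new_latents = find_pareto_optimal_latents(variables)
        if not new_latents:
            break  # no further abstractions found at this level
        levels.append(new_latents)
        # Augment the variable set: the next round's redundancy lattice ranges over
        # both the raw variables and the latents found so far, so higher-level latents
        # can be approximately deterministic functions of lower-level ones.
        variables = list(variables) + list(new_latents)
    return levels
```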
A concrete setting where this could be useful: a bunch of agents receive different but partially overlapping sets of observations and aim to predict partially overlapping domains. Having a fine-grained collection of natural latents, redundant across different elements of the redundancy lattice, means we can easily zoom in on the small subset of latent variables that’s (maximally) redundantly represented by all of the agents (and tell which domains of prediction these latents actually mediate).
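Here’s a toy version of that “zoom in” operation, under assumptions of mine: each agent observes a subset of the variable indices, and a latent counts as redundantly represented by all agents if every agent’s observation set contains at least one subset from the latent’s lattice element α (so each agent can compute it from what it sees). This reuses is_redund and samples from the sketch above.

```python
def shared_latents(samples, candidate_latents, agent_obs, eps=0.01):
    """Pick out latents every agent can (approximately) compute from its own observations.

    candidate_latents: {name: (y_idx, alpha)}, where alpha is the lattice element
        (a collection of index tuples) the latent is claimed to be a redund over.
    agent_obs: {agent_name: set of variable indices that agent observes}.
    """
    shared = []
    for name, (y_idx, alpha) in candidate_latents.items():
        # Each agent needs at least one subset in alpha lying entirely within the
        # variables it observes; the redund condition then says the latent is
        # approximately determined by that subset alone.
        computable_by_all = all(
            any(set(A) <= obs for A in alpha) for obs in agent_obs.values()
        )
        if computable_by_all and is_redund(samples, y_idx, alpha, eps):
            shared.append(name)
    return shared

# Example with the toy distribution above: agent A sees {X1, X3}, agent B sees {X2, X3}.
agents = {"A": {0, 2}, "B": {1, 2}}
candidates = {"Lambda": (3, [(0,), (1,)])}   # claimed redund over {{X1}, {X2}}
print(shared_latents(samples, candidates, agents))  # ['Lambda']
```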