$500 + $500 Bounty Problem: Does An (Approximately) Deterministic Maximal Redund Always Exist?
[EDIT August 22, 2025: This bounty is now $500, as the earlier bounty has now been claimed. (By us.)]
A lot of our work involves “redunds”.[1] A random variable Γ is a(n exact) redund over two random variables $X_1, X_2$ exactly when both
$X_1 \to X_2 \to \Gamma$
$X_2 \to X_1 \to \Gamma$
Conceptually, these two diagrams say that $X_1$ gives exactly the same information about Γ as all of $X := (X_1, X_2)$, and $X_2$ gives exactly the same information about Γ as all of X; whatever information X contains about Γ is redundantly represented in $X_1$ and $X_2$. Unpacking the diagrammatic notation and simplifying a little, the diagrams say $P[\Gamma|X_1] = P[\Gamma|X_2] = P[\Gamma|X]$ for all X such that $P[X] > 0$.
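To make the conditions concrete, here is a minimal numerical sketch in Python. The toy distribution is our own construction (not part of the bounty problem), and the helper names are ours: $X_1 = X_2 = $ a fair coin C, with Γ a noisy copy of C. Since $X_1$ and $X_2$ then carry identical information, Γ is an exact redund, and the three conditional distributions agree everywhere on the support of X.

```python
import numpy as np

# Toy joint distribution P[X1, X2, Gamma], indexed as P[x1, x2, g].
# Construction: X1 = X2 = a fair coin C; Gamma flips C with probability 0.1.
P = np.zeros((2, 2, 2))
for c in (0, 1):
    for g in (0, 1):
        P[c, c, g] = 0.5 * (0.9 if g == c else 0.1)

def normalize(joint):
    """Condition on all axes but the last: joint[..., g] -> P[g | ...]."""
    marg = joint.sum(axis=-1, keepdims=True)
    return np.divide(joint, marg, out=np.zeros_like(joint), where=marg > 0)

P_g_X  = normalize(P)              # P[Gamma | X1, X2]
P_g_X1 = normalize(P.sum(axis=1))  # P[Gamma | X1]
P_g_X2 = normalize(P.sum(axis=0))  # P[Gamma | X2]

# Check P[Gamma|X] = P[Gamma|X1] = P[Gamma|X2] wherever P[X] > 0.
support = P.sum(axis=2) > 0
assert np.allclose(P_g_X[support], np.broadcast_to(P_g_X1[:, None, :], P.shape)[support])
assert np.allclose(P_g_X[support], np.broadcast_to(P_g_X2[None, :, :], P.shape)[support])
print("Gamma is an exact redund over (X1, X2)")
```

Representing the joint as a dense array keeps each conditional one `sum` and one division away, which is all the later sketches need.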
The exact redundancy conditions are too restrictive to be of much practical relevance, so we are more interested in approximate redunds, which are defined by approximate versions of the same two diagrams. Unpacking the diagrammatic notation, the two approximate diagrams say
$\epsilon_{\text{red}} \geq D_{KL}(P[X,\Gamma] \;\|\; P[X_1]\, P[X_2|X_1]\, P[\Gamma|X_2])$
$\epsilon_{\text{red}} \geq D_{KL}(P[X,\Gamma] \;\|\; P[X_2]\, P[X_1|X_2]\, P[\Gamma|X_1])$
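Both divergences are straightforward to compute for a given joint distribution. Below is a sketch reusing the toy `P` and array format from above (our own helper names, logs in bits); it reports the smallest valid $\epsilon_{\text{red}}$, i.e. the larger of the two divergences, using the fact that $P[X_1]P[X_2|X_1] = P[X_1,X_2] = P[X_2]P[X_1|X_2]$.

```python
import numpy as np

def kl(p, q):
    """D_KL(p || q) in bits; 0 log 0 = 0 (and +inf if q = 0 where p > 0)."""
    m = p > 0
    return float(np.sum(p[m] * np.log2(p[m] / q[m])))

def redund_error(P):
    """Smallest epsilon_red for a joint P[x1, x2, g]: the larger of the two KLs."""
    P_X  = P.sum(axis=2, keepdims=True)  # P[X1, X2]
    P_1g = P.sum(axis=1)                 # P[X1, Gamma]
    P_2g = P.sum(axis=0)                 # P[X2, Gamma]
    with np.errstate(divide="ignore", invalid="ignore"):
        g_given_1 = np.nan_to_num(P_1g / P_1g.sum(axis=-1, keepdims=True))
        g_given_2 = np.nan_to_num(P_2g / P_2g.sum(axis=-1, keepdims=True))
    q1 = P_X * g_given_2[None, :, :]     # P[X1] P[X2|X1] P[Gamma|X2]
    q2 = P_X * g_given_1[:, None, :]     # P[X2] P[X1|X2] P[Gamma|X1]
    return max(kl(P, q1), kl(P, q2))

print(redund_error(P))  # 0.0 for the exact-redund toy example above
```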
This bounty problem is about the existence of a(n approximate) maximal redund Ω: a redund which contains (approximately) all the information about X contained in any other (approximate) redund. Diagrammatically, a maximal redund Ω satisfies the two redundancy conditions above and, for any other (approximate) redund Γ, approximately mediates between X and Γ: $X \to \Omega \to \Gamma$.
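One natural way to quantify the mediation condition is the conditional mutual information $I(X;\Gamma|\Omega)$, which equals the KL divergence between the full joint and its $X \to \Omega \to \Gamma$ factorization, and is 0 iff the diagram holds exactly. This is our reading of the (omitted) diagram; the helper below is our own sketch, not the post's definition, and the post's ϵ may attach to a slightly different factorization.

```python
import numpy as np

def mediation_error(P):
    """I(X; Gamma | Omega) in bits, for a joint array P[x1, x2, omega, gamma].

    Equals D_KL(P[X, Omega, Gamma] || P[X, Omega] P[Gamma | Omega]), i.e. the
    divergence from the X -> Omega -> Gamma factorization.
    """
    P_XO = P.sum(axis=3, keepdims=True)          # P[X1, X2, Omega]
    P_OG = P.sum(axis=(0, 1), keepdims=True)     # P[Omega, Gamma]
    P_O  = P.sum(axis=(0, 1, 3), keepdims=True)  # P[Omega]
    m = P > 0
    with np.errstate(divide="ignore", invalid="ignore"):
        ratio = (P * P_O) / (P_XO * P_OG)        # P[g|x,w] / P[g|w], pointwise
    return float(np.sum(P[m] * np.log2(ratio[m])))
```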
Finally, we’d like our maximal redund Ω to be an approximately deterministic function of X, i.e. $\epsilon_{\text{det}} \geq H(\Omega|X)$ (a condition which can equivalently be stated diagrammatically).
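The determinism condition is also cheap to check numerically. A sketch in the same format (our own helper, logs in bits): the conditional entropy below is exactly 0 iff Ω is a deterministic function of X.

```python
import numpy as np

def conditional_entropy(P):
    """H(Omega | X1, X2) in bits, for a joint array P[x1, x2, omega]."""
    m = P > 0
    P_X = np.broadcast_to(P.sum(axis=-1, keepdims=True), P.shape)
    # H(Omega|X) = -sum_{x,w} P[x,w] log P[w|x]
    return float(-np.sum(P[m] * np.log2(P[m] / P_X[m])))

# For the noisy Gamma in the toy example above: H(Gamma|X) ~= 0.47 bits.
# For any Omega which is a deterministic function of X, this is exactly 0.
```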
What We Want For The Bounty
This bounty pays out $500 for establishing that, for any two random variables $X_1, X_2$, there always exists an Ω satisfying all of the requirements above, with reasonable error bounds.
What counts as reasonable error bounds? We’re expecting all the other ϵ’s (those in the conditions on Ω) to scale with $\epsilon_{\text{red}}$, and ideally to be small constant multiples of $\epsilon_{\text{red}}$. In other words: the higher the approximation error on allowable redunds Γ, the more approximation error we allow for all the conditions on Ω. Reasonableness of error bounds is partially a judgement call on our part, but small constant multiples of $\epsilon_{\text{red}}$ (e.g. 2 or 3) would definitely suffice.
Bounds should be global, i.e. apply even when $\epsilon_{\text{red}}$ is large. We’re not interested in e.g. first or second order approximations for small $\epsilon_{\text{red}}$ unless they provably apply globally.
If the error bounds are sufficiently reasonable, a solution to this problem will also solve our other open bounty problem, paying out another $500, for a total of $1000.
If people post major intermediate steps which are load-bearing for an eventual solution, we might allocate the bounty based on our judgement of different people’s relative contributions; the hope is to incentivize sharing intermediate results.
We might also partially pay out for counterexamples. That’s a judgement call depending on how thoroughly the counterexample kills hope of any theorem vaguely along the lines intended.
In terms of rigor and allowable assumptions, roughly the level of rigor and assumptions in the posts linked above is fine.
Some Intuition From The Exact Case
A proof in the exact case is straightforward; we’ll walk through it here to build some intuition.
The exact redundancy conditions say
$P[\Gamma|X] = P[\Gamma|X_1] = P[\Gamma|X_2]$
wherever $P[X] > 0$. So, we can form a bipartite graph, where each vertex represents a value $x_1$ of $X_1$ or a value $x_2$ of $X_2$, and two vertices are connected iff $P[X_1 = x_1, X_2 = x_2] > 0$. Then the definition of redundancy says that, for ANY redund Γ, $P[\Gamma|X_1]$ and $P[\Gamma|X_2]$ are the same for ANY values of $X_1$ and $X_2$ (respectively) in the same connected component.
So, let f(X) be defined as the connected component in which X falls. For any redund Γ, $P[\Gamma|X]$ must be a function of f(X) (since two X-values in the same connected component have the same $P[\Gamma|X]$). Thus
$X \to f(X) \to \Gamma$
f(X) is therefore an (exactly) deterministic function of X, and (exactly) mediates between X and any redund Γ. Thus, f(X) satisfies our requirements in the exact case (i.e. when $\epsilon_{\text{red}} = 0$).
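For concreteness, here is the construction in code (our own sketch, using the same joint-array format as the earlier examples): a union-find over the values of $X_1$ and $X_2$, with an edge wherever $P[X_1 = x_1, X_2 = x_2] > 0$, and f mapping each supported value of X to its component.

```python
def maximal_redund_exact(P):
    """Exact-case construction: f(X) = the connected component containing X.

    Vertices are values of X1 (ids 0..n1-1) and of X2 (ids n1..n1+n2-1); an
    edge joins x1 and x2 iff P[X1=x1, X2=x2] > 0. Returns f as a dict mapping
    each supported (x1, x2) pair to a component label.
    """
    P_X = P.sum(axis=2)
    n1, n2 = P_X.shape
    parent = list(range(n1 + n2))

    def find(i):  # union-find with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for x1 in range(n1):
        for x2 in range(n2):
            if P_X[x1, x2] > 0:
                parent[find(x1)] = find(n1 + x2)  # union the two vertices

    return {(x1, x2): find(x1)
            for x1 in range(n1) for x2 in range(n2) if P_X[x1, x2] > 0}

print(maximal_redund_exact(P))  # two components: one per value of the coin C
```

On the toy distribution from earlier, f(X) recovers exactly the shared coin C.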
Alas, this proof does not easily generalize to the approximate case.
Why We Want This
This is a conjecture which we’ve had on the back burner for a while, and it would be relevant in many places. It would solve our other open bounty problem, it would likely be our starting point for a new version of the Telephone Theorem which more explicitly connects to the natural latents framework, and it would directly simplify the problem of computing natural latents numerically. Probably there are other use-cases which we’re not thinking about right at the moment.