Huh, I had vaguely considered that but I expected any P[X|Δ(X)]=0 terms to be counterbalanced by P[X,Δ(X)]=0 terms, which together contribute nothing to the KL-divergence. I’ll check my intuitions though.
I’m honestly pretty stumped at the moment. The simplest test case I’ve been using is for X1 and X2 to be two flips of a biased coin, where the bias is known to be either k or 1−k, each with probability 1/2. As k varies from 0 to 1, we want to swap from Δ≅Λ to the trivial case |Δ|=1 and back. This (optimally) happens at around k=0.08 and k=0.92. If we swap there, then the sum of errors for the three diagrams of Δ does remain less than 2(ϵ+ϵ+ϵ) at all times.
Likewise, if we do try to define Δ(X), we need to swap from a Δ which is equal to the number of heads, to |Δ|=1, and back.
In neither case can I find a construction of Δ(X) or Δ(Λ) which swaps from one phase to the other at the right time! My final thought is for Δ to be some mapping Λ→P(Λ) that sends each k to a ball around it in probability space, with variable radius (no idea how to calculate the radius), so that k↦{k} at k≈1 and k↦{k,1−k} at k≈0.5. Or maybe you have to map Λ→P(X) or something like that. But for now I don’t even have a construction I can try to prove things for.
Perhaps a constructive approach isn’t feasible, which probably means I don’t have quite the right skillset to do this.
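For concreteness, here is a minimal numerical sketch of the coin-flip test case, in case anyone wants to poke at it. It rests on assumptions that are not spelled out above: I take Λ to be the binary choice between bias k and bias 1−k, I read the error of the trivial Δ as the mutual information I(X1;X2), and I read the error of Δ≅Λ as the conditional entropy H(Λ|X1). The function names are made up for illustration. Under that reading the two values come out close to each other near k=0.08, which is at least consistent with the swap point mentioned above, though that may not be the criterion behind it.

```python
import itertools
import math


def joint(k):
    """Joint P[lam, x1, x2]: lam picks bias k (lam=0) or 1-k (lam=1), each w.p. 1/2."""
    dist = {}
    for lam, x1, x2 in itertools.product((0, 1), repeat=3):
        p_heads = k if lam == 0 else 1 - k
        p1 = p_heads if x1 == 1 else 1 - p_heads
        p2 = p_heads if x2 == 1 else 1 - p_heads
        dist[(lam, x1, x2)] = 0.5 * p1 * p2
    return dist


def entropy(probs):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def marginal(dist, keep):
    """Marginalise onto the coordinates listed in `keep` (0=lam, 1=x1, 2=x2)."""
    out = {}
    for key, p in dist.items():
        sub = tuple(key[i] for i in keep)
        out[sub] = out.get(sub, 0.0) + p
    return out


def mutual_information_x1_x2(k):
    """I(X1;X2): assumed here to be the mediation error of the trivial latent."""
    d = joint(k)
    h1 = entropy(marginal(d, (1,)).values())
    h2 = entropy(marginal(d, (2,)).values())
    h12 = entropy(marginal(d, (1, 2)).values())
    return h1 + h2 - h12


def cond_entropy_lam_given_x1(k):
    """H(Lam|X1): assumed here to be a redundancy error when the latent is Lam."""
    d = joint(k)
    h_lam_x1 = entropy(marginal(d, (0, 1)).values())
    h_x1 = entropy(marginal(d, (1,)).values())
    return h_lam_x1 - h_x1


if __name__ == "__main__":
    # Scan k and print both candidate error terms side by side.
    for k in (0.02, 0.05, 0.08, 0.1, 0.2, 0.5, 0.8, 0.92):
        print(k, mutual_information_x1_x2(k), cond_entropy_lam_given_x1(k))
```

This only evaluates the two extreme choices of Δ; it does not construct a Δ(X) or Δ(Λ) that interpolates between them, which is the part I am stuck on.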