To check my understanding: for random variables A,B, the stochastic error of a latent Λ is the maximum among I(A;B|Λ),I(A;Λ|B),I(B;Λ|A). The deterministic error is the maximum among I(A;B∣Λ),H(Λ|A),H(Λ|B). If so, the claim in my original comment holds—I also wrote code (manually) to verify. Here’s the fixed claim:
Let X∼Ber(p). With probability r, set Z:=X, and otherwise draw Z∼Ber(p). Let Y∼Ber(1/2). Let A=X⊕Y and B=Y⊕Z. We will investigate latents for (A,B). Let ϵ be the stochastic error of latent Λ:=Y. Now compute the deterministic errors of each of the latents X, Y, Z, A, B, A⊕B, X⊕Y⊕Z. Then for p:=0.9,r:=0.44, all of these latents have deterministic error greater than 5ϵ.
It should be easy to modify the code to consider other latents. I haven’t thought much about proving that there aren’t any other latents better than these, though.
On this particular example you can achieve deterministic error ≈2.5ϵ with latent A∧B, but it seems easy to find other examples with ratio > 5 (including over latents A∧B,A∨B) in the space of distributions over (X,Y,Z) with a random-restart hill-climb. Anyway, my takeaway is that if you think you can derandomize latents in general you should probably try to derandomize the latent Λ:=Y for variables A:=X⊕Y, B:=Z⊕Y for distributions over boolean variables X,Y,Z.
My impression is that prior discussion focused on discretizing Λ. Λ is already boolean here, so if the hypothesis is true then it’s for a different reason.
To check my understanding: for random variables A,B, the stochastic error of a latent Λ is the maximum among I(A;B|Λ),I(A;Λ|B),I(B;Λ|A). The deterministic error is the maximum among I(A;B∣Λ),H(Λ|A),H(Λ|B). If so, the claim in my original comment holds—I also wrote code (manually) to verify. Here’s the fixed claim:
Let X∼Ber(p). With probability r, set Z:=X, and otherwise draw Z∼Ber(p). Let Y∼Ber(1/2). Let A=X⊕Y and B=Y⊕Z. We will investigate latents for (A,B). Let ϵ be the stochastic error of latent Λ:=Y. Now compute the deterministic errors of each of the latents X, Y, Z, A, B, A⊕B, X⊕Y⊕Z. Then for p:=0.9,r:=0.44, all of these latents have deterministic error greater than 5ϵ.
It should be easy to modify the code to consider other latents. I haven’t thought much about proving that there aren’t any other latents better than these, though.
On this particular example you can achieve deterministic error ≈2.5ϵ with latent A∧B, but it seems easy to find other examples with ratio > 5 (including over latents A∧B,A∨B) in the space of distributions over (X,Y,Z) with a random-restart hill-climb. Anyway, my takeaway is that if you think you can derandomize latents in general you should probably try to derandomize the latent Λ:=Y for variables A:=X⊕Y, B:=Z⊕Y for distributions over boolean variables X,Y,Z.
(edited to fix typo in definition of B)
My impression is that prior discussion focused on discretizing Λ. Λ is already boolean here, so if the hypothesis is true then it’s for a different reason.