Set Λ:=Y, then note that the stochastic error is ϵ:=I(A;Y|B), because Y induces perfect conditional independence and symmetry of A and B.
I don’t think Y induces perfect conditional independence? Conditional on Y, we have:
(Probability r) A = B, else
(Probability 1 - r) A and B are independent
… which means that learning the value of A tells me something about the value of B, conditional on Y (specifically, B is more likely to have the same value as A).
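A quick numeric check of this claim (a minimal sketch of my own; it hardcodes the setup, and the p=0.9, r=0.44 parameters are borrowed from the fixed claim further down — any nondegenerate values should give the same qualitative answer):

```python
import itertools, math

# Setup from the thread: X ~ Ber(p), Y ~ Ber(1/2), Z = X with probability r,
# else an independent Ber(p) draw; A = X xor Y, B = Y xor Z.
# The check: I(A;B|Y) should be strictly positive.
p, r = 0.9, 0.44  # from the fixed claim below; any nondegenerate values work

joint = {}  # joint distribution over (a, b, y)
for x, y, z in itertools.product((0, 1), repeat=3):
    p_x = p if x else 1 - p
    p_z = r * (z == x) + (1 - r) * (p if z else 1 - p)
    key = (x ^ y, y ^ z, y)
    joint[key] = joint.get(key, 0.0) + p_x * 0.5 * p_z

def H(idx):
    """Entropy (bits) of the marginal of `joint` on the coordinates in idx."""
    marg = {}
    for k, q in joint.items():
        kk = tuple(k[i] for i in idx)
        marg[kk] = marg.get(kk, 0.0) + q
    return -sum(q * math.log2(q) for q in marg.values() if q > 0)

# I(A;B|Y) = H(A,Y) + H(B,Y) - H(A,B,Y) - H(Y)
print(H((0, 2)) + H((1, 2)) - H((0, 1, 2)) - H((2,)))  # strictly positive
```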
Am I missing something here?
(Also, for purposes of me tracking how useful LLMs are for research: assuming I’m not missing something and this was a mistake, was the mistake originally made by you or an LLM?)
Yeah, my comment went through a few different versions and that statement doesn’t apply to the final setting. I should’ve checked it better before hitting submit, sorry. I only used LLMs for writing code for numerical calculations, so the error is mine. [1]
I think that I didn’t actually use this claim in the numerical calculations, so I’d hope that the rest of the comment continues to hold. I had hoped to verify that before replying, but given that it’s been two weeks already, I don’t know when I’ll manage to get to it.
[1] I did try to see if an LLM could write a message explaining the claim, but I didn't use that output.
To check my understanding: for random variables A,B, the stochastic error of a latent Λ is the maximum among I(A;B|Λ), I(A;Λ|B), I(B;Λ|A). The deterministic error is the maximum among I(A;B|Λ), H(Λ|A), H(Λ|B). If so, the claim in my original comment holds; I also wrote code (manually) to verify. Here's the fixed claim:
Let X∼Ber(p). With probability r, set Z:=X, and otherwise draw Z∼Ber(p) independently. Let Y∼Ber(1/2), independent of (X,Z). Let A=X⊕Y and B=Y⊕Z. We will investigate latents for (A,B). Let ϵ be the stochastic error of the latent Λ:=Y. Now compute the deterministic errors of each of the latents X, Y, Z, A, B, A⊕B, X⊕Y⊕Z. Then for p:=0.9, r:=0.44, all of these latents have deterministic error greater than 5ϵ.
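For concreteness, here is a minimal sketch of the kind of numerical check involved (my own reconstruction, not the original script): build the joint distribution over (X,Y,Z), push it forward to (A,B,Λ) for each candidate latent, and compute both error measures from entropies.

```python
import itertools, math

def bernoulli(p):
    return {0: 1.0 - p, 1: p}

def joint_xyz(p, r):
    """Joint over (x, y, z): X ~ Ber(p), Y ~ Ber(1/2) independent,
    Z = X with probability r, otherwise an independent Ber(p) draw."""
    px, py, pz = bernoulli(p), bernoulli(0.5), bernoulli(p)
    dist = {}
    for x, y, z in itertools.product((0, 1), repeat=3):
        p_z = r * (z == x) + (1 - r) * pz[z]
        dist[(x, y, z)] = px[x] * py[y] * p_z
    return dist

def joint_ab_latent(dist, latent):
    """Push (x, y, z) forward to (a, b, lam) with a = x^y, b = y^z."""
    out = {}
    for (x, y, z), q in dist.items():
        key = (x ^ y, y ^ z, latent(x, y, z))
        out[key] = out.get(key, 0.0) + q
    return out

def H(dist, idx):
    """Shannon entropy (bits) of the marginal on the coordinates in idx."""
    marg = {}
    for key, q in dist.items():
        k = tuple(key[i] for i in idx)
        marg[k] = marg.get(k, 0.0) + q
    return -sum(q * math.log2(q) for q in marg.values() if q > 0)

def cmi(dist, i, j, k):
    """I(coord_i ; coord_j | coord_k), from the entropy identity."""
    return H(dist, (i, k)) + H(dist, (j, k)) - H(dist, (i, j, k)) - H(dist, (k,))

def stochastic_error(abl):
    # max of I(A;B|L), I(A;L|B), I(B;L|A) over a dist on (a, b, lam)
    return max(cmi(abl, 0, 1, 2), cmi(abl, 0, 2, 1), cmi(abl, 1, 2, 0))

def deterministic_error(abl):
    # max of I(A;B|L), H(L|A), H(L|B)
    h_l_a = H(abl, (0, 2)) - H(abl, (0,))
    h_l_b = H(abl, (1, 2)) - H(abl, (1,))
    return max(cmi(abl, 0, 1, 2), h_l_a, h_l_b)

p, r = 0.9, 0.44
dist = joint_xyz(p, r)
eps = stochastic_error(joint_ab_latent(dist, lambda x, y, z: y))
latents = {
    "X": lambda x, y, z: x,
    "Y": lambda x, y, z: y,
    "Z": lambda x, y, z: z,
    "A": lambda x, y, z: x ^ y,
    "B": lambda x, y, z: y ^ z,
    "A xor B": lambda x, y, z: (x ^ y) ^ (y ^ z),
    "X xor Y xor Z": lambda x, y, z: x ^ y ^ z,
}
print(f"stochastic error of Y: {eps:.4f}")
for name, f in latents.items():
    det = deterministic_error(joint_ab_latent(dist, f))
    print(f"{name}: deterministic error = {det:.4f} ({det / eps:.2f} eps)")
```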
It should be easy to modify the code to consider other latents. I haven’t thought much about proving that there aren’t any other latents better than these, though.
On this particular example you can achieve deterministic error ≈2.5ϵ with the latent A∧B, but it seems easy to find other examples with ratio greater than 5 (even when the candidate latents include A∧B and A∨B) via a random-restart hill-climb over the space of distributions on (X,Y,Z). Anyway, my takeaway is that if you think you can derandomize latents in general, you should probably try to derandomize the latent Λ:=Y for variables A:=X⊕Y, B:=Z⊕Y for distributions over boolean variables X,Y,Z.
(edited to fix typo in definition of B)
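For reference, a hedged sketch of the kind of random-restart hill-climb meant here (the restart count, step size, and latent menu are my own choices, not necessarily what was actually run; it reuses the helper functions from the sketch above):

```python
import math, random

KEYS = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]
LATENTS = [  # candidate deterministic latents, as functions of (x, y, z)
    lambda x, y, z: x, lambda x, y, z: y, lambda x, y, z: z,
    lambda x, y, z: x ^ y, lambda x, y, z: y ^ z, lambda x, y, z: x ^ z,
    lambda x, y, z: x ^ y ^ z,
    lambda x, y, z: (x ^ y) & (y ^ z),  # A AND B
    lambda x, y, z: (x ^ y) | (y ^ z),  # A OR B
]

def score(weights):
    """(best deterministic error over LATENTS) / (stochastic error of Y)."""
    total = sum(weights)
    dist = {k: w / total for k, w in zip(KEYS, weights)}
    eps = stochastic_error(joint_ab_latent(dist, lambda x, y, z: y))
    if eps < 1e-9:  # guard against division by ~0
        return 0.0
    return min(deterministic_error(joint_ab_latent(dist, f))
               for f in LATENTS) / eps

best = 0.0
for _ in range(10):  # random restarts
    w = [random.random() for _ in KEYS]
    s = score(w)
    for _ in range(1000):  # local steps: multiplicative noise on the weights
        w2 = [wi * math.exp(random.gauss(0, 0.1)) for wi in w]
        s2 = score(w2)
        if s2 > s:
            w, s = w2, s2
    best = max(best, s)
print("best ratio found:", best)
```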
My impression is that prior discussion focused on discretizing Λ. Λ is already boolean here, so if the hypothesis is true then it’s for a different reason.