Epistemic status: Quick dump of something that might be useful to someone. o3 and Opus 4 independently agree on the numerical calculations for the bolded result below, but I didn’t check the calculations myself in any detail.
When we say “roughly”, e.g. 2ϵ or 3ϵ would be fine; it may be a judgement call on our part if the bound is much larger than that.
Let X∼Ber(p). With probability r, set Z:=X, and otherwise draw Z∼Ber(p). Let Y∼Ber(1/2). Let A=X⊕Y and B=Y⊕Z. We will investigate latents for (A,B).
Set Λ:=Y, then note that the stochastic error is ϵ:=I(A;Y|B), because Y induces perfect conditional independence and symmetry of A and B. Now compute the deterministic errors of Λ:=Y, Λ:=0, Λ:=A, which are equal to H(Y|A), I(A;B), H(A|B) respectively.
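(For Λ:=0 the maximum defining the deterministic error reduces to I(A;B), since H(Λ|A)=H(Λ|B)=0 and I(A;B|Λ)=I(A;B); for Λ:=A it reduces to H(A|B), since H(Λ|A)=0 and I(A;B|Λ)=0.)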
Then it turns out that with p:=0.9, r:=0.44, all of these latents have deterministic error greater than 5ϵ, if you believe this Claude Opus 4 artifact (full chat here, corroboration by o3 here). Conditional on there not being some other kind of latent that gets better deterministic error, and the calculations being correct, I would expect that a bit more fiddling around could produce much better bounds, say 10ϵ or more, since I think I’ve explored very little of the search space.
E.g., one could create more As and Bs by adding either more Ys, or more Xs and Zs. Or one could pick the probabilities p, r out of some discrete set of possibilities instead of having them be fixed.
Set Λ:=Y, then note that the stochastic error is ϵ:=I(A;Y|B), because Y induces perfect conditional independence and symmetry of A and B.
I don’t think Y induces perfect conditional independence? Conditional on Y, we have:
(Probability r) A = B, else
(Probability 1 - r) A and B are independent
… which means that learning the value of A tells me something about the value of B, conditional on Y (specifically, B is more likely to have the same value A had).
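(Quantitatively: conditional on Y, A determines X and B determines Z, so I(A;B|Y) = I(X;Z), which is strictly positive whenever r > 0 and 0 < p < 1.)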
Am I missing something here?
(Also, for purposes of me tracking how useful LLMs are for research: assuming I’m not missing something and this was a mistake, was the mistake originally made by you or an LLM?)
Yeah, my comment went through a few different versions and that statement doesn’t apply to the final setting. I should’ve checked it better before hitting submit, sorry. I only used LLMs for writing code for numerical calculations, so the error is mine. [1]
I think that I didn’t actually use this claim in the numerical calculations, so I’d hope that the rest of the comment continues to hold. I had hoped to verify that before replying, but given that it’s been two weeks already, I don’t know when I’ll manage to get to it.
[1] I did try having an LLM write a message explaining the claim, but didn’t end up using it.
To check my understanding: for random variables A,B, the stochastic error of a latent Λ is the maximum among I(A;B|Λ), I(A;Λ|B), I(B;Λ|A). The deterministic error is the maximum among I(A;B|Λ), H(Λ|A), H(Λ|B). If so, the claim in my original comment holds; I also wrote code (manually) to verify. Here’s the fixed claim:
Let X∼Ber(p). With probability r, set Z:=X, and otherwise draw Z∼Ber(p). Let Y∼Ber(1/2). Let A=X⊕Y and B=Y⊕Z. We will investigate latents for (A,B). Let ϵ be the stochastic error of latent Λ:=Y. Now compute the deterministic errors of each of the latents X, Y, Z, A, B, A⊕B, X⊕Y⊕Z. Then for p:=0.9,r:=0.44, all of these latents have deterministic error greater than 5ϵ.
It should be easy to modify the code to consider other latents. I haven’t thought much about proving that there aren’t any other latents better than these, though.
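For anyone who wants to reproduce this kind of check, here is a minimal Python sketch (a reconstruction from the definitions above, not the author’s actual code): it enumerates the joint distribution over (X,Y,Z), pushes it forward to (A,B,Λ) for each candidate latent, and takes the maxima defining the stochastic and deterministic errors, for p=0.9, r=0.44 and the seven latents listed.

```python
from itertools import product
from math import log2

def joint_xyz(p, r):
    """Joint pmf over (x, y, z): X ~ Ber(p); with probability r set Z := X,
    otherwise draw Z ~ Ber(p) independently; Y ~ Ber(1/2) independent of both."""
    ber = lambda q, v: q if v == 1 else 1.0 - q
    pmf = {}
    for x, y, z in product((0, 1), repeat=3):
        pz = r * (1.0 if z == x else 0.0) + (1.0 - r) * ber(p, z)
        pmf[(x, y, z)] = ber(p, x) * 0.5 * pz
    return pmf

def joint_abl(pmf_xyz, latent):
    """Push (X, Y, Z) forward to (A, B, Lambda), where A = X xor Y, B = Y xor Z."""
    out = {}
    for (x, y, z), pr in pmf_xyz.items():
        key = (x ^ y, y ^ z, latent(x, y, z))
        out[key] = out.get(key, 0.0) + pr
    return out

def marginal(pmf, idx):
    """Marginalize a pmf keyed by tuples onto the coordinates in idx."""
    out = {}
    for key, pr in pmf.items():
        k = tuple(key[i] for i in idx)
        out[k] = out.get(k, 0.0) + pr
    return out

def entropy(pmf):
    return -sum(pr * log2(pr) for pr in pmf.values() if pr > 0)

def cond_entropy(pmf, left, given):   # H(left | given)
    return entropy(marginal(pmf, left + given)) - entropy(marginal(pmf, given))

def cond_mi(pmf, u, v, given):        # I(u ; v | given)
    return cond_entropy(pmf, u, given) - cond_entropy(pmf, u, v + given)

A, B, L = [0], [1], [2]               # coordinate positions of A, B, Lambda in the keys

def stochastic_error(pmf_abl):
    return max(cond_mi(pmf_abl, A, B, L),
               cond_mi(pmf_abl, A, L, B),
               cond_mi(pmf_abl, B, L, A))

def deterministic_error(pmf_abl):
    return max(cond_mi(pmf_abl, A, B, L),
               cond_entropy(pmf_abl, L, A),
               cond_entropy(pmf_abl, L, B))

if __name__ == "__main__":
    p, r = 0.9, 0.44
    pmf = joint_xyz(p, r)
    eps = stochastic_error(joint_abl(pmf, lambda x, y, z: y))
    print(f"eps (stochastic error of Y) = {eps:.4f}")
    latents = {
        "X": lambda x, y, z: x,
        "Y": lambda x, y, z: y,
        "Z": lambda x, y, z: z,
        "A": lambda x, y, z: x ^ y,
        "B": lambda x, y, z: y ^ z,
        "A xor B": lambda x, y, z: x ^ z,        # A xor B = X xor Z
        "X xor Y xor Z": lambda x, y, z: x ^ y ^ z,
    }
    for name, f in latents.items():
        d = deterministic_error(joint_abl(pmf, f))
        print(f"{name}: deterministic error = {d:.4f}, ratio to eps = {d / eps:.2f}")
```

If the claim above is right, each printed ratio should exceed 5; swapping in other deterministic functions of (x,y,z) as latents only requires adding entries to the dictionary.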
In this particular example you can achieve deterministic error ≈2.5ϵ with the latent A∧B, but it seems easy to find other examples with ratio > 5 (including over the latents A∧B, A∨B) in the space of distributions over (X,Y,Z) with a random-restart hill-climb. Anyway, my takeaway is that if you think you can derandomize latents in general, you should probably try to derandomize the latent Λ:=Y for variables A:=X⊕Y, B:=Z⊕Y, for distributions over boolean variables X,Y,Z.
(edited to fix typo in definition of B)
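A sketch of what such a random-restart hill-climb might look like, reusing joint_abl, stochastic_error, and deterministic_error from the sketch above (the candidate latent set, step size, and iteration counts are arbitrary illustrative choices, not the commenter’s actual settings):

```python
import random
from itertools import product

# Candidate deterministic latents to beat: the seven from the fixed claim above,
# plus A AND B and A OR B.
CANDIDATES = {
    "X": lambda x, y, z: x,
    "Y": lambda x, y, z: y,
    "Z": lambda x, y, z: z,
    "A": lambda x, y, z: x ^ y,
    "B": lambda x, y, z: y ^ z,
    "A xor B": lambda x, y, z: x ^ z,
    "X xor Y xor Z": lambda x, y, z: x ^ y ^ z,
    "A and B": lambda x, y, z: (x ^ y) & (y ^ z),
    "A or B": lambda x, y, z: (x ^ y) | (y ^ z),
}

def ratio(pmf_xyz):
    """Min deterministic error over CANDIDATES divided by the stochastic error of Y."""
    eps = stochastic_error(joint_abl(pmf_xyz, lambda x, y, z: y))
    if eps < 1e-12:
        return 0.0
    best = min(deterministic_error(joint_abl(pmf_xyz, f)) for f in CANDIDATES.values())
    return best / eps

def hill_climb(restarts=20, steps=3000, scale=0.1):
    keys = list(product((0, 1), repeat=3))
    best_ratio, best_pmf = 0.0, None
    for _ in range(restarts):
        w = [random.random() for _ in keys]
        cur = dict(zip(keys, [v / sum(w) for v in w]))
        cur_ratio = ratio(cur)
        for _ in range(steps):
            # Multiplicative perturbation of the current pmf, then renormalize.
            w = [max(cur[k] * (1.0 + scale * random.gauss(0, 1)), 1e-9) for k in keys]
            cand = dict(zip(keys, [v / sum(w) for v in w]))
            cand_ratio = ratio(cand)
            if cand_ratio > cur_ratio:
                cur, cur_ratio = cand, cand_ratio
        if cur_ratio > best_ratio:
            best_ratio, best_pmf = cur_ratio, cur
    return best_ratio, best_pmf
```

Each restart draws a random pmf over the eight outcomes of (X,Y,Z), repeatedly perturbs it, and keeps a perturbation only if the ratio improves; the best distribution over all restarts is returned.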
My impression is that prior discussion focused on discretizing Λ. Λ is already boolean here, so if the hypothesis is true then it’s for a different reason.