Would this still give us guarantees on the conditional distribution P(X|Λ)?
E.g. mediation: the error
D_KL(P(X_1,X_2,Λ) ∥ P(X_1|Λ)P(X_2|Λ)P(Λ)) = D_KL(P(X_1,X_2|Λ)P(Λ) ∥ P(X_1|Λ)P(X_2|Λ)P(Λ)) = E_Λ[D_KL(P(X_1,X_2|Λ) ∥ P(X_1|Λ)P(X_2|Λ))]
is really about the expected error conditional on individual values of Λ, and it seems like there are distributions with high mediation error but low error when the latent is marginalized inside the D_KL. That gap could be load-bearing when the agents make predictions on observables after updating on Λ.
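A minimal sketch of such a distribution, assuming "marginalized inside D_KL" means comparing P(X_1,X_2) against Σ_Λ P(X_1|Λ)P(X_2|Λ)P(Λ) (that reading is an assumption, as are the toy distribution and the helper name `kl_bits`). Λ is a fair coin; when Λ=0 the two bits are equal, and when Λ=1 they are opposite. Conditional on either value of Λ the bits are perfectly dependent, but after summing over Λ both sides become uniform.

```python
import numpy as np

def kl_bits(p, q):
    """D_KL(p || q) in bits for two arrays of the same shape."""
    p = np.asarray(p, dtype=float).ravel()
    q = np.asarray(q, dtype=float).ravel()
    mask = p > 0  # terms with p = 0 contribute nothing
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

# Toy joint P(X1, X2, Lambda), indexed as joint[x1, x2, lam].
joint = np.zeros((2, 2, 2))
joint[0, 0, 0] = joint[1, 1, 0] = 0.25  # Lambda = 0: X1 == X2
joint[0, 1, 1] = joint[1, 0, 1] = 0.25  # Lambda = 1: X1 != X2

p_lam = joint.sum(axis=(0, 1))              # P(Lambda)
p_x1_given_lam = joint.sum(axis=1) / p_lam  # P(X1 | Lambda), indexed [x1, lam]
p_x2_given_lam = joint.sum(axis=0) / p_lam  # P(X2 | Lambda), indexed [x2, lam]

# P(X1|Lambda) P(X2|Lambda) P(Lambda), indexed [x1, x2, lam].
factored = (
    p_x1_given_lam[:, None, :]
    * p_x2_given_lam[None, :, :]
    * p_lam[None, None, :]
)

# Mediation error with Lambda kept inside both arguments.
mediation_error = kl_bits(joint, factored)

# Same comparison after summing Lambda out of both arguments.
marginal_error = kl_bits(joint.sum(axis=2), factored.sum(axis=2))

print(f"mediation error:    {mediation_error:.3f} bits")
print(f"marginalized error: {marginal_error:.3f} bits")
```

Here the mediation error is 1 bit while the marginalized error is 0, so at least in this toy case a bound on the marginalized quantity says nothing about P(X|Λ).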