This is fantastic. I was a bit annoyed by the pareto optimality section and felt that surely there must be a way to skip that part of the proof. I tried a number of simple transformations on Λ′ that I intuitively thought would make the X1|λi's equal for the appropriate i's. None worked. Lesson learned (again): test out ideas first before trying to prove them correct.
How did you work out you could use pareto-optimality? I’m guessing you got it from looking at properties of optimized and unoptimized empirical latents?
stochastic natural latents are relatively easy to test for in datasets
Why is it that stochastic natural latents are easier to test for than deterministic ones? Is it that you can just use the variables themselves as the latent and quickly compute the mediation error?
How did you work out you could use pareto-optimality? I’m guessing you got it from looking at properties of optimized and unoptimized empirical latents?
No, actually. The magic property doesn't always hold for pareto optimal latents; the resampling step is load-bearing. So when we numerically experimented with optimizing the latents, we often got latents which didn't have the structure leveraged in the proof (we did sometimes get the right structure, but we didn't notice that until we knew to look for it).
We figured it out by algebraically playing around with the first-order conditions for pareto optimality, generally trying to simplify them, and noticed that if we assumed zero error on one resampling condition (which at the time we incorrectly thought we had already proven was a free move), then it simplified down a bunch and gave the nice form.
Is it that you can just use the variables themselves as the latent and quickly compute the mediation error?
Yup.
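For concreteness, here is a minimal sketch of that kind of quick check, assuming discrete data and reading "mediation error" as the empirical conditional mutual information I(Xi; Xj | Λ). The function name and the three-variable toy setup below are purely illustrative, not anything from the post.

```python
import numpy as np
from collections import Counter

def mediation_error(xi, xj, lam):
    """Empirical conditional mutual information I(Xi; Xj | Lam), in nats,
    for discrete samples given as equal-length 1-D sequences."""
    n = len(lam)
    p_ijl = Counter(zip(xi, xj, lam))
    p_il = Counter(zip(xi, lam))
    p_jl = Counter(zip(xj, lam))
    p_l = Counter(lam)
    cmi = 0.0
    for (a, b, l), c in p_ijl.items():
        # I(Xi;Xj|Lam) = sum_{a,b,l} p(a,b,l) * log[ p(a,b,l) p(l) / (p(a,l) p(b,l)) ]
        cmi += (c / n) * np.log((c * p_l[l]) / (p_il[(a, l)] * p_jl[(b, l)]))
    return cmi

# Hypothetical toy setup: three noisy observations of a shared discrete latent.
rng = np.random.default_rng(0)
lam = rng.integers(0, 3, size=50_000)
x1, x2, x3 = ((lam + rng.integers(0, 2, size=lam.size)) % 3 for _ in range(3))
print(mediation_error(x2, x3, lam))  # ~0: the true latent mediates
print(mediation_error(x2, x3, x1))   # nonzero: one observed variable as candidate latent
```

The last line is the "use the variables themselves as the latent" move from the question above: plug in one observed column as the candidate latent and see how small the mediation error among the rest already is.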