As a special case of this, we can also handle noise using repeated experiments. If I roll a die, I can’t predict the outcome perfectly, so I can’t rule out influences from all the billions of variables in the universe. But if I roll a die a few thousand times, then I can approximately-perfectly predict the distribution of die-rolls (including the mean, variance, etc.). So, even though I don’t know what influences any one particular die roll, I do know that nothing else in the universe is relevant to the overall distribution of repeated rolls (at least to within some small error margin).
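(For concreteness, here’s a minimal Python simulation of that claim; it’s just the law of large numbers in action, nothing from the original argument.)

```python
import numpy as np

rng = np.random.default_rng(0)

# One roll is unpredictable, but the empirical distribution of a few
# thousand rolls is almost exactly uniform, whatever else is going on
# in the universe.
rolls = rng.integers(1, 7, size=5000)

for face in range(1, 7):
    freq = np.mean(rolls == face)
    print(f"P(face={face}) ≈ {freq:.3f}  (exact: 1/6 ≈ 0.167)")

print(f"mean ≈ {rolls.mean():.3f}  (exact: 3.5)")
print(f"var  ≈ {rolls.var():.3f}   (exact: 35/12 ≈ 2.917)")
```

Any one roll stays unpredictable, but the frequencies, mean, and variance all land within a small error margin of their exact values.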
I’m not sure I fully understand this, so I wanna try to sketch out an example to see.
Suppose you’ve got a family of unknown variables X0, X1, X2, …, which each influence the observable variables Y0, Y1, Y2, …. Given observations of some of the Yis, you can learn a summary statistic Yg that you can use to predict the other Yis.
I think the counterintuitive thing about this view, then, is that Yi is not independent of Xi given Yg. So what have we really learned? It doesn’t immediately tell us anything about the Xi/Yi relationship. So where’s the science?
I think my answer to this question is that while Yg doesn’t tell us anything about the Xis, it does tell us things about the Yis. (And Yg would essentially be a measure of the common causes underlying the Yis, I suppose.) That’s useful if you care about things downstream of the Yis. But I don’t really see what determinism buys you here.
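To make that concrete, here’s a toy sketch (entirely my own construction: the common-cause model, the Gaussians, and the choice of Yg as a sample mean are all assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy model of the setup above: a single common cause C drives every
# Y_i, and each Y_i also has its own unknown influence X_i.
n = 1000
C = rng.normal(5.0, 1.0)            # common cause underlying the Y_i
X = rng.normal(0.0, 1.0, size=n)    # unknown idiosyncratic influences
Y = C + X                           # observables

# Learn a summary statistic Y_g from half of the Y_i...
Y_g = Y[: n // 2].mean()            # converges to C

# ...and use it to predict the held-out Y_i.
resid = Y[n // 2 :] - Y_g

print(f"Y_g = {Y_g:.3f} vs C = {C:.3f}")
print(f"RMSE predicting held-out Y_i with Y_g: {np.sqrt(np.mean(resid**2)):.3f}")

# Y_g measures the common cause, but says nothing about any individual
# X_i: the residual Y_i - Y_g is essentially X_i itself, i.e. Y_i is
# *not* independent of X_i given Y_g.
print(f"corr(residual, X_i) = {np.corrcoef(resid, X[n // 2 :])[0, 1]:.3f}")
```

So in this toy version, Yg nails the common cause and predicts held-out Yis up to irreducible noise, while the residuals just hand you back the Xis, which is exactly the “Yi not independent of Xi given Yg” point.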
My model of you says that you’d mention something about the KPD (Koopman-Pitman-Darmois) theorems. But I don’t know what.
Or should this be understood more in a nested sense?
That is, if you’ve got Y00, Y01, Y10, Y11, …, then you can form Yg0, Yg1, …, and if Ygi is deterministically predictable from Xi, you know you’re onto something?
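If I simulate that nested reading (another toy sketch of my own; the deterministic f and the noise model are made-up assumptions), it does seem to behave as described:

```python
import numpy as np

rng = np.random.default_rng(2)

# Nested toy model: group i has its own X_i, and each repeat within
# the group is Y_ij = f(X_i) + noise_ij.
def f(x):
    return 2.0 * x + 1.0  # an assumed deterministic X -> Y relationship

n_groups, n_repeats = 20, 10_000
X = rng.normal(size=n_groups)                   # one X_i per group
noise = rng.normal(size=(n_groups, n_repeats))  # per-observation noise
Y = f(X)[:, None] + noise                       # the Y_ij

# Group-level summary statistics Yg_i: here, the group means.
Y_g = Y.mean(axis=1)

# With enough repeats the noise averages out, and Yg_i becomes an
# (approximately) deterministic function of X_i alone; the deviation
# shrinks like 1/sqrt(n_repeats).
print(f"max |Yg_i - f(X_i)| = {np.max(np.abs(Y_g - f(X))):.4f}")
```

So in the nested picture, the group-level statistics Ygi really do become deterministic functions of the Xis as the number of repeats grows.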