Hmm. I may be currently looking at it from the wrong angle, but I’m skeptical that it’s the right frame for defining abstractions. It seems to group low-level variables based on raw distance, rather than the detailed environment structure? Which seems like a very weak constraint. That is,

By further iteration, we can conclude that any number of sets of variables which are all separated by a distance of 2T are independent given X0. That’s the full Lightcone Theorem.
We can make literally any choice of those sets subject to this condition: we can draw the boundaries any way we want. Which means the abstractions we’d recover are not going to be convergent: there’s a free parameter of the boundary choice.
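Written out, the condition in question (my own formalization of the quoted theorem statement, assuming, as elsewhere in this discussion, that the sets are regions of XT and the resampler runs for T steps):

```latex
% Sets of variables R_1, ..., R_m, pairwise separated by distance at least 2T,
% are independent given the initial state X_0:
P\!\left[ X_T^{R_1}, \dots, X_T^{R_m} \,\middle|\, X_0 \right]
  \;=\; \prod_{i=1}^{m} P\!\left[ X_T^{R_i} \,\middle|\, X_0 \right]
```

Note the condition only constrains pairwise distances between the sets, which is exactly the worry: any partition whose parts are 2T apart satisfies it.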
Ah, no, I suppose that part is supposed to be handled by whatever approximation process we define for Λ? That is, the “correct” definition of the “most minimal approximate summary” would implicitly constrain the possible choices of boundaries for which Λ is equivalent to X0?
The eigendecomposition/mesoscale-approximation/gKPD approaches seem like they might move in that direction, though I admit I don’t see their implications at a first glance.
If we ignore the sketchy part (i.e. pretend that regions X0^{R1}, ..., X0^{Rm} cover all of X0 and are all independent given X), then gKPD would say roughly: if Λ can be represented as n/2-dimensional or smaller…

What’s the n/2 here? Is it meant to be m/2?
Almost. The hope/expectation is that different choices yield approximately the same Λ, though still probably modulo some conditions (e.g. sufficiently large T).
System size, i.e. number of variables.
By the way, do we need the proof of the theorem to be quite this involved? It seems we can just note that for any two (sets of) variables X1, X2 separated by distance D, the earliest sampling-step at which their values can intermingle (= their lightcones intersect) is D/2 (since even in the “fastest” case, they can’t do better than moving towards each other at 1 variable per sampling-step).
Yeah, that probably works.
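The speed-limit argument above can be sanity-checked numerically. A minimal sketch (my own illustration, not code from this discussion; the helper name `first_intersection_step` is made up): in a 1D system where one resampling step lets each variable depend only on itself and its immediate neighbors, track which time-0 sites could have influenced each site, and find the first step at which the lightcones of two sites at distance D overlap.

```python
# Sketch: influence spreads at most 1 site per resampling step in a
# nearest-neighbor 1D system, so two sites a distance D apart first
# "intermingle" at step ceil(D/2).
from math import ceil

def first_intersection_step(D: int, n_sites: int = 50) -> int:
    """Earliest step at which the lightcones of sites 0 and D intersect."""
    # deps[i] = set of time-0 sites that could have influenced site i so far
    deps = [{i} for i in range(n_sites)]
    step = 0
    while not (deps[0] & deps[D]):
        # One resampling step: each site's new value may depend on itself
        # and its immediate neighbors (clipped at the boundaries).
        deps = [deps[max(i - 1, 0)] | deps[i] | deps[min(i + 1, n_sites - 1)]
                for i in range(n_sites)]
        step += 1
    return step

for D in [1, 2, 5, 8]:
    assert first_intersection_step(D) == ceil(D / 2)
```

For D = 1, 2, 5, 8 this returns 1, 1, 3, 4, matching ⌈D/2⌉: the two lightcones each grow by one site per step toward each other.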
Can you elaborate on this expectation? Intuitively, Λ should consist of a number of higher-level variables as well, and each of them should correspond to a specific set of lower-level variables: abstractions and the elements they abstract over. So for a given Λ, there should be a specific “correct” way to draw the boundaries in the low-level system.
But if ~any way of drawing the boundaries yields the same Λ, then what does this mean?
Or perhaps the “boundaries” in the mesoscale-approximation approach represent something other than the factorization of X into individual abstractions?
Λ is conceptually just the whole bag of abstractions (at a certain scale), unfactored.
Sure, but isn’t the goal of the whole agenda to show that Λ does have a certain correct factorization, i.e. that abstractions are convergent?
I suppose it may be that any choice of low-level boundaries results in the same Λ, but the Λ itself has a canonical factorization, and going from Λ back to XT reveals the corresponding canonical factorization of XT? And then depending on how close the initial choice of boundaries was to the “correct” one, Λ is easier or harder to compute (or there’s something else about the right choice that makes it nice to use).
Yes, there is a story for a canonical factorization of Λ, it’s just separate from the story in this post.