Cool. I’ve had the same idea, that we want something like “synergistic information present in each random subset of the system’s constituents”, and yeah, it doesn’t work out-of-the-box.
Some other issues there:
If we’re actually sampling random individual atoms from all around the dog’s body, it seems to me that we’d need an incredibly large number of them to decode anything useful: far more than if we were sampling random small connected chunks of atoms.
More intuitive example: Suppose we want to infer a book’s topic. What’s the smallest N such that we can likely infer the topic from a random connected substring of length N? Comparatively, what’s the smallest M such that we can infer it from M letters randomly and independently sampled from the book’s text? It seems to me that N≪M.
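To make that concrete, here’s a toy sketch (the “books” and both classifiers below are invented for illustration, not a real measurement of synergistic information): a short connected substring usually contains a whole topic word and pins the topic down, while the same number of independently sampled letters only carries weak letter-frequency evidence.

```python
import random

# Two made-up "books" on different topics, each a repeated sentence.
BOOKS = {
    "physics": "the quark field couples to the gauge boson in the lagrangian " * 50,
    "cooking": "simmer the garlic and onion in butter then add the broth " * 50,
}

def classify_by_letters(sample: str) -> str:
    """Pick the book whose letter frequencies best explain the sample."""
    def score(text: str) -> float:
        return sum(text.count(c) / len(text) for c in sample)
    return max(BOOKS, key=lambda k: score(BOOKS[k]))

def classify_connected(sample: str) -> str:
    """Pick the book containing the sample verbatim; fall back to letter frequencies."""
    matches = [k for k, text in BOOKS.items() if sample in text]
    return matches[0] if len(matches) == 1 else classify_by_letters(sample)

def connected_sample(text: str, n: int, rng: random.Random) -> str:
    i = rng.randrange(len(text) - n)
    return text[i:i + n]

def scattered_sample(text: str, m: int, rng: random.Random) -> str:
    return "".join(rng.choice(text) for _ in range(m))

def accuracy(classifier, sampler, size: int, trials: int = 200) -> float:
    rng = random.Random(0)
    hits = 0
    for _ in range(trials):
        topic = rng.choice(sorted(BOOKS))
        hits += classifier(sampler(BOOKS[topic], size, rng)) == topic
    return hits / trials

# A 10-character connected window is essentially always decisive here;
# 10 independently sampled letters are much weaker evidence.
print(accuracy(classify_connected, connected_sample, 10))
print(accuracy(classify_by_letters, scattered_sample, 10))
```

The sizes at which the gap appears obviously depend on the texts; the point is just that the connected sample exploits structure (whole words) that independent letters can’t.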
But introducing “chunks of nearby variables” requires figuring out what “nearby” is, i.e., defining some topology for the low-level representation. How does that work?
Further, the size of the chunk needed depends a lot on which part of the system we sample, so just going “a flat % of all constituents” doesn’t work. Consider happening to land on a DNA string vs. some random part of the interior of the dog’s stomach.
Actually, dogs are kind of a bad example: animals do have DNA signatures spread all around them. A complex robot, then. If we have a diverse variety of robots, inferring the specific type is easy if we sample e.g. part of the hardware implementing its long-term memory, but not if we sample a random part of an appendage.
Or a random passage from the book vs. the titles of the book’s chapters. Or even just “a sample of a particularly info-dense paragraph” vs. “a sample from an unrelated anecdote from the author’s life”. A flat % of the total letter count just doesn’t seem like the right notion of “smallness”.
On the flip side, sometimes it’s reversed: sometimes we do want to sample random unconnected atoms. E.g., the nanomachine example: if we happen to sample the “chunk” corresponding to appendage #12, we risk learning nothing about the high-level state, whereas if we sample three random atoms from different parts of the machine, that might determine the high-level state uniquely. So now the desired topology of the samples is different: we want non-connected chunks.
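A minimal toy version of that (everything here is invented for illustration): a “nanomachine” with three modules, where the high-level state is the XOR of per-module bits. Each module is a connected chunk of “atoms” that all redundantly encode that module’s bit, so observing a whole connected chunk yields zero bits about the high-level state, while one atom from each module determines it exactly.

```python
import itertools
import random

def make_machine(rng):
    # Module i holds bit b_i; the high-level state is S = b_0 ^ b_1 ^ b_2.
    bits = [rng.randrange(2) for _ in range(3)]
    # Each module is a connected chunk of 100 atoms, all redundantly equal to b_i.
    atoms = [[b] * 100 for b in bits]
    S = bits[0] ^ bits[1] ^ bits[2]
    return atoms, S

def states_consistent_with(observations):
    """All high-level states consistent with {(module, atom_index): value}."""
    possible = set()
    for bits in itertools.product([0, 1], repeat=3):
        if all(bits[m] == v for (m, _), v in observations.items()):
            possible.add(bits[0] ^ bits[1] ^ bits[2])
    return possible

rng = random.Random(0)
atoms, S = make_machine(rng)

# Observing an entire connected chunk (all of module 0) leaves S fully ambiguous:
chunk = {(0, i): atoms[0][i] for i in range(100)}
print(states_consistent_with(chunk))       # both states remain possible

# Observing one atom from each module pins S down uniquely:
scattered = {}
for m in range(3):
    i = rng.randrange(100)
    scattered[(m, i)] = atoms[m][i]
print(states_consistent_with(scattered))   # only the true S remains
```

This is just the standard XOR-synergy setup dressed up in the nanomachine story; the real systems under discussion obviously won’t be this clean.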
I’m currently thinking this is solved by abstraction hierarchies. Like, maybe the basic definition of an abstraction is of the “redundant synergistic variable” type, and the lowest-level abstractions are defined over the lowest-level elements (molecules over atoms). But then higher-level abstractions are redundant-synergistic over lower-level abstractions (rather than actual lowest-level elements), and up it goes. The definitions of the lower-level abstractions provide the topology + sizing + symmetries, which higher-level abstractions then hook up to. (Note that this forces us to actually step through the levels, either bottom-up or top-down.)
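Here’s a deliberately crude sketch of that stepping-through-levels idea (all the definitions below are made up: “molecules” are order-invariant chunks of cells, “tissues” are majority votes over chunks of molecules). The point is just that each level’s chunking and symmetries are supplied by the level below’s definition, so decoding proceeds level by level rather than jumping from raw cells to the top.

```python
from collections import Counter

def chunks(seq, size):
    # The chunking at each level stands in for the "topology" that
    # the lower level's definitions are supposed to supply.
    return [seq[i:i + size] for i in range(0, len(seq), size)]

def molecules(cells, chunk_size=3):
    # A "molecule" ignores the order of its constituent cells:
    # order-invariance plays the role of a baked-in symmetry.
    return [tuple(sorted(c)) for c in chunks(cells, chunk_size)]

def tissue(mols, chunk_size=4):
    # A "tissue" is the most common molecule in its chunk of molecules;
    # note it is defined over molecules, never over raw cells.
    return [Counter(c).most_common(1)[0][0] for c in chunks(mols, chunk_size)]

# Two regions with different low-level composition.
cells = list("abcacbbacabcxyzzyxyxzxzy")
level1 = molecules(cells)   # eight molecules
level2 = tissue(level1)     # two tissue types
print(level1)
print(level2)
```

This doesn’t capture the “redundant synergistic variable” part at all; it only illustrates how the hierarchy forces the bottom-up pass.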
As examples:
The states of the nanomachine’s modules are inferable from any subset of the modules’ constituent atoms, and the state of the nanomachine itself is inferable from the states of any subset of the modules. But there are no such neat relationships between the atoms and the high-level state.
“A carbon atom” is synergistic information about a chunk of voxels (baking in how that chunk could vary, e.g. rotations, spatial translations); “a DNA molecule” is synergistic information about a bunch of atoms (likewise defining custom symmetries under which atom-compositions still count as a DNA molecule); “skin tissue” is synergistic over molecules; and somewhere up there we have “a dog” synergistic over custom-defined animal-parts.
Or something vaguely like that; this doesn’t exactly work either. I’ll have more to say about this once I finish distilling my notes for external consumption instead of expanding them, which is going to happen any… day… now...
I’m currently thinking this is solved by abstraction hierarchies. (...) Or something vaguely like that; this doesn’t exactly work either. I’ll have more to say about this once I finish distilling my notes for external consumption instead of expanding them, which is going to happen any… day… now...
Could you say more about why this attempt to solve the problem (by a hierarchy of abstractions) doesn’t work? Even if your thoughts are very unfinished.