I have a couple of questions/points, which might stem from my not understanding the math.
1) The very first example shows that absolutely arbitrary things (e.g. arbitrary green lines) can be “natural latents”. Does this mean that “natural latents” don’t capture the intuitive idea of “natural abstractions”? I.e., that all natural abstractions are natural latents, but not all natural latents are natural abstractions? You seem to be confirming this interpretation, but I just want to be sure:
So we view natural latents as a foundational tool. The plan is to construct more expressive structures out of them, rich enough to capture the type signatures of the kinds of concepts humans use day-to-day, and then use the guarantees provided by natural latents to make similar isomorphism claims about those more-complicated structures. That would give a potential foundation for crossing the gap between the internal ontologies of two quite different minds.
Is there any writing about what those “more expressive structures” could be?
2) Natural latents can describe both things which propagate through very universal, local physical laws (e.g. heat) and any commonalities in made-up categories (e.g. “cakes”). Natural latents seem very interesting in the former case, but I’m not sure about the latter, and I’m not sure the analogy between the two gives any insight. I’m still not seeing any substantial similarity between cakes and heat or Ising models. I.e. I see that an analogy can be made, but I don’t feel that this analogy is “grounded” in important properties of reality (locality, universality, low-levelness, stability, etc.). Does this make sense?
3) I don’t understand what “those properties can in-principle be well estimated by intensive study of just one or a few mai tais” (from here) means. To me a natural latent is something like ~“all words present in all of 100 books”; it’s impossible to know unless you read every single book.
If I haven’t missed anything major, I’d say core insights about abstractions are still missing.
EDIT 17/07: I did miss at least one major thing: I hadn’t understood the independence condition. If you take all the words present in all 100 books, there’s no guarantee that conditioning on those words makes the books or their properties independent.
The very first example shows that absolutely arbitrary things (e.g. arbitrary green lines) can be “natural latents”. Does this mean that “natural latents” don’t capture the intuitive idea of “natural abstractions”?
I think what’s arbitrary here isn’t the latent, but the objects we’re abstracting over. They’re unrelated to anything else, and thus useless to reason about.
Imagine, instead, that Alice’s green lines were copied not just by Bob, but by a whole lot of artists, founding an art movement whose members drew paintings containing this specific set of green lines. Now imagine that Donald, Ethan, and Frank each drew paintings with different green lines, and that each of those also sparked an art movement. Then suppose all these art movements ended up associated with different demographics: Alice’s lines are preferred by business owners, Donald’s lines are preferred by programmers, etc. Suddenly, the exact shape of the green lines you see in a painting in someone’s house becomes very important information about the owner, and it makes practical sense to carry the corresponding natural latent around in your head.
To me a natural latent is something like ~“all words present in all of 100 books”; it’s impossible to know unless you read every single book.
You can’t know what information may be missing from the other books, so generating a latent from a subset may result in “overshooting” it: you end up defining a variable that is larger than the actual minimal latent. That indeed means you haven’t learned the definition of that latent. However, looking at any one book still lets you know all information that is shared between all other books, so you still learn all information present in that latent. (The same way a granite cube contains all possible sculptures it could be chiseled into, I guess.)
(Edit: Wait, I made a mistake in the paragraph above, I confused natural latents with redundant information. You can’t know what information is redundant across all 100 books without looking at every book, because the one-hundredth book may be missing a datum present in every other. But if the set of books has a natural latent, then you can infer its definition from looking at any two books, because if a natural latent exists, then the information shared by any two books is exactly the information shared by all books. Consider the opposite: if any two books share information not shared with the other books, then a variable which contains only the information redundantly present in all books can’t make all books independent of each other.)
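To make the redundancy-vs.-naturality point concrete, here is a minimal numeric sketch, using my own toy model rather than anything from the post: three “books” X1, X2, X3 each pair one shared fair bit Z with an independent noise bit, so Z is exactly the information redundant across all of them. The script checks that any two books share exactly H(Z) bits, and that conditioning on Z makes a pair of books independent.

```python
# A toy check of the two natural-latent conditions, assuming the model
# described above (my own example): each "book" X_i = (Z, N_i), where Z
# is a fair shared bit and N_i is an independent fair noise bit.
import itertools
from collections import defaultdict
from math import log2

def entropy(dist):
    """Shannon entropy (in bits) of a dict mapping outcomes to probabilities."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

def marginal(joint, idxs):
    """Marginalize a joint dict onto the coordinates listed in idxs."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[tuple(outcome[i] for i in idxs)] += p
    return out

# Joint distribution over (Z, X1, X2, X3).
joint = defaultdict(float)
for z, n1, n2, n3 in itertools.product([0, 1], repeat=4):
    joint[(z, (z, n1), (z, n2), (z, n3))] += 1 / 16

H = lambda idxs: entropy(marginal(joint, idxs))  # coordinate 0 is Z

# Information shared by any two books: I(X1; X2) = H(X1) + H(X2) - H(X1, X2).
shared_by_two = H([1]) + H([2]) - H([1, 2])

# Conditional dependence given Z:
# I(X1; X2 | Z) = H(X1, Z) + H(X2, Z) - H(X1, X2, Z) - H(Z).
shared_given_z = H([0, 1]) + H([0, 2]) - H([0, 1, 2]) - H([0])

print(shared_by_two)   # 1.0 -- exactly H(Z): two books suffice to pin down Z
print(shared_given_z)  # 0.0 -- given Z, the books are independent
```

In this toy model the information shared by any two books is exactly Z, so inferring the latent from two books neither overshoots nor undershoots, matching the “any two books” claim above.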
A more practical example: “trees” vs. “fir trees”. If the only trees growing in your region are firs, your abstraction for “a tree” is going to contain all information shared among firs. In a meaningful sense, you haven’t derived a “tree” abstraction: if you see an acacia tree, you wouldn’t be able to match it with your “fir tree” concept. But “a fir tree” does still contain all information shared between fir trees and acacia trees, so studying just the fir trees still lets you know everything there is to know about trees-in-general.
Note something here, though: “a fir tree” doesn’t function as a natural latent over acacia trees. For one, it doesn’t contain all information shared between individual acacia trees (the specific characteristics of that tree genus), so it fails to make them independent of each other. “An acacia tree” has the same issues when applied to fir trees. In fact, it turns out that the set of all trees doesn’t have a valid natural latent at all: there is no variable that both (1) has only the information present in any one tree, and (2) makes all trees independent of each other. So: for what would “a tree” be a valid natural latent?
Intuitively: for the set of natural latents representing tree types. Suppose we have a set of random variables consisting of “a fir tree”, “an acacia tree”, “an elm tree”, and so on. Then the information that is shared between trees within a tree type, but which isn’t shared between tree types, is reframed as unique information belonging to an individual natural-latent variable. “A tree” is then a “second-order” natural latent for this set: it contains all information shared between the tree-type variables, so it (1) can be derived from any one tree-type latent, and (2) makes the tree-type latents independent of each other.
Naturally, there could be arbitrarily many levels in the hierarchy: rather than a three-level “individual trees”, “tree types”, “trees-in-general” structure, it could go “individual tree”, “tree subtype”, “tree type”, “trees”, “plants”, “living organisms”, “physical objects”, etc.
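To illustrate the stacking, here is a toy generative sketch of my own, with made-up attributes that aren’t derived from the formalism: a top-level “tree” latent fixes what all types share, each tree-type latent adds type-level shared information, and individual trees add idiosyncratic noise, so the variables at each level are independent given the latent one level up.

```python
# A toy generative sketch of the hierarchy (hypothetical attributes, my
# own illustration): conditioning on a latent at one level renders the
# variables one level down independent of each other.
import random

def sample_forest(n_types=3, n_trees_per_type=4, seed=0):
    rng = random.Random(seed)
    # Second-order latent: information shared by every tree type.
    tree_latent = {"has_trunk": True, "photosynthesizes": True}
    forest = {}
    for t in range(n_types):
        # First-order latent: the second-order latent plus information
        # shared only within this type. Given tree_latent, the type
        # latents are independent draws.
        type_latent = dict(tree_latent, leaf_shape=rng.choice(["needle", "broad"]))
        # Individuals: the type latent plus idiosyncratic noise. Given
        # type_latent, individual trees are independent of each other.
        forest[f"type_{t}"] = [
            dict(type_latent, height_m=round(rng.uniform(2.0, 30.0), 1))
            for _ in range(n_trees_per_type)
        ]
    return forest

print(sample_forest())
```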
Which ties into:
Is there any writing about what those “more expressive structures” could be?
I think the above is one example of this. The two-level model with “low-level variables” and “the natural latent over them” is just the bare-bones setup. Abstractions form hierarchies that are much richer and more complicated.
Some other features we’d expect from a full theory of abstraction:
“Sibling layers”: some low-level variables can be part of two different abstract hierarchies. For example, a specific physical gear can be part of both a gear taxonomy (an instance of the “gear” concept) and a low-level element in some machine (which might itself be a component of some larger installation, etc.). Abstractions in the gear taxonomy would not meaningfully be higher or lower than the abstractions in the machinery. Paraphrasing: sets of abstractions seem to be partially ordered, not totally ordered. We’ll need some tools to navigate that (see the sketch after this list).
Humans can compose abstractions in various complicated ways, such as cross-breeding “a lightbulb” with “a triangle” to get the mental representation of “a triangular lightbulb” – even if they’d never seen such a thing in real life. How does that work?
Humans can instantiate whole abstract environments where we plug different abstractions into each other and then hit “run” to simulate those environments. Likewise, how does that work?
Ultimately, the abstractions assemble into a full-scale world-model, which has all of the above features and functionality: partially ordered abstract hierarchies, the ability to compose, modify, and instantiate abstractions, and a lot of other things. What’s the actual formal way to capture all this?
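On the “sibling layers” point above, here is a minimal sketch of abstraction hierarchies as a partial order; the node names and relations are my own hypothetical example. Encoding the “instance-of/component-of” relations as a DAG, incomparability falls out naturally: two abstractions are siblings when neither is an ancestor of the other.

```python
# A minimal sketch of abstractions as a partially ordered set, encoded
# as a DAG of hypothetical "sits under" relations (my own example).
parents = {
    "this_gear": ["gear (taxonomy)", "drivetrain (machine)"],
    "gear (taxonomy)": ["mechanical part"],
    "drivetrain (machine)": ["printing press"],
    "mechanical part": [],
    "printing press": [],
}

def ancestors(node, graph):
    """Every abstraction strictly above `node` in the partial order."""
    seen, stack = set(), [node]
    while stack:
        for parent in graph[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

# "gear (taxonomy)" and "drivetrain (machine)" are incomparable -- neither
# is an ancestor of the other -- yet both sit above the same physical gear.
print(ancestors("this_gear", parents))
```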
It’s plausible that natural latents are a basic “atom” of the abstraction theory[1], but they assemble into various more complex structures, and have various sophisticated functions defined over them.
Natural latents can describe both things which propagate through very universal, local physical laws (e.g. heat) and any commonalities in made-up categories (e.g. “cakes”).
Yeah, there seems to be some qualitative difference between those. See some discussion of that here: it’s possible that it’s the difference between “adjectives” and “nouns”, or something like that.
One thing that makes me suspicious here is the zero-one-infinity rule. If we know there are two “types” of abstractions, we should expect there to be many different types of abstractions, not just “adjectives” and “nouns”. So what other types exist, besides “instances of a class” and “features/subsystems”? (@johnswentworth, do you have any off-the-cuff thoughts on that? Naively, I guess we should also expect “verbs” to show up somewhere, and then maybe all things from this list...)
If I haven’t missed anything major, I’d say core insights about abstractions are still missing.
Oh, almost certainly. See e.g. the discussion here about the ??? role of synergistic information.
[1] Though I have some doubts about this; I think they can be broken down further.