> the two chunks are independent given the pressure and temperature of the gas
I’d be careful here: if the two chunks of gas are in a closed room which, say, was previously colder on one side and warmer on the other and then equilibrated to the same temperature everywhere, the space of microscopic states it can have evolved into is much smaller than the space of microscopic states compatible with the temperature and pressure requirements (since the initial entropy was lower and the physics is deterministic). So in this case (and generally in our actual universe, as opposed to thought experiments where states are randomly sampled), a hypercomputer could see more mutual information between the chunks of gas than just pressure and temperature. I also wouldn’t call the chunks approximately independent; the point is rather that we, with our bounded intellects, are unable to keep track of the remaining mutual information.
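To make this concrete, here is a toy sketch (my own illustrative construction, not anything from the post): deterministic, reversible dynamics cannot destroy mutual information between two subsystems, so two chunks that start correlated stay exactly as correlated after “equilibration”, even when their marginals match those of independently sampled chunks.

```python
import math
import random
from collections import Counter

def mutual_info(xs, ys):
    """Empirical mutual information I(X;Y) in bits from paired samples."""
    n = len(xs)
    pxy, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum(c / n * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

random.seed(0)
N, STATES = 20_000, 4  # each chunk has 4 microstates

# Low-entropy initial condition: the two chunks are perfectly correlated.
a0 = [random.randrange(STATES) for _ in range(N)]
b0 = a0[:]

# Deterministic, reversible "evolution": a fixed permutation of each chunk's
# state space. Bijections preserve mutual information exactly.
perm_a = random.sample(range(STATES), STATES)
perm_b = random.sample(range(STATES), STATES)
a1 = [perm_a[s] for s in a0]
b1 = [perm_b[s] for s in b0]

# The thought-experiment ensemble: same marginals, but sampled independently.
a2 = [random.randrange(STATES) for _ in range(N)]
b2 = [random.randrange(STATES) for _ in range(N)]

print(mutual_info(a1, b1))  # ~2 bits: the correlation survives evolution
print(mutual_info(a2, b2))  # ~0 bits: independently sampled chunks
```

Both ensembles have (approximately) uniform marginals over the 4 microstates, i.e. the same “temperature and pressure”, yet only a hypercomputer tracking the microstates would notice the surviving 2 bits of mutual information in the first one.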
Main comment:
(EDIT: I might’ve misunderstood the motivation behind natural latents in what I wrote below.)
I assume you want to use natural latents to formalize what a natural abstraction is.
The “Λ induces independence between all Xi” criterion seems too strong to me.
IIUC, you want that if we have an abstraction like “human”, all the individual humans share approximately no mutual information conditioned on that abstraction. But there are obviously subclusters of humans (e.g. women, children, Ashkenazi Jews, …) whose members share more properties (which I’d say is the relevant sense of “mutual information” here) than the properties universally shared among humans. So given what I intuitively want the “human” abstraction to predict, there would be lots of mutual information between many humans. However, (IIUC) your definition of natural latents permits way more information to be encoded in the “human” abstraction, so that it can predict all the subclusters of humans that exist on Earth, since it only needs to be insensitive to removing one particular human from the dataset. This complex human abstraction does render all individual humans approximately independent, but it seems very ugly to me and is not what I actually want.
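A minimal numerical sketch of this worry (my own toy construction, with made-up variables): if each “human” Xi carries both a universally shared trait S and a subcluster trait C, then conditioning on a simple latent Λ = S leaves residual mutual information between two humans in the same subcluster, whereas the rich latent Λ = (S, C) drives it to zero while arguably no longer being the abstraction we wanted.

```python
import math
import random
from collections import Counter

def cond_mutual_info(xs, ys, zs):
    """Empirical conditional mutual information I(X;Y|Z) in bits."""
    n = len(xs)
    pxyz = Counter(zip(xs, ys, zs))
    pxz, pyz, pz = Counter(zip(xs, zs)), Counter(zip(ys, zs)), Counter(zs)
    return sum(c / n * math.log2(c * pz[z] / (pxz[(x, z)] * pyz[(y, z)]))
               for (x, y, z), c in pxyz.items())

random.seed(0)
N = 20_000

# Each sampled world has a universally shared trait S and a subcluster trait C.
S = [random.randrange(2) for _ in range(N)]
C = [random.randrange(2) for _ in range(N)]

# Two "humans" from the same subcluster: in this (extreme) toy model they
# carry identical copies of both traits.
x1 = list(zip(S, C))
x2 = list(zip(S, C))

simple_latent = S              # Λ encodes only the universally shared trait
rich_latent = list(zip(S, C))  # Λ additionally encodes the subcluster

print(cond_mutual_info(x1, x2, simple_latent))  # ~1 bit left over
print(cond_mutual_info(x1, x2, rich_latent))    # 0: independence achieved
```

The rich latent satisfies the conditional-independence criterion, but only by memorizing the subcluster structure, which is the “ugly abstraction” failure mode described above.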
I don’t think we need this conditional independence condition; rather, we need something that finds clusters of thingies which share unusually much (relevant) mutual information. I like to think of abstractions as similarity clusters. It would be nice to formalize what a cluster of thingies is without postulating an underlying thingspace / space of possible properties, and instead to find a natural definition of “similarity cluster” based on (relevant) mutual information. But I’m not sure; I haven’t really thought about it.
(But possibly I misunderstood something. If a conceptual story behind natural latents already exists as a draft, feel free to invite me to it.)