Referential Containment

This is an idea I am toying with for understanding resolutionally adjusted causal modeling: a collection of intuitions pointing toward a reasonably clear framing of something fundamental. I am sure there are already plenty of accounts of how to approach this kind of task, but I like to figure things out by myself, at least initially.
This could definitely benefit from some visual aids, but I prioritized publishing.

“Referential containment” relates to how one might sensibly divide a system into parts or how a system with many parts might be understood as a system with fewer parts—it relates to the translation between different resolutions.

The mathematical intuition here is this: think of a physical system as a complex causal graph with many nodes and edges, the nodes being information units and the edges representing the relationships of that information. If we are tasked with dividing this graph into two areas, the dividing line should cross as few edges as possible.

The two resulting pieces will then have the fewest possible edges between them, so the explanation needed to account for these edges will also be the shortest. Most “references” are “contained” within the respective objects. This also works for hyperedges, and we might want to assign weights to different kinds of edges to indicate how “costly” they are to cross.
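As a toy illustration (my own sketch, not from any particular account), here is a brute-force version of this minimum-cut idea in Python. The graph and edge weights are made up for the example; the weights stand in for how “costly” each reference is to cross:

```python
from itertools import combinations

def min_cut_bipartition(nodes, edges):
    """Brute-force search for the two-part division that crosses the
    fewest (weighted) edges. `edges` maps (u, v) pairs to weights."""
    best_cost, best_part = float("inf"), None
    nodes = list(nodes)
    # Try every nonempty proper subset as one side of the divide.
    for r in range(1, len(nodes)):
        for side in combinations(nodes, r):
            side = set(side)
            cost = sum(w for (u, v), w in edges.items()
                       if (u in side) != (v in side))
            if cost < best_cost:
                best_cost, best_part = cost, side
    return best_part, best_cost

# Two tightly knit triangles joined by a single weak edge:
edges = {("a", "b"): 1, ("b", "c"): 1, ("a", "c"): 1,
         ("d", "e"): 1, ("e", "f"): 1, ("d", "f"): 1,
         ("c", "d"): 1}  # the lone bridge between the clusters
part, cost = min_cut_bipartition("abcdef", edges)
print(sorted(part), cost)  # → ['a', 'b', 'c'] 1
```

The best division cuts only the bridge, leaving each triangle’s references contained within its own piece. (Brute force is exponential in the node count; real uses would reach for a proper min-cut algorithm, but the point here is only the objective being minimized.)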

To expand this a bit toward more practical examples, we could think of material objects, their surfaces, and their permanence. Solid objects are usually referentially contained in their locality, which partially guides our understanding of how to differentiate such objects from their surroundings. If I make a cut through an apple, the pieces are no longer as strongly connected in their locality, so it becomes easier to think of them as two objects.

However, this is contextual. The locations of the two apple halves are not entirely statistically independent: it rarely happens that two apple halves are moved more than a few meters apart before being eaten. Also, I chose the context of locality here, but one could choose among a great number of contexts, each providing different referential boundaries according to which I can detect objects/phenomena/patterns.

In practice, there will likely be too many dimensions of references to track to begin with (as they can be almost arbitrarily constructed), and my selection of which references/relationships to pay attention to will be functional and subjective.

It is interesting to consider the interplay of objectivity and subjectivity here:

If understanding the space before me is subjectively important to me, I can devote a certain amount of processing capacity to it, in terms of storage, attention, and so on. But maybe there are just four very referentially contained clusters/categories. In that case, even if I want to devote more resources to understanding this space, forming a fifth object is a far less efficient/useful step than forming the initial four.

Practically, it makes more sense to get more precise about the nature of these objects and their relationships with each other, perhaps dividing them further at the next layer of resolution. Our world seems to supply us with hierarchies of categorization into separate things that serve as very functional/efficient models.

Fundamentally, there may be more salient layers of resolution than can be expressed by one hierarchy.

One mental image I have is to apply some “zooming out” pressure to a causal graph until certain subgraphs “snap” together into nodes, their edges merging into less precise/specific aggregate edges. The order in which subgraphs “snap together” (or “bubble apart” under the reverse operation), and learning processes over that order, are the main subjects of study here.
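The “snapping” step can be sketched as edge contraction: once a subgraph collapses into a supernode, its internal edges disappear (they are now contained), and its external edges merge into coarser aggregate edges. A minimal Python sketch, with a made-up graph and a cluster assignment chosen by hand rather than produced by any learning process:

```python
def coarsen(edges, clusters):
    """Snap each cluster of nodes into a single supernode and merge
    the surviving edges, summing the weights of parallel edges.
    `clusters` maps each node to its cluster label."""
    merged = {}
    for (u, v), w in edges.items():
        cu, cv = clusters[u], clusters[v]
        if cu == cv:
            continue  # edge is now contained inside one supernode
        key = tuple(sorted((cu, cv)))  # undirected supernode pair
        merged[key] = merged.get(key, 0) + w
    return merged

edges = {("a", "b"): 1, ("b", "c"): 1, ("c", "d"): 2, ("c", "e"): 1}
clusters = {"a": "X", "b": "X", "c": "X", "d": "Y", "e": "Y"}
print(coarsen(edges, clusters))  # → {('X', 'Y'): 3}
```

The three distinct references from X’s members to Y’s members collapse into one aggregate edge of weight 3: the coarser resolution keeps the fact that X and Y interact, but loses which specific nodes carried the interaction.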

Filtering Information

Basically, I think there are two principles by which to sensibly reduce/filter the amount of information that an agent has to process/store:

The first principle is “computational reducibility” (a term from Stephen Wolfram’s work, where it appears as the complement of computational irreducibility). This is about finding a subset/smaller representation of some data, such that it contains all the information that is contained in the data (about some other system).
A basic example of this would be to strip away redundancies in the data. A more advanced example: if we can treat a complex system as reliably simulating a simpler system, we can represent/track only the simpler system and retain our ability to make predictions/statements about the complex one.
For a useful application of this, one would have to distinguish between efficiency in storing information and efficiency in processing information, and also consider a more probabilistic notion of computational reducibility, e.g. a simplification that retains 95% of the information contained in the original data, rather than insisting on perfect accuracy. This is obviously contextual.

The second principle is sometimes called “relevance realization” (introduced by John Vervaeke, afaik) and refers to the ability to distinguish relevant information from irrelevant information. This is highly agent-, environment-, and goal-dependent, and it also ends up being related to the previous concept of computational reducibility: datapoints that have little predictive value when computing the future states of a given system tend to be less relevant than those that have a larger sway over the future.
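One crude way to operationalize the predictive-value notion of relevance (a sketch of my own, not Vervaeke’s formulation): score each observed feature by the absolute covariance between that feature and the quantity we want to predict, and treat near-zero scores as candidates for filtering out. The data here is invented for the example:

```python
def relevance_scores(samples, target):
    """Score each feature column by the absolute covariance between
    that column and the target quantity: features with little sway
    over the prediction score near zero."""
    n = len(samples)
    t_mean = sum(target) / n
    scores = []
    for j in range(len(samples[0])):
        col = [s[j] for s in samples]
        c_mean = sum(col) / n
        cov = sum((c - c_mean) * (t - t_mean)
                  for c, t in zip(col, target)) / n
        scores.append(abs(cov))
    return scores

# Feature 0 drives the target; feature 1 is a constant that carries
# no predictive information at all:
samples = [(1, 7), (2, 7), (3, 7), (4, 7)]
target = [2, 4, 6, 8]  # = 2 * feature 0
print(relevance_scores(samples, target))  # → [2.5, 0.0]
```

Covariance only catches linear relationships, so a real relevance filter would need something stronger (e.g. mutual information), but it shows the shape of the task: rank incoming dimensions by their sway over the future and spend representation capacity accordingly.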

So, for a given agent tasked with filtering/representing the information arriving through its sensory interface, one could say that it is tasked with figuring out the computational reducibility and relevance of the data. I am not using either of these terms precisely as their authors might, but I am not sure I should make up new terms just for that.
Anyway, the agent is processing- and storage-constrained (relative to the environment), and is running on a computational substrate that favors some mathematical operations over others, all of which makes certain representations more useful.


Can you figure out how referential containment is (or should be) related to these tasks and constraints?