Three Kinds Of Ontological Foundations
Why does a water bottle seem like a natural chunk of physical stuff to think of as “A Thing”, while the left half of the water bottle seems like a less natural chunk of physical stuff to think of as “A Thing”? More abstractly: why do real-world agents favor some ontologies over others?
At various stages of rigor, an answer to that question looks like a story, an argument, or a mathematical proof. Regardless of the form, I’ll call such an answer an ontological foundation.
Broadly speaking, the ontological foundations I know of fall into three main clusters.
Translatability Guarantees
Suppose an agent wants to structure its world model around internal representations which can translate well into other world models. An agent might want translatable representations for two main reasons:
Language: in order for language to work at all, most words need to point to internal representations which approximately “match” (in some sense) across the two agents communicating.
Correspondence Principle: it’s useful for an agent to structure its world model and goals around representations which will continue to “work” even as the agent learns more and its world model evolves.
Guarantees of translatability are the type of ontological foundation presented in our paper Natural Latents: Latent Variables Stable Across Ontologies. The abstract of that paper is a good high-level example of what an ontological foundation based on translatability guarantees looks like:
Suppose two Bayesian agents each learn a generative model of the same environment. We will assume the two have converged on the predictive distribution (i.e. distribution over some observables in the environment), but may have different generative models containing different latent variables. Under what conditions can one agent guarantee that their latents are a function of the other agent’s latents?
We give simple conditions under which such translation is guaranteed to be possible: the natural latent conditions. We also show that, absent further constraints, these are the most general conditions under which translatability is guaranteed.
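To give a flavor of those conditions (this is my paraphrase from memory of the two-observable case; the paper gives the precise approximate versions and bounds), the natural latent conditions on a shared latent over two observed chunks of the world look roughly like:

```latex
% Mediation: the observables are independent given the latent
X_1 \perp X_2 \mid \Lambda
% Redundancy: the latent is (approximately) recoverable from either observable alone
\Lambda \approx f_1(X_1), \qquad \Lambda \approx f_2(X_2)
```

As I understand the paper, a latent satisfying both conditions is pinned down up to informational equivalence, which is what makes the cross-agent translation guarantee possible.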
Environment Structure
A key property of an ideal gas is that, if we have even just a little imprecision in our measurements of its initial conditions, then chaotic dynamics quickly wipes out all information except for a few summary statistics (like e.g. temperature and pressure); the best we can do to make predictions about the gas is to use a Boltzmann distribution with those summary statistics. This is a fact about the dynamics of the gas, which makes those summary statistics natural ontological Things, useful to a huge range of agents.
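As a toy illustration of that “details get washed out, summary statistics survive” phenomenon, here is a sketch using the chaotic logistic map in place of an actual gas (my own toy example, not from the post): two trajectories started a hair apart quickly become pointwise unpredictable, but their long-run statistics agree.

```python
import numpy as np

# Toy stand-in for chaotic gas dynamics: the logistic map at r = 4.
def trajectory(x0: float, n_steps: int) -> np.ndarray:
    xs = np.empty(n_steps)
    x = x0
    for i in range(n_steps):
        x = 4.0 * x * (1.0 - x)
        xs[i] = x
    return xs

a = trajectory(0.2, 100_000)
b = trajectory(0.2 + 1e-9, 100_000)  # tiny imprecision in the initial condition

# Pointwise prediction fails almost immediately...
print("close after 100 steps?", np.allclose(a[100:110], b[100:110]))  # False

# ...but the long-run histograms (the "summary statistic" analogue) agree closely.
hist_a, _ = np.histogram(a, bins=20, range=(0, 1), density=True)
hist_b, _ = np.histogram(b, bins=20, range=(0, 1), density=True)
print("max histogram difference:", np.abs(hist_a - hist_b).max())  # small
```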
Looking at my own past work, the Telephone Theorem is aimed at ontological foundations based on environment structure. It says, very roughly:
When information is passed through many layers, one after another, any information which is not nearly-perfectly conserved across nearly all of the “messages” is lost.
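Here is a minimal numerical sketch of that flavor of result (my own toy construction, not the theorem’s actual setting): a message passes through many noisy layers; a coarse bit that is copied almost perfectly at each step survives, while fine-grained detail that degrades a little at each step is eventually lost.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_layers = 100_000, 50

bit0 = rng.integers(0, 2, n_samples)  # nearly-perfectly conserved information
detail0 = rng.normal(size=n_samples)  # fine-grained information
bit, detail = bit0.copy(), detail0.copy()

for _ in range(n_layers):
    flip = rng.random(n_samples) < 0.001                     # bit copied almost perfectly
    bit = np.where(flip, 1 - bit, bit)
    detail = detail + rng.normal(scale=1.0, size=n_samples)  # detail degrades each step

print("bit agreement with source:   ", np.mean(bit == bit0))                # ~0.95
print("detail correlation w/ source:", np.corrcoef(detail, detail0)[0, 1])  # ~0.14
```

The nearly-conserved bit still carries information about the original message after many layers, while the detail’s correlation with the source keeps shrinking as layers are added.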
A more complete ontological foundation based on environment structure might say something like:
Information which propagates over long distances (as in the Telephone Theorem) must (approximately) have a certain form.
That form factors cleanly (e.g. in the literal sense of a probability distribution factoring over terms which each involve only a few variables).
Mind Structure
Toward Statistical Mechanics Of Interfaces Under Selection Pressure talks about the “APIs” used internally by a neural-net-like system. The intuition is that, in the style of stat mech or singular learning theory, the exponential majority of parameter-values which produce low loss will use APIs for which a certain entropic quantity is near-minimal. Insofar as that’s true (which it might not be!), a natural prediction would be that a wide variety of training/selection processes for the same loss would produce a net using those same APIs internally.
That would be the flavor of an ontological foundation based on mind structure. An ideal ontological foundation based on mind structure would prove that a wide variety of mind structures, under a wide variety of training/selection pressures, with a wide variety of training/selection goals, converge on using “equivalent” APIs or representations internally.
All Of The Above?
Of course, the real ideal for a program in search of ontological foundations would be to pursue all three of these types of ontological foundations, and then show that they all give the same answer. That would be strong evidence that the ontological foundations found are indeed natural.
For translatability guarantees, we also want an answer to why agents have distinct concepts for different things, and to what criteria carve the world model up into different concepts. My sketch of an answer is that different hypotheses/agents will make use of different pieces of information under different scenarios, and having distinct reference handles for different types of information allows the hypotheses/agents to access the minimal amount of information they need.
For environment structure, we’d like an answer to what it means for there to be an object that persists through time, or for there to be two instances of the same object. One way this could work is to look at an object’s probabilistic predictions over its Markov blanket, and require some sort of similarity in those predictions when we “transport” the object across spacetime.
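One possible way to write that down (my own notation, purely illustrative): let O_t be the object’s internal state at time t, B_t its Markov blanket, and τ the transport map identifying the object’s states and blanket coordinates across times; then persistence might demand something like

```latex
P\big(B_{t'} \mid O_{t'} = \tau(o)\big) \;\approx\; P\big(B_t \mid O_t = o\big)
\quad \text{for all object states } o,
```

with the quality of the approximation measuring how well the object “persists”, or how well two instances count as “the same object”.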
I’m less optimistic about the mind structure foundation, because the interfaces that are most natural to look at might not correspond to what we call “human concepts”, especially when the latter require a level of flexibility not supported by the former. For instance, human concepts have different modularity structures with each other depending on context (also known as shifting structures), which basically rules out any simple correspondence with interfaces that have a fixed computational structure over time. How we want to decompose a world model is a degree of freedom additional to the world model itself, and that has to come from other ontological foundations.
Structures of Optimal Understandability
(In this text, “foundation(s)” refers to the OP’s definition.)
Something is missing: I think there is another foundation, “Optimal Abstraction Structure for Understanding” (simply “understandability” in the rest of this text).
Intuitively, a model of the world can be organized in such a way that it can be understood and reasoned about as efficiently as possible.
Consider a spaghetti codebase with very long functions that do 10 different things each, and have lots of duplication.
Now consider another codebase that performs the same tasks. Probably each function now does one thing, most functions are pure, and there are significant changes to the underlying approach. E.g. we might create a boundary between display logic and business logic.
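A tiny sketch of that contrast (an invented example; both versions below produce the same output):

```python
# Spaghetti version: one function parses, sums, applies tax, and formats.
def handle(items):
    out = ""
    t = 0
    for i in items:
        p = float(i.split(":")[1])
        t = t + p
    t = t + t * 0.2
    out = "total: " + str(round(t, 2)) + " EUR"
    print(out)

# Refactored version: small pure functions, display logic kept separate.
def parse_price(item: str) -> float:
    return float(item.split(":")[1])

def total_with_tax(prices: list[float], tax_rate: float = 0.2) -> float:
    return sum(prices) * (1 + tax_rate)

def format_total(total: float) -> str:
    return f"total: {total:.2f} EUR"

items = ["apple:1.50", "bread:2.30"]
handle(items)                                                          # total: 4.56 EUR
print(format_total(total_with_tax([parse_price(i) for i in items])))   # total: 4.56 EUR
```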
The point is that for any outward-facing program behavior, there are many codebases that implement it. These codebases can vary wildly in terms of how easy they are to understand.
This generalizes. Any kind of structure, including any type of model of a world, can be represented in multiple ways. Different representations score differently on how easily the data can be comprehended and reasoned about.
Spaghetti code is ugly, but not primarily because of the idiosyncrasies of human aesthetics. I expect there is a true name that quantifies how optimally some data is arranged for the purpose of understanding and reasoning about it.
Spaghetti code would rank lower than carefully crafted code.
Even a superintelligent programmer still wouldn’t “like” spaghetti code when it needs to do a lot of reasoning about the code.
Understandability doesn’t seem independent of your three foundations, but…
Mind Structure
“Mind structure” depends directly on task performance. It’s about understanding how minds will tend to be structured after they have been trained and have achieved a high score.
But unless task performance increases when the agent introspects (and the agent is smart enough to do so), I expect mind structures with optimal loss to score poorly on understandability.
Environment Structure
It feels like there are many different models that capture environment structure, which score wildly differently in terms of how easy they are to comprehend.
In particular, in any complex world, we want to create domain-specific models, i.e. heavily simplified models that are valid for a small bounded region of phase space.
E.g. an electrical engineer models a transistor as having a (roughly) constant voltage drop. But apply too much voltage and it explodes.
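A hedged sketch of what such a domain-specific model might look like with its region of validity made explicit (toy numbers, not real datasheet values):

```python
SILICON_DROP_V = 0.7   # classic constant-drop approximation for a forward-biased junction
MAX_SAFE_V = 6.0       # illustrative limit; beyond this the simplified model is just wrong

def junction_drop(v_applied: float) -> float:
    """Heavily simplified model, valid only in a bounded region of phase space."""
    if v_applied > MAX_SAFE_V:
        raise ValueError("outside the model's validity region (the part gets destroyed)")
    if v_applied < SILICON_DROP_V:
        raise ValueError("not forward-biased enough; the constant-drop model doesn't apply")
    return SILICON_DROP_V  # inside the region, the model ignores everything else
```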
Translatability
A model being translatable seems like a much weaker condition than being easily understandable.
Understandability seems to imply translatability. If you have understood something, you have translated it into your own ontology. At least this is a vague intuition I have.
Translatability says: It is possible to translate this.
Optimal understandability says: You can translate this efficiently (and probably there is a single general and efficient translation algorithm).
Closing
It seems there is another foundation: understandability. In some contexts, real-world agents prefer having understandable ontologies (which may include their own source code). But this isn't universal, and can even be anti-natural.
Even so, understandability seems like an extremely important foundation. It might not necessarily be important to an agent performing a task, but it's important to anyone trying to understand and reason about that agent, like a human trying to figure out whether the agent is misaligned.