The Pragmascope Idea

Pragma (Greek): thing, object.

A “pragmascope”, then, would be some kind of measurement or visualization device which shows the “things” or “objects” present.

I currently see the pragmascope as the major practical objective of work on natural abstractions. As I see it, the core theory of natural abstractions is now 80% nailed down, I’m now working to get it across the theory-practice gap, and the pragmascope is the big milestone on the other side of that gap.

This post introduces the idea of the pragmascope and what it would look like.

Background: A Measurement Device Requires An Empirical Invariant

First, an aside on developing new measurement devices.

Why The Thermometer?

What makes a thermometer a good measurement device? Why is “temperature”, as measured by a thermometer, such a useful quantity?

Well, at the most fundamental level… we stick a thermometer in two different things. Then, we put those two things in contact. Whichever one showed a higher “temperature” reading on the thermometer gets colder, whichever one showed a lower “temperature” reading on the thermometer gets hotter, all else equal (i.e. controlling for heat exchanged with other things in the environment). And this is robustly true across a huge range of different things we can stick a thermometer into.

It didn’t have to be that way! We could imagine a world (with very different physics) where, for instance, heat always flows from red objects to blue objects, from blue objects to green objects, and from green objects to red objects. But we don’t see that in practice. Instead, we see that each system can be assigned a single number (“temperature”), and then when we put two things in contact, the higher-number thing gets cooler and the lower-number thing gets hotter, regardless of which two things we picked.

Underlying the usefulness of the thermometer is an empirical fact, an invariant: the fact that which-thing-gets-hotter and which-thing-gets-colder when putting two things into contact can be predicted from a single one-dimensional real number associated with each system (i.e. “temperature”), for an extremely wide range of real-world things.

Generalizing: a useful measurement device starts with identifying some empirical invariant. There needs to be a wide variety of systems which interact in a predictable way across many contexts, if we know some particular information about each system. In the case of the thermometer, a wide variety of systems get hotter/colder when in contact, in a predictable way across many contexts, if we know the temperature of each system.

So what would be an analogous empirical invariant for a pragmascope?

The Role Of The Natural Abstraction Hypothesis

The natural abstraction hypothesis has three components:

Chunks of the world generally interact with far-away chunks of the world via relatively-low-dimensional summaries
A broad class of cognitive architectures converge to use subsets of these summaries (i.e. they’re instrumentally convergent)
These summaries match human-recognizable “things” or “concepts”

For purposes of the pragmascope, we’re particularly interested in claim 2: a broad class of cognitive architectures converge to use subsets of the summaries. If true, that sure sounds like an empirical invariant!

So what would a corresponding measurement device look like?

What would a pragmascope look like, concretely?

The “measurement device” (probably a python function, in practice) should take in some cognitive system (e.g. a trained neural network) and maybe its environment (e.g. simulator/data), and spit out some data structure representing the natural “summaries” in the system/environment. Then, we should easily be able to take some other cognitive system trained on the same environment, extract the natural “summaries” from that, and compare. Based on the natural abstraction hypothesis, we expect to observe things like:

A broad class of cognitive architectures trained on the same data/environment end up with subsets of the same summaries.
Two systems with the same summaries are able to accurately predict the same things on new data from the same environment/distribution.
On inspection, the summaries correspond to human-recognizable “things” or “concepts”.
A system is able to accurately predict things involving the same human-recognizable concepts the pragmascope says it has learned, and cannot accurately predict things involving human-recognizable concepts the pragmascope says it has not learned.

It’s these empirical observations which, if true, will underpin the usefulness of the pragmascope. The more precisely and robustly these sorts of properties hold, the more useful the pragmascope. Ideally we’d even be able to prove some of them.

What’s The Output Data Structure?

One obvious currently-underspecified piece of the picture: what data structures will the pragmascope output, to represent the “summaries”? I have some current-best-guesses based on the math, but the main answer at this point is “I don’t know yet”. I expect looking at the internals of trained neural networks will give lots of feedback about what the natural data structures are.

Probably the earliest empirical work will just punt on standard data structures, and instead focus on translating internal-concept-representations in one net into corresponding internal-concept-representations in another. For instance, here’s one experiment I recently proposed:

Train two nets, with different architectures (both capable of achieving zero training loss and good performance on the test set), on the same data.
Compute the small change in data dx which would induce a small change in trained parameter values d\theta along each of the narrowest directions of the ridge in the loss landscape (i.e. eigenvectors of the Hessian with largest eigenvalue).
Then, compute the small change in parameter values d\theta in the second net which would result from the same small change in data dx.
Prediction: the d\theta directions computed will approximately match the narrowest directions of the ridge in the loss landscape of the second net.

Conceptually, this sort of experiment is intended to take all the stuff one network learned, and compare it to all the stuff the other network learned. It wouldn’t yield a full pragmascope, because it wouldn’t say anything about how to factor all the stuff a network learns into individual concepts, but it would give a very well-grounded starting point for translating stuff-in-one-net into stuff-in-another-net (to first/second-order approximation).