An Undergraduate Reading Of: Semantic information, autonomous agency and non-equilibrium statistical physics

This is a recent paper by Artemy Kolchinsky and David H. Wolpert, from the Santa Fe Institute. It was published in The Royal Society Interface on Oct 19. They propose a formal theory of semantic information, that is, a way to describe meaning formally. I am going over it in the style proposed here and shown here, approximately.

I will go through the sections in order at first, and circle back if appropriate. Mostly this is because when I pasted the table of contents it very conveniently kept the direct links to the sections of the paper, which is an awesome feature.

  • Abstract: as is my custom, skipped.

  • 1. Introduction

    • Semantic information is meaningful to a system, as distinct from syntactical information, which is correlational.

    • They cite the importance of the idea in these fields: biology, cognitive science, artificial intelligence, information theory, and philosophy.

    • Question one: can it be defined formally and generally?

    • Question two: can that definition be used in any physical system (rocks, hurricanes, cells, people)?

    • They claim the answer to both questions is yes. They define semantic information:

the information that a physical system has about its environment that is causally necessary for the system to maintain its own existence over time
    • Most of the time we study syntactic information, using Shannon’s theory.

    • Shannon explicitly avoided addressing what meaning a message over a telecommunication line might have.

    • One approach to address this is to assume an idealized system that optimizes some function, e.g. utility.

    • Under this approach, semantic information helps the system achieve its goal.

    • The problem with this approach is that the goal is defined exogenously. The meaning therefore belongs to the scientists who impute goals to the system, not to the system itself.

    • We want meaning based on the intrinsic properties of the system.

    • In biology the goal of an organism is fitness maximization, which leads to the teleosemantic approach, which roughly says a trait has meaning if at some time in the past it correlated with states of the environment (and therefore had a bearing on fitness).

    • Example: frogs snap their tongues at black spots in their visual field. This is semantic information because eating flies is good for frogs, and black spots correlated with flies in the past.

    • The problem with teleosemantics is that it defines meaning in terms of the past history of the system. The goal is an ahistorical definition that relies only on the dynamics of the system in a given environment.

    • Another approach is autonomous agents, which maintain their own existence in an environment. This has self-preservation as the goal, and does not rely on history.

    • Autonomous agents get information about the environment, and then respond in ‘appropriate’ ways. Example:

For instance, a chemotactic bacterium senses the direction of chemical gradients in its particular environment and then moves in the direction of those gradients, thereby locating food and maintaining its own existence.
    • Research suggests the information used for self-maintenance is meaningful, but this concept has remained informal. In particular, there is no formal way to quantify the semantic information an agent has, or to determine the meaning of a particular state.

    • Their contribution:

We propose a formal, intrinsic definition of semantic information, applicable to any physical system coupled to an external environment
    • A footnote here says that the method should generalize to any dynamical system, but they focus on physical ones in the paper. This is an interesting claim to me.

    • There is ‘the system X’ and ‘the environment Y’; at some initial time t = 0, they are jointly distributed according to some initial distribution p(x0, y0); they undergo coupled (possibly stochastic) dynamics until time τ, where τ is some timescale of interest.

    • There is a viability function, which is the negative Shannon entropy of the distribution over the states of system X. This quantifies the ‘degree of existence’ at any given time. More information about the viability function in section 4.

    • Shannon entropy is used because it provides an upper bound on the probability of states of interest; it also has a well-developed connection to thermodynamics, which links the framework to non-equilibrium statistical physics.

    • Semantic information is the subset of syntactic information which causally contributes to the continued existence of the system, that is, the part that maintains the value of the viability function.

    • They draw from Pearl, and use a form of interventions in order to quantify:

To quantify the causal contribution, we define counter-factual intervened distributions in which some of the syntactic information between the system and its environment is scrambled.
    • Figure 1 has some graphical examples.

    • They give three verbal examples for the scrambling procedure: switching rocks between fields, switching hurricanes between oceans, and switching birds between environments. This section is a little suspect to me. The hurricane and the rock were described as “low viability value of information” when scrambling consisted of putting them in very similar environments, but the bird was “high viability value of information” when scrambling put it in random environments. Further, the rock and the bird were on timescales of years, and the hurricane only an hour. This might just be a sloppy explanation. In the main, I would expect the lifespan of a system to be inversely correlated with the viability value of information overall, so I would have thought hurricane > bird > rock.

    • They use ‘coarse-graining’ methods from information theory to formalize transforming the actual distribution into intervened distributions.

    • The intervention which has the same (or greater?) viability as the actual distribution, but has the least syntactical information, is called the viability-optimal intervention.

    • They interpret all of the syntactic information of the optimal intervention to be semantic information, because any further scrambling changes the viability.

    • Semantic efficiency is the ratio of semantic to syntactic information. It quantifies how tuned the system is to only gather information relevant to its existence.

    • Semantic content of a system state x is the conditional distribution, under the optimal intervention, given state x. This can tell us the correlations relevant to maintaining the system.

    • They claim to be able to define point-wise measures of semantic information as well.

    • The framework is not tied to the Shannon notion of syntactic information; from different measures of syntactic information they can derive appropriate measures of semantic information, e.g. thermodynamics through statistical physics.

    • Measures of semantic information are defined relative to the choice of:

(1) the particular division of the physical world into ‘the system’ and ‘the environment’;
(2) the timescale τ; and
(3) the initial probability distribution over the system and environment.
    • They suggest implications for an intrinsic definition of autonomous agency.
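The scrambling idea from the introduction can be sketched on a toy model. Everything below is an illustrative assumption and is not taken from the paper: the four-state system and environment, the `kernel` dynamics, the specific probabilities, and the choice to build interventions by partitioning the environment's states so that the system only "knows" which block the environment is in. The paper's precise construction may differ. In the toy, each environment state encodes a food location plus an irrelevant bit, and the system starts out as a noisy copy of the environment.

```python
import math

def entropy(probs):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def mutual_information(joint):
    """I(X0; Y0) in bits for a joint given as {(x0, y0): probability}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return entropy(px.values()) + entropy(py.values()) - entropy(joint.values())

def partitions(items):
    """Yield every partition of a list as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        yield [[first]] + part

def intervene(joint, blocks):
    """Scramble the joint so that x0 depends only on the block containing y0."""
    xs = sorted({x for x, _ in joint})
    scrambled = {}
    for block in blocks:
        p_block = sum(p for (_, y), p in joint.items() if y in block)
        for x in xs:
            p_x_given_block = sum(joint.get((x, y), 0.0) for y in block) / p_block
            for y in block:
                p_y = sum(joint.get((x2, y), 0.0) for x2 in xs)
                scrambled[(x, y)] = p_x_given_block * p_y
    return scrambled

def viability(joint, kernel):
    """Negative entropy of the system's state distribution at time tau."""
    n = len(kernel(*next(iter(joint))))
    final = [0.0] * n
    for (x, y), p in joint.items():
        for i, q in enumerate(kernel(x, y)):
            final[i] += p * q
    return -entropy(final)

# Hypothetical toy: state // 2 is the food location, state % 2 is irrelevant.
STATES = [0, 1, 2, 3]

def kernel(x0, y0):
    """Toy dynamics: a correct food guess keeps the system concentrated."""
    if x0 // 2 == y0 // 2:
        return [0.97, 0.01, 0.01, 0.01]
    return [0.25, 0.25, 0.25, 0.25]

# Environment is uniform; the system is a noisy copy of it.
actual = {(x, y): 0.25 * (0.85 if x == y else 0.05) for x in STATES for y in STATES}

v_actual = viability(actual, kernel)
# Full scrambling is the one-block partition: all correlation is destroyed.
viability_value = v_actual - viability(intervene(actual, [STATES]), kernel)

# Among interventions that keep viability unchanged, take the least informative.
candidates = [intervene(actual, blocks) for blocks in partitions(STATES)]
preserving = [j for j in candidates if abs(viability(j, kernel) - v_actual) < 1e-9]
optimal = min(preserving, key=mutual_information)

syntactic_info = mutual_information(actual)
semantic_info = mutual_information(optimal)
efficiency = semantic_info / syntactic_info

print(f"viability value of information: {viability_value:.3f} bits")
print(f"syntactic: {syntactic_info:.3f} bits, semantic: {semantic_info:.3f} bits")
print(f"semantic efficiency: {efficiency:.3f}")
```

In this toy the least informative viability-preserving intervention is the one that merges environment states differing only in the irrelevant bit, so the semantic efficiency comes out a bit under one half. The sketch reads "the same viability" as equality up to a small tolerance; it does not try to settle whether greater viability should also count.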

  • 2. Non-equilibrium statistical physics: Body—skipped for now.

  • 3. Preliminaries and physical set-up: Body—skipped for now.

  • 4. The viability function: Body—mostly skipped, but I did go in to find the actual function:

    • They define the viability function as the negative of the Shannon entropy of the marginal distribution of system X at time τ.
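Written out, the verbal definition of the viability function would read roughly as follows. The symbols V for viability and S for Shannon entropy are my notation, assumed for illustration, and the paper's own typesetting may differ:

```latex
V\left(p_{X_\tau}\right) := -S\left(p_{X_\tau}\right) = \sum_{x_\tau} p(x_\tau) \log p(x_\tau)
```

The sum is never positive, and it equals zero exactly when the system is certain to be in a single state at time τ. So higher viability means the system's state distribution is more concentrated.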

  • 5. Semantic information via interventions: Body—skipped for now.

  • 6. Automatic identification of initial distributions, timescales and decompositions of interest: Body—skipped for now.

  • 7. Conclusion and discussion

    • Semantic information is syntactic information that is causally necessary for the system to continue.

    • It can be stored (mutual information between system and environment) and observed (transfer entropy exchanged between system and environment).

    • Semantic information can misrepresent the world. This shows up as a negative viability value.

    • Semantic information is asymmetrical between system and environment.

    • No need to decompose the system into different degrees of freedom (sensors/effectors, body/brain, membrane/interior).

    • Side-steps the question of internal models or representations entirely.

    • The framework does not assume organisms, but it may be useful for offering quantitative and formal definitions of life.

    • They suggest that high semantic information may be a necessary, though not sufficient, condition for being alive.

Note: I have left the links below for completeness, and to make it easy to interrogate the funding/associations of the authors. The appendices have some examples they develop.

End: I am putting this up before delving into the body sections in any detail, not least for length and readability. If there is interest, I can summarize those in the comments.