The two conceptions of Active Inference: an intelligence architecture and a theory of agency

I think much of the confusion about Active Inference arises because the term is used to refer to two related yet distinct concepts (one can also call them abstractions, theories, or ontics): an intelligence architecture and a theory of agency. The literature on Active Inference pays little or no attention to this distinction, which leads to misinterpretations: a writer uses the term to refer to one concept and the reader understands it as referring to the other, or, even more likely, neither realises the distinction in the first place.

In Active Inference, Parr, Pezzulo, and Friston call these two conceptions the "Low Road" and the "High Road" to Active Inference, describing them in chapters 2 and 3 of the book, respectively.

Active Inference as an architecture for intelligent agents

First, Active Inference can be seen as an architecture for autonomous intelligent agents. The main pillars of this architecture are variational Bayesian inference, planning-as-inference and action control-as-inference, and specification of preferences as a distribution over future observations (or hidden world states). This architecture can be implemented as an AI agent.
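
To make the architectural reading concrete, below is a minimal sketch of such an agent for a discrete state space, with one-step policies and exact Bayesian belief updating standing in for a full variational scheme. The matrices A (likelihood), B (transitions), and C (preferences over observations) follow the notation common in the Active Inference literature; the class, the toy model, and all other names are illustrative, not a reference implementation:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

class ActiveInferenceAgent:
    """Minimal discrete-state Active Inference agent with one-step policies.

    A:  likelihood p(o|s), shape (n_obs, n_states)
    B:  transitions p(s'|s, a), shape (n_actions, n_states, n_states)
    C:  preference distribution over observations, length n_obs
    q0: initial belief over hidden states, length n_states
    """

    def __init__(self, A, B, C, q0):
        self.A, self.B, self.C = A, B, C
        self.q = q0

    def infer(self, obs):
        # Perception: Bayesian belief update from the observed outcome index.
        # (Exact here; variational inference generalises this to factored models.)
        self.q = self.A[obs] * self.q
        self.q = self.q / self.q.sum()

    def expected_free_energy(self, a, eps=1e-16):
        qs = self.B[a] @ self.q  # predicted next-state belief under action a
        qo = self.A @ qs         # predicted observation distribution
        # Risk: KL divergence from predicted to preferred observations.
        risk = np.sum(qo * (np.log(qo + eps) - np.log(self.C + eps)))
        # Ambiguity: expected entropy of the likelihood mapping.
        h_per_state = -np.sum(self.A * np.log(self.A + eps), axis=0)
        return risk + h_per_state @ qs

    def act(self):
        # Action selection: softmax over negative expected free energy.
        G = np.array([self.expected_free_energy(a) for a in range(len(self.B))])
        return np.random.choice(len(self.B), p=softmax(-G))

# Toy model: two hidden states, two observations, two actions; action a
# deterministically drives the world into state a; the agent prefers obs 0.
A = np.array([[0.9, 0.1],
              [0.1, 0.9]])
B = np.array([[[1.0, 1.0], [0.0, 0.0]],
              [[0.0, 0.0], [1.0, 1.0]]])
C = np.array([0.95, 0.05])
agent = ActiveInferenceAgent(A, B, C, q0=np.array([0.5, 0.5]))
agent.infer(obs=1)
print(agent.act())  # usually 0: the action expected to yield the preferred obs
```

Full-featured implementations (multi-step policies, learning of the model matrices, proper variational message passing) exist, for example in the pymdp library, but the loop above (update beliefs, score policies by expected free energy, act) is the skeleton of the architecture.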

Parr, Pezzulo, and Friston see the process model of human intelligence at the neurobiological level as implementing this architecture. I think this position should be treated as a subjective (or, if you like, philosophical) view rather than a falsifiable scientific claim.

Seth (2021) also classifies his “controlled and controlling hallucination” process theory of the brain as a species of Active Inference.

This conception of Active Inference is directly comparable to Reinforcement Learning, LeCun’s vision for the architecture of autonomous machine intelligence, the architecture of Gato, ReduNets, and most cognitive architectures.

Normative engineering science tells us that architectures should be compared with respect to a set of desirable properties (characteristics) of the engineered physical objects that have these architectures, i.e., with respect to a concrete engineering task. In the realm of biology, the genotypes and memotypes of living organisms evolve, and the biological "architectures" they entail should be compared based on their fitness to a particular ecological niche, or based on some other evolutionary metric of success: the total survival time of the respective phylogenetic subtree, the total number of organisms in that subtree that have ever lived, the total number of positive-valence experiences those organisms have had, and so on.

Theories of agency

We should call the most universal descriptions of the desirable characteristics of agents "theories of agency". These theories should be seen as both physical and normative.

Theories of agency are physical theories because there is nothing they can be based on other than the broadest theories of physics that describe the environments in which these agents live. For practically all non-virtual environments, that theory is statistical mechanics, though, at least in principle, we can imagine something like cosmic-scale agents, or agents inside black holes, whose theories of agency should be based on other physics, because statistical mechanics may be inapplicable at the respective spatiotemporal scales or in the respective environments. Collective agency may also be conceptualised on a substrate that is best described by some theory other than statistical mechanics.

Theories of agency should also be seen as normative theories because they describe the “ideal” the agents should strive for and benchmark themselves against. In other words, a normative theory of agency defines what agency is.

Normative physical theories of agency are falsifiable only from the perspective of ethical naturalism.

Active Inference as a theory of agency

Active Inference can be seen as a theory of agency, or a “normative framework” for living organisms, as Parr, Pezzulo, and Friston call it:

Active Inference is a normative framework to characterize Bayes-optimal behavior and cognition in living organisms. Its normative character is evinced in the idea that all facets of behavior and cognition in living organisms follow a unique imperative: minimizing the surprise of their sensory observations.
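
To unpack the imperative: "surprise" here is the negative log evidence of sensory observations o under the agent's generative model p, and variational free energy F is a tractable upper bound on it. In the standard notation (hidden states s, approximate posterior q), used here purely for illustration:

```latex
F[q, o] \,=\, \mathbb{E}_{q(s)}\bigl[\ln q(s) - \ln p(o, s)\bigr]
        \,=\, \underbrace{-\ln p(o)}_{\text{surprise}}
        \,+\, \underbrace{D_{\mathrm{KL}}\bigl[\,q(s)\,\|\,p(s \mid o)\,\bigr]}_{\geq\, 0}
        \,\geq\, -\ln p(o).
```

Minimising F with respect to q approximates exact Bayesian inference (it shrinks the KL term), while minimising it through action bounds the surprise that the quoted imperative refers to.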

As a physical theory, Active Inference is based on statistical mechanics (Friston et al. 2021). This means that, in principle, the theory is falsifiable within the naturalistic meta-ethical paradigm, though it would be very hard to conduct an appropriate experiment in the real world, and such an experiment might need evolutionary time (millions of years) to complete.

The proposition that human behaviour conforms to Active Inference as a physical theory of agency (if one accepts the theory at all) follows trivially from the proposition that humans are agents (or: that humans are biological organisms). As such, it is an obvious proposition, unlike the claim mentioned above that the process model of human intelligence at the neurobiological level implements Active Inference as an intelligence architecture.

Active Inference is not the only physical theory of agency in existence: Kolchinsky and Wolpert make an alternative proposal (Kolchinsky and Wolpert 2018). Chan, Yu, and You's principle of Maximal Coding Rate Reduction (MCR²) also looks to me like a physical theory, although the authors don't frame it as such (Chan et al. 2022).

If one subscribes to normative Active Inference, intelligence architectures should be scored on their fitness to the framework. In this comparison, Active Inference as an architecture is, of course, unbeatable, because the mathematics of the two conceptions of Active Inference (as an architecture and as a normative theory) are exactly the same.

However, if one desires an agent to have some extra characteristics beyond merely conforming to Active Inference as a normative theory (that is, beyond bare instrumental convergence, because normative Active Inference exactly coincides with instrumental convergence; note: not instead of instrumental convergence, but instrumental convergence and something else), Active Inference as an intelligence architecture can become suboptimal: other architectures can conform to Active Inference as a general normative theory while also providing those extra desired characteristics.

For example, in its most general form, normative Active Inference leaves unspecified where prior preferences come from; in a less general form, it states that prior preferences are beliefs, arrived at via variational Bayesian inference just like any other beliefs. The extra characteristics that engineers may want agents to have could concern precisely this source of prior preferences.
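
In the standard discrete-time formulation, the location of this "slot" is explicit in the expected free energy G(π) that scores a candidate policy π: prior preferences enter as a prior distribution over observations, p(o_τ) (often written as the vector C), and nowhere else. In the commonly used risk-plus-ambiguity decomposition (notation illustrative, following common usage rather than any single source):

```latex
G(\pi) \,=\, \sum_{\tau}
  \underbrace{D_{\mathrm{KL}}\bigl[\,q(o_\tau \mid \pi)\,\|\,p(o_\tau)\,\bigr]}_{\text{risk: predicted vs. preferred outcomes}}
  \,+\,
  \underbrace{\mathbb{E}_{q(s_\tau \mid \pi)}\bigl[\mathrm{H}[\,p(o_\tau \mid s_\tau)\,]\bigr]}_{\text{ambiguity}}.
```

An architecture that additionally constrains how p(o_τ) is produced, learned, or audited, rather than leaving it a free parameter of the framework, would be exactly the kind of "Active Inference plus something else" design discussed above.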

References

Chan, Kwan Ho Ryan, Yaodong Yu, Chong You, Haozhi Qi, John Wright, and Yi Ma. "ReduNet: A white-box deep network from the principle of maximizing rate reduction." Journal of Machine Learning Research 23, no. 114 (2022): 1-103.

Friston, Karl J., Lancelot Da Costa, and Thomas Parr. “Some interesting observations on the free energy principle.” Entropy 23, no. 8 (2021): 1076.

Kolchinsky, Artemy, and David H. Wolpert. "Semantic information, autonomous agency and non-equilibrium statistical physics." Interface Focus 8, no. 6 (2018): 20180041.

Parr, Thomas, Giovanni Pezzulo, and Karl J. Friston. Active inference: The free energy principle in mind, brain, and behavior. MIT Press, 2022.

Seth, Anil. Being you: A new science of consciousness. Penguin, 2021.
