Free Energy Principle

Last edit: 29 Oct 2025 22:53 UTC by Ariel Cheng

The Free Energy Principle (FEP) states that self-organizing systems which maintain a separation from their environments via a Markov blanket (including the brain and other physical systems) minimize their variational free energy (VFE) and expected free energy (EFE) via perception and action, respectively^[1]. Unlike other theories of agency, FEP treats action and perception as inference problems under similar objectives. In some cases, variational free energy reduces to prediction error, which is the difference between the predictions made about the environment and the actual outcomes experienced. The mathematical content of FEP is based on variational Bayesian methods.
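To make the variational free energy concrete, here is a minimal sketch for a discrete model with two hidden states and a single received observation. The numbers in `prior` and `likelihood` are hypothetical and chosen only for illustration. The sketch checks the standard variational Bayes result that VFE is an upper bound on surprise (negative log evidence), with equality when the approximate belief q equals the exact posterior.

```python
import math

# Hypothetical two-state world: hidden state s in {0, 1}, one observation o.
prior = [0.5, 0.5]        # p(s)
likelihood = [0.9, 0.2]   # p(o | s) for the observation actually received


def variational_free_energy(q):
    """F[q] = sum_s q(s) * (log q(s) - log p(o, s))."""
    total = 0.0
    for q_s, p_s, p_o_given_s in zip(q, prior, likelihood):
        if q_s > 0.0:  # 0 * log 0 is taken to be 0
            total += q_s * (math.log(q_s) - math.log(p_s * p_o_given_s))
    return total


# Exact Bayesian quantities, computed here only for comparison.
evidence = sum(p * l for p, l in zip(prior, likelihood))           # p(o)
posterior = [p * l / evidence for p, l in zip(prior, likelihood)]  # p(s | o)
surprise = -math.log(evidence)

print("surprise      :", surprise)
print("F at prior    :", variational_free_energy(prior))
print("F at posterior:", variational_free_energy(posterior))
```

In this toy case the exact posterior is cheap to compute, so minimizing F is unnecessary. The point of the variational formulation is that F can be evaluated and minimized even when the evidence p(o) is intractable.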

Although FEP has an extremely broad scope, it makes a number of very specific assumptions^[2] that may restrict its applicability to real-world systems. Ongoing theoretical work attempts to reformulate the theory to hold under more realistic assumptions. Some progress has been made: newer formulations of FEP, unlike their predecessors, do not assume a constant Markov blanket (but rather, some Markov blanket trajectory)^[3] and do not assume the existence of a non-equilibrium steady state^[4].

FEP has been influential in neuroscience and neuropsychology and, more recently, has been used to describe systems at all spatiotemporal scales, from cells and biological species to AIs and societies.

Process theories

Since FEP is an unfalsifiable mathematical principle, it does not make sense to ask whether FEP is true (it is true mathematically, given its assumptions). Rather, it makes sense to ask whether its assumptions hold for a given system and, if so, how that system implements the minimization of VFE and EFE. Unlike the FEP itself, a proposal of how some particular system minimizes VFE and EFE (a process theory) is falsifiable.

There are two FEP process theories most relevant to neuroscience.^[5] Predictive processing is a process theory of how VFE is minimized in brains during perception. Active Inference (AIF) is a process theory of the “action” part of FEP, which can also be seen as an agent architecture.
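As a rough illustration of what a predictive-processing process theory looks like computationally, the sketch below performs gradient descent on a Gaussian free energy for a single belief `mu`. It assumes an identity mapping from hidden cause to observation and fixed variances. These simplifications, and all names and numbers, are assumptions for illustration rather than a claim about how brains implement the scheme. Under these assumptions, VFE is (up to constants) a sum of precision-weighted squared prediction errors.

```python
def free_energy(mu, obs, prior_mean, var_obs, var_prior):
    """Gaussian free energy up to additive constants."""
    eps_obs = obs - mu            # sensory prediction error
    eps_prior = mu - prior_mean   # prior prediction error
    return 0.5 * (eps_obs ** 2 / var_obs + eps_prior ** 2 / var_prior)


def infer(obs, prior_mean, var_obs, var_prior, lr=0.05, steps=500):
    """Update the belief mu by gradient descent on the free energy."""
    mu = prior_mean
    for _ in range(steps):
        weighted_obs_error = (obs - mu) / var_obs
        weighted_prior_error = (mu - prior_mean) / var_prior
        # dF/dmu = -weighted_obs_error + weighted_prior_error, so step downhill.
        mu += lr * (weighted_obs_error - weighted_prior_error)
    return mu


obs, prior_mean, var_obs, var_prior = 2.0, 0.0, 0.5, 2.0
mu = infer(obs, prior_mean, var_obs, var_prior)

# For this linear-Gaussian case the exact posterior mean is the
# precision-weighted average of observation and prior mean.
exact = (obs / var_obs + prior_mean / var_prior) / (1 / var_obs + 1 / var_prior)
print("belief after descent:", mu)
print("exact posterior mean:", exact)
```

The belief settles where the two weighted prediction errors balance, which in this linear-Gaussian case coincides with the exact posterior mean.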

It has been argued^[6] that AIF as an agent architecture manages model complexity (i.e. the bias-variance tradeoff) and the exploration-exploitation tradeoff in a principled way; favours explicit, disentangled, and hence more interpretable belief representations; and is amenable to working within hierarchical systems of collective intelligence (which are seen as Active Inference agents themselves^[7]). Building ecosystems of hierarchical collective intelligence can be seen as a proposed solution to, and an alternative conceptualization of, the general problem of alignment.
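One way to see how the exploration-exploitation tradeoff enters is a decomposition of expected free energy that is common in the discrete-state active inference literature: risk (divergence of predicted outcomes from preferred outcomes) plus ambiguity (expected uncertainty of outcomes given states). The sketch below is a one-step toy example. The policies, `likelihood`, `preferences`, and `predicted_states` are all hypothetical, and the softmax policy selection is a simplified stand-in for a full Active Inference agent.

```python
import math

# Hypothetical toy problem: two hidden states, two outcomes, two policies.
# likelihood[s][o] = p(o | s); state 1 produces ambiguous outcomes.
likelihood = [[0.9, 0.1],
              [0.5, 0.5]]
# Preferred outcome distribution p(o): the agent prefers outcome 0.
preferences = [0.8, 0.2]
# predicted_states[policy] = q(s | policy), assumed given for this sketch.
predicted_states = {
    "go_to_state_0": [0.95, 0.05],
    "go_to_state_1": [0.05, 0.95],
}


def expected_free_energy(q_s):
    """G = risk + ambiguity for one step under predicted states q_s."""
    n_states = len(likelihood)
    n_outcomes = len(preferences)
    # Predicted outcomes q(o | policy) = sum_s q(s | policy) p(o | s).
    q_o = [sum(q_s[s] * likelihood[s][o] for s in range(n_states))
           for o in range(n_outcomes)]
    # Risk: KL divergence from predicted outcomes to preferred outcomes.
    risk = sum(q * math.log(q / p) for q, p in zip(q_o, preferences) if q > 0.0)
    # Ambiguity: expected entropy of the likelihood under predicted states.
    ambiguity = sum(
        q_s[s] * -sum(p * math.log(p) for p in likelihood[s] if p > 0.0)
        for s in range(n_states)
    )
    return risk + ambiguity


G = {name: expected_free_energy(q_s) for name, q_s in predicted_states.items()}

# Policies with lower expected free energy receive higher probability.
weights = {name: math.exp(-g) for name, g in G.items()}
total = sum(weights.values())
policy_probs = {name: w / total for name, w in weights.items()}

for name in G:
    print(name, "G =", round(G[name], 3), "prob =", round(policy_probs[name], 3))
```

Here the policy leading to state 0 scores better on both terms: its predicted outcomes are closer to the preferences (lower risk), and its outcomes are more informative about the state (lower ambiguity).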

Connections to other theories

While some proponents of AIF believe that it is a more principled rival to Reinforcement Learning (RL), it has been shown that AIF is formally equivalent to the control-as-inference formulation of RL.^[8] Additionally, AIF recovers Bayes-optimal reinforcement learning, optimal control theory, and Bayesian Decision Theory (aka EDT) under different simplifying assumptions^[9]^[10].

AIF is an energy-based model of intelligence. This makes FEP/Active Inference akin to Bengio’s GFlowNets^[11] and LeCun’s Joint Embedding Predictive Architecture (JEPA)^[12], which are also energy-based.

References

^[1] EFE is closely related to, and can be derived from, VFE. Action does not always minimize EFE; in some cases, it minimizes generalized free energy (a closely related quantity). See this figure for a brief overview.
^[2] E.g. (1) sensory, active, internal and external states have independent random fluctuations; (2) there exists an injective map between the mode of internal states and mode of external states; …etc.
^[3] Beck, Jeff, and Ramstead, Maxwell JD. “Dynamic Markov Blanket Detection for Macroscopic Physics Discovery.” arXiv preprint arXiv:2502.21217 (2025).
^[4] Friston, K., Da Costa, L., Sakthivadivel, D. A., Heins, C., Pavliotis, G. A., Ramstead, M., & Parr, T. (2023). “Path integrals, particular kinds, and strange things.” Physics of Life Reviews, 47, 35-62.
^[5] See “The free energy principle for action and perception: A mathematical review” (Buckley et al., 2017) for technical details.
^[6] Friston, Karl J., Maxwell JD Ramstead, Alex B. Kiefer, Alexander Tschantz, Christopher L. Buckley, Mahault Albarracin, Riddhi J. Pitliya et al. “Designing Ecosystems of Intelligence from First Principles.” arXiv preprint arXiv:2212.01354 (2022).
^[7] Kaufmann, Rafael, Pranav Gupta, and Jacob Taylor. “An active inference model of collective intelligence.” Entropy 23, no. 7 (2021): 830.
^[8] Millidge, Beren, Alexander Tschantz, Anil K. Seth, and Christopher L. Buckley. “On the relationship between active inference and control as inference.” In International workshop on active inference, pp. 3-11. Cham: Springer International Publishing, 2020.
^[9] Parr, Thomas, Giovanni Pezzulo, and Karl J. Friston. Active inference: the free energy principle in mind, brain, and behavior. MIT Press, 2022.
^[10] Friston, Karl, Lancelot Da Costa, Danijar Hafner, Casper Hesp, and Thomas Parr. “Sophisticated inference.” Neural Computation 33, no. 3 (2021): 713-763.
^[11] Bengio, Yoshua. “GFlowNet Tutorial.” (2022).
^[12] LeCun, Yann. “A path towards autonomous machine intelligence.” preprint posted on openreview (2022).