Price’s equation for neural networks

Price’s equation is a fundamental equation in genetics, which can be used to predict how traits will change due to evolution. It can be phrased in many ways, but for the current post I will use the following simplified continuous-time variant:

$$\frac{d\bar{z}}{dt} = \operatorname{Cov}_g(z, w)$$

Here, $z$ represents some genetic trait, $w$ represents the fitness of the organism, $g$ represents the genes of an organism, and $\operatorname{Cov}_g(z, w)$ represents the genetic covariance between the trait and the fitness. Usually people only use the $\operatorname{Cov}_g(z, w)$ part of the equation[1], but I’ve written out the definition

$$\operatorname{Cov}_g(z, w) = \mathbb{E}_g\!\left[(z - \bar{z})(w - \bar{w})\right]$$

because that will make the analogy to neural networks easier to see.
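
To make this concrete, here is a quick numerical sanity check of the continuous-time Price equation. The setup (a handful of genotypes with fixed trait values and fitnesses, with frequencies evolving under replicator dynamics) and the helper names are my own toy illustration:

```python
# Numerical check: under replicator dynamics, d(mean z)/dt = Cov(z, w).
import numpy as np

rng = np.random.default_rng(0)
n = 5                      # number of genotypes (arbitrary)
z = rng.normal(size=n)     # trait value of each genotype
w = rng.normal(size=n)     # fitness (growth rate) of each genotype
q = np.full(n, 1.0 / n)    # genotype frequencies in the population

def mean(x, q):
    return np.sum(q * x)

def cov(x, y, q):
    return mean((x - mean(x, q)) * (y - mean(y, q)), q)

# One tiny Euler step of replicator dynamics: dq_i/dt = q_i (w_i - mean w).
dt = 1e-6
q_next = q + dt * q * (w - mean(w, q))
q_next /= q_next.sum()     # renormalize (numerically a near no-op)

dz_dt = (mean(z, q_next) - mean(z, q)) / dt
print(dz_dt, cov(z, w, q))  # the two numbers should agree to ~6 decimals
```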

Neural network training and Price’s equation

Suppose we train a neural network’s weights $\theta$ using the following equation, where $L$ represents the loss for the network:

$$\frac{d\theta}{dt} = -\nabla_\theta L$$

In that case, if we have some property $z$ of the network (e.g. $z$ could represent how a classifier labels an image, or how an agent acts in a specific situation, or similar), then we can derive an equation for $z$’s evolution over time:

$$\frac{dz}{dt} = \nabla_\theta z \cdot \frac{d\theta}{dt} = -\nabla_\theta z \cdot \nabla_\theta L$$
Similar to how we have a concept of genetic covariance to represent the covariance linked to genes, we should probably also introduce a covariance concept linked to neural network weights, to make this cleaner to talk about. I’ll call that $\operatorname{NTC}$ (short for neural tangent covariance), defined as:

$$\operatorname{NTC}(y, z) = \nabla_\theta y \cdot \nabla_\theta z = \sum_i \frac{\partial y}{\partial \theta_i} \frac{\partial z}{\partial \theta_i}$$

Furthermore, to make it closer to being analogous, we might replace the loss $L$ with a “fitness” $u = -L$, yielding the following equation for predicting the evolution of any property $z$ with training under gradient descent:

$$\frac{dz}{dt} = \operatorname{NTC}(z, u)$$
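
Here is a minimal sketch of that identity in JAX: under gradient flow on $u = -L$, a property $z$ changes at rate $\operatorname{NTC}(z, u)$. The tiny network, the choice of $z$, and the loss are all made-up illustrations, not anything canonical:

```python
# Check that dz/dt under gradient flow matches NTC(z, u) for u = -L.
import jax
import jax.numpy as jnp

jax.config.update("jax_enable_x64", True)  # double precision for a clean check

x0 = jnp.array([1.0, -0.5, 0.25])          # a single fixed input

def net(theta, x):
    return jnp.tanh(theta @ x)             # toy one-layer "network"

def loss(theta):
    return (net(theta, x0) - 1.0) ** 2     # L: squared error on x0

def z(theta):
    return net(theta, x0)                  # z: the network's output on x0

def ntc(f, g, theta):
    # Neural tangent covariance: dot product of parameter gradients.
    return jnp.dot(jax.grad(f)(theta), jax.grad(g)(theta))

u = lambda theta: -loss(theta)             # fitness = negative loss

theta = jnp.array([0.3, -0.2, 0.1])
dt = 1e-6
theta_next = theta + dt * jax.grad(u)(theta)  # one Euler step of gradient flow

dz_dt = (z(theta_next) - z(theta)) / dt
print(dz_dt, ntc(z, u, theta))             # should agree up to O(dt)
```
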
This makes a bunch of idealistic assumptions about the training process, e.g. that we have an exact measure of the full gradient. It might be worth relaxing the math to more realistic assumptions and checking how much still applies. But for now, let’s just charge ahead with the unrealistic assumptions.

Covariance niceties

Covariances play nicely with linear causal effects. If $f$ and $g$ are linear transformations, then $\operatorname{Cov}(f(x), g(y)) = f \operatorname{Cov}(x, y) g^\top$; in the scalar case this reduces to $\operatorname{Cov}(ax, by) = ab \operatorname{Cov}(x, y)$, and the same holds for $\operatorname{NTC}$, since it is bilinear in its arguments.

For instance, suppose you have a reinforcement learner that has learned to drink juice when close to it. Suppose further that now the main determinant for whether it gets reward is whether it approaches juice when it sees juice. We might formalize that effect as $r \approx f \cdot a$, where $r$ is the reward given to the agent, $f$ is the frequency at which it sees juice that it can approach, and $a$ is its likelihood of approaching juice if it sees it.

We can then compute (treating the reward $r$ as the fitness $u$): $\frac{da}{dt} = \operatorname{NTC}(a, r) = \operatorname{NTC}(a, f \cdot a) = f \cdot \operatorname{NTC}(a, a)$.
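
Continuing the JAX sketch from before (reusing `ntc`, `z`, and `theta` from there, with $a$ standing in for the approach property and a made-up constant frequency $f = 0.1$), the linearity step is easy to confirm numerically:

```python
# NTC(a, f * a) = f * NTC(a, a), by bilinearity of NTC.
a = z                            # stand-in for the approach property
f = 0.1                          # hypothetical juice-sighting frequency
r = lambda theta: f * a(theta)   # reward as a linear effect of a

print(ntc(a, r, theta))          # equals f * NTC(a, a) ...
print(f * ntc(a, a, theta))      # ... up to floating point
```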

$\operatorname{NTC}(a, a)$ is a special quantity which we could call the neural tangent variance, $\operatorname{NTV}(a)$. It represents the degree to which $a$ is sensitive to the neural network parameters. In common situations, this will depend on the structure of the network, but also more directly on the nature and value of $a$.

For instance, if $a$ is the expectation of a binary variable with a probability $p$ of being 1, then I bet there is going to be a Bernoulli-distribution aspect to it, such that $\operatorname{NTV}(a)$ is approximately proportional to $p(1-p)$, though likely with a scale factor that depends on the network architecture or parameters, rather than being exactly equal to it.
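
One way to see where such a factor could enter (my own illustration; it assumes the property is the output of a sigmoid unit, $a = \sigma(s(\theta))$ with $a = p$, which is just one possible parameterization): since $\sigma' = \sigma(1 - \sigma)$, the chain rule gives

$$\nabla_\theta a = a(1 - a)\,\nabla_\theta s, \qquad \operatorname{NTV}(a) = \|\nabla_\theta a\|^2 = \big(p(1 - p)\big)^2 \|\nabla_\theta s\|^2$$

In this parameterization the factor actually enters squared, with $\|\nabla_\theta s\|^2$ playing the role of the architecture-dependent scale factor; either way, $\operatorname{NTV}(a)$ vanishes as $p$ approaches 0 or 1.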

In particular, this means that if $p$ is very low (in the juice example, if it is exceedingly rare for the agent to approach juice it sees), then $p(1-p)$ will also be very low, and this will make $\operatorname{NTV}(a)$ low and therefore also make $\frac{da}{dt}$ low.

  1. ^

    And usually people also put in other terms to account for various distortions.