Fixing The Good Regulator Theorem

Conant & Ashby’s “Every Good Regulator Of A System Must Be A Model Of That System” opens with:

The design of a complex regulator often includes the making of a model of the system to be regulated. The making of such a model has hitherto been regarded as optional, as merely one of many possible ways.

In this paper a theorem is presented which shows, under very broad conditions, that any regulator that is maximally both successful and simple must be isomorphic with the system being regulated. (The exact assumptions are given.) Making a model is thus necessary.

This may be the most misleading title and summary I have ever seen on a math paper. If by “making a model” one means the sort of thing people usually do when model-making (i.e. reconstructing a system’s variables/parameters/structure from some information about them), then Conant & Ashby’s claim is simply false.

What they actually prove is that every regulator which is optimal and contains no unnecessary noise is equivalent to a regulator which first reconstructs the variable-values of the system it’s controlling, then chooses its output as a function of those values (ignoring the original inputs). This does not mean that every such regulator actually reconstructs the variable-values internally. And Conant & Ashby’s proof has several shortcomings even for this more modest claim.

This post presents a modification of the Good Regulator Theorem, and provides a reasonably-general condition under which any optimal minimal regulator must actually construct a model of the controlled system internally. The key idea is conceptually similar to some of the pieces from Risks From Learned Optimization. Basically: an information bottleneck can force the use of a model, in much the same way that an information bottleneck can force the use of a mesa-optimizer. Along the way, we’ll also review the original Good Regulator Theorem and a few minor variants which fix some other problems with the original theorem.

The Original Good Regulator Theorem

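We’re interested mainly in this causal diagram:

[Diagram: $S \to X \to R \to Z$, with a second arrow $S \to Z$. Here $S$ is the system state, $X$ is the regulator’s input, $R$ is the regulator’s output, and $Z$ is the outcome.]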

The main goal is to choose the regulator policy $P[R|X]$ to minimize the entropy of outcome $Z$. Later sections will show that this is (roughly) equivalent to expected utility maximization.

After explaining this problem, Conant & Ashby replace it with a different problem, which is not equivalent, and they do not bother to point out that it is not equivalent. They just present roughly the diagram above, and then their actual math implicitly uses this diagram instead:

[Diagram: $S \to R \to Z$, with a second arrow $S \to Z$. The regulator reads $S$ directly.]

Rather than choosing a regulator policy $P[R|X]$, they instead choose a policy $P[R|S]$. In other words: they implicitly assume that the regulator has perfect information about the system state (and their proof does require this). Later, we’ll talk about how the original theorem generalizes to situations where the regulator does not have perfect information. But for now, I’ll just outline the argument from the paper.

We’ll use two assumptions:

  • The entropy-minimizing distribution of $Z$ is unique (i.e. if two different policies both achieve minimum entropy, they both produce the same $Z$-distribution). This assumption avoids a bunch of extra legwork which doesn’t really add any substance to the theorem.

  • $Z$ is a deterministic function of $(R, S)$. Note that we can always make this hold by including any nondeterministic inputs to $Z$ in $S$ itself (though that trick only works if we allow $R$ to have imperfect information about $S$, which violates Conant & Ashby’s setup… more on that later).

The main lemma then says: for any optimal regulator $P[R|S]$, $Z$ is a deterministic function of $S$. Equivalently: all $R$-values $r$ with nonzero probability (for a given $S$-value $s$) must give the same $Z(r, s)$.

Intuitive argument: if the regulator could pick two different $Z$-values (given $S$), then it can achieve strictly lower entropy by always picking whichever one has higher probability $P[Z]$ (unconditional on $S$). Even if the two have the same $P[Z]$, always picking one or the other gives strictly lower entropy (since the one we pick will end up with higher $P[Z]$ once we pick it more often). If the regulator is optimal, then achieving strictly lower entropy is impossible, hence it must always pick the same $Z$-value given the same $S$-value. For that argument unpacked into a formal proof, see the paper.
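The intuition can be checked numerically on a toy case. This is a sketch, not a proof: the fair-coin system, the outcome map $Z(r, s) = r \oplus s$, the function name `outcome_entropy`, and both example policies are assumptions made up for illustration.

```python
import math
from collections import defaultdict

def outcome_entropy(p_s, policy, z):
    """Entropy in bits of Z, where S ~ p_s, R ~ policy[s], and Z = z(r, s)."""
    p_z = defaultdict(float)
    for s, prob_s in p_s.items():
        for r, prob_r in policy[s].items():
            p_z[z(r, s)] += prob_s * prob_r
    return sum(-p * math.log2(p) for p in p_z.values() if p > 0)

# Toy setup (assumed for illustration): fair-coin system, Z = R xor S.
p_s = {0: 0.5, 1: 0.5}
z = lambda r, s: r ^ s

# Policy A: given S=0, the regulator mixes between two different Z-values.
mixed = {0: {0: 0.9, 1: 0.1}, 1: {1: 1.0}}
# Policy B: given each S-value, the regulator always produces Z = 0.
committed = {0: {0: 1.0}, 1: {1: 1.0}}

h_mixed = outcome_entropy(p_s, mixed, z)
h_committed = outcome_entropy(p_s, committed, z)
print(h_mixed, h_committed)
```

In this toy case the mixed policy leaves about 0.286 bits of entropy in $Z$, while the committed policy leaves none, which is the direction the lemma predicts.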

With the lemma nailed down, the last step in Conant & Ashby’s argument is that any remaining nondeterminism in $P[R|S]$ is “unnecessary complexity”. All $R$-values chosen with nonzero probability for a given $S$-value must yield the same $Z$ anyway, so there’s no reason to have more than one of them. We might as well make $R$ a deterministic function of $S$.

Thus: every “simplest” optimal regulator (in the sense that it contains no unnecessary noise) is a “model” of the system (in the sense that the regulator output $R$ is a deterministic function of the system state $S$).
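To make “unnecessary noise” concrete, here is a minimal sketch under assumed toy definitions: the outcome map `z`, the policy `noisy_policy`, and the helper names are all hypothetical. The policy is optimal (every $S$-value leads to a single $Z$-value) but still randomizes between two interchangeable $R$-values, and `simplify` strips that randomness without changing any outcome.

```python
def z(r, s):
    # Assumed toy outcome: only the parity of r matters, so r=0 and r=2 are interchangeable.
    return (r % 2) ^ s

# An optimal but noisy policy P[R|S]: for S=0 it randomizes between r=0 and r=2.
noisy_policy = {0: {0: 0.5, 2: 0.5}, 1: {1: 1.0}}

def outcomes_per_state(policy):
    """Set of Z-values reachable with nonzero probability, for each S-value."""
    return {s: {z(r, s) for r, p in dist.items() if p > 0} for s, dist in policy.items()}

def simplify(policy):
    """Keep one supported R-value per S-value (the smallest; the choice is arbitrary)."""
    return {s: min(r for r, p in dist.items() if p > 0) for s, dist in policy.items()}

reachable = outcomes_per_state(noisy_policy)
r_of_s = simplify(noisy_policy)
# The deterministic map s -> r reproduces exactly the outcomes of the noisy policy.
same_outcomes = all(reachable[s] == {z(r_of_s[s], s)} for s in noisy_policy)
print(reachable, r_of_s, same_outcomes)
```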

The Problems

There are two immediate problems with this theorem:

  • The notion of “model” is rather silly—e.g. the system could be quite complex, but the regulator could be an identity function, and it would count as a “model”

  • The regulator is assumed to have perfect knowledge of the system state (i.e. second diagram rather than first)

Also, though I don’t consider it a “problem” so much as a choice which I think most people here will find more familiar:

  • The theorem uses entropy-minimization as its notion of optimality, rather than expected-utility-maximization

We’ll address all of these in the next few sections. Making the notion of “model” less silly will take place in two steps—the first step to make it a little less silly while keeping around most of the original’s meaning, the second step to make it a lot less silly while changing the meaning significantly.

Making The Notion Of “Model” A Little Less Silly

The notion of “model” basically says “$R$ is a model of $S$ iff $R$ is a deterministic function of $S$”, the idea being that the regulator needs to reconstruct the value of $S$ from its inputs in order to choose its outputs. But the proof-as-written-in-the-paper assumes that $R$ takes $S$ as an input directly (i.e. the regulator chooses $P[R|S]$), so really the regulator doesn’t need to “model” $S$ in any nontrivial sense in order for $R$ to be a deterministic function of $S$. For instance, the regulator could just be the identity function: it takes in $S$ and returns $S$. This does not sound like a “model”.

Fortunately, we can make the notion of “model” nontrivial quite easily:

  • Assume that $S$ is a deterministic function of $X$

  • Assume that the regulator takes $X$ as input, rather than $S$ itself

The whole proof actually works just fine with these two assumptions, and I think this is what Conant & Ashby originally intended. The end result is that the regulator output $R$ must be a deterministic function of $S$, even if the regulator only takes $X$ as input, not $S$ itself (assuming $S$ is a deterministic function of $X$, i.e. the regulator has enough information to perfectly reconstruct $S$).

Note that this still does not mean that every optimal, not-unnecessarily-nondeterministic regulator actually reconstructs $S$ internally. It only shows that any optimal, not-unnecessarily-nondeterministic regulator is equivalent to one which reconstructs $S$ and then chooses its output as a deterministic function of $S$ (ignoring $X$).
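The gap between “is equivalent to” and “actually reconstructs” can be seen in a small sketch. Everything here is a hypothetical toy: inputs are pairs `(s, junk)`, `state_of` recovers $S$ from $X$, and `raw_table` is a regulator stored as a plain lookup on $X$ that never computes $S$.

```python
# Hypothetical toy: each input x = (s, junk) determines the system state s = x[0].
def state_of(x):
    return x[0]

# A regulator written as a raw lookup table on X. Nothing in it computes S explicitly.
raw_table = {(0, 0): 0, (0, 1): 0, (1, 0): 1, (1, 1): 1}

def factor_through_state(table):
    """Return the map s -> r if the table's output depends on x only via S, else None."""
    r_of_s = {}
    for x, r in table.items():
        s = state_of(x)
        if r_of_s.setdefault(s, r) != r:
            return None
    return r_of_s

r_of_s = factor_through_state(raw_table)

def model_based(x):
    # Equivalent regulator: reconstruct S from X, then act on S alone.
    return r_of_s[state_of(x)]

equivalent = all(model_based(x) == raw_table[x] for x in raw_table)
print(r_of_s, equivalent)
```

Both regulators have the same input-output behavior, but only `model_based` has an internal step that could be called reconstructing $S$. The theorem as stated cannot tell them apart.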

Minimum Entropy → Maximum Expected Utility And Imperfect Knowledge

I think the theorem is simpler and more intuitive in a maximum expected utility framework, besides being more familiar.

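We choose a policy function $(x \mapsto P[R|X=x])$ to maximize expected utility $E[u(Z)]$, where $u$ is the utility function over outcomes. Since there’s no decision-theoretic funny business in this particular setup, we can maximize for each $X$-value independently:

$$\max_{(x \mapsto P[R|X=x])} E[u(Z)] = \sum_x P[X=x] \max_{P[R|X=x]} \sum_{r,s} u(Z(r,s)) \, P[R=r|X=x] \, P[S=s|X=x]$$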

Key thing to note: when two $X$-values yield the same distribution function $(s \mapsto P[S=s|X])$, the maximization problem

$$\max_{P[R|X=x]} \sum_{r,s} u(Z(r,s)) \, P[R=r|X=x] \, P[S=s|X=x]$$

is exactly the same for those two $X$-values. So, we might as well choose the same optimal distribution $P[R|X]$, even if there are multiple optimal options. Using different optima for different $X$, even when the maximization problems are the same, would be “unnecessary complexity” in exactly the same sense as Conant & Ashby’s theorem.

So: every “simplest” (in the sense that it does not have any unnecessary variation in decision distribution) optimal (in the sense that it maximizes expected utility) regulator is a deterministic function of the posterior distribution of the system state $(s \mapsto P[S=s|X])$. In other words, there is some equivalent regulator which first calculates the Bayesian posterior of $S$ given $X$, then throws away $X$ and computes its output just from that distribution.

This solves the “imperfect knowledge” issue for free. When input data $X$ is not sufficient to perfectly estimate the system state $S$, our regulator output is a function of the posterior distribution of $S$, rather than $S$ itself.

When system state $S$ can be perfectly estimated from inputs, the distribution $(s \mapsto P[S=s|X])$ is itself a deterministic function of $S$, therefore the regulator output will also be a deterministic function of $S$.
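A small sketch of the expected-utility version, with imperfect knowledge. The joint distribution, the utility function, and the names `posterior` and `best_action` are all assumptions chosen for illustration. Two inputs (`"a"` and `"b"`) have different probabilities but the same posterior over $S$, so they pose the same maximization problem and get the same action.

```python
from fractions import Fraction as F

# Hypothetical joint distribution P[S, X] over S in {0, 1} and X in {"a", "b", "c"}.
joint = {
    (0, "a"): F(1, 10), (1, "a"): F(3, 10),   # posterior P[S=1|a] = 3/4
    (0, "b"): F(1, 20), (1, "b"): F(3, 20),   # posterior P[S=1|b] = 3/4
    (0, "c"): F(3, 10), (1, "c"): F(1, 10),   # posterior P[S=1|c] = 1/4
}
STATES, ACTIONS = (0, 1), (0, 1)

def utility(r, s):
    # Assumed utility u(Z(r, s)): 1 if the regulator's output matches the state, else 0.
    return 1 if r == s else 0

def posterior(x):
    """Bayesian posterior (s -> P[S=s | X=x]) as a tuple indexed by s."""
    p_x = sum(joint[(s, x)] for s in STATES)
    return tuple(joint[(s, x)] / p_x for s in STATES)

def best_action(post):
    """Expected-utility-maximizing action; it sees x only through the posterior."""
    return max(ACTIONS, key=lambda r: sum(utility(r, s) * post[s] for s in STATES))

policy = {x: best_action(posterior(x)) for x in ("a", "b", "c")}
print(policy)
```

Here `best_action` never receives `x` itself, which mirrors the “compute the posterior, then throw away $X$” structure of the equivalent regulator.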

Important note: I am not sure whether this result holds for minimum entropy. It is a qualitatively different problem, and in some ways more interesting: it’s more like an embedded agency problem, since decisions for one input-value can influence the optimal choice given other $X$-values.

Making The Notion Of “Model” A Lot Less Silly

Finally, the main event. So far, we’ve said that regulators which are “optimal” and “simple” in various senses are equivalent to regulators which “use a model”—i.e. they first estimate the system state, then make a decision based on that estimate, ignoring the original input. Now we’ll see a condition under which “optimal” and “simple” regulators are not just equivalent to regulators which use a model, but in fact must use a model themselves.

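Here’s the new picture:

[Diagram: the regulator compresses $X$ into $M$, then receives $Y$, then chooses $R$ from $M$ and $Y$.]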

Our regulator now receives two “rounds” of data ($X$, then $Y$) before choosing the output $R$. In between, it chooses what information from $X$ to keep around; the retained information is the “model” $M$. The interesting problem is to prove that, under certain conditions, $M$ will have properties which make the name “model” actually make sense.

Conceptually, $Y$ “chooses which game” the regulator will play. In order to achieve optimal play across all “possible games” $Y$ might choose, $M$ has to keep around any information relevant to any possible game. However, each game just takes $S$ as input (not $X$ directly), so at most $M$ has to keep around all the information relevant to $S$. So: with a sufficiently rich “set of games”, we expect that $M$ will have to contain all information from $X$ relevant to $S$.

On the flip side, we want this to be an information bottleneck: we want $M$ to contain as little information as possible (in an information-theoretic sense), while still achieving optimality. Combining this with the previous paragraph: we want $M$ to contain as little information as possible, while still containing all information from $X$ relevant to $S$. That’s exactly the condition for the Minimal Map Theorem: $M$ must be (isomorphic to) the Bayesian distribution $(s \mapsto P[S=s|X])$.

That’s what we’re going to prove: if $M$ is a minimum-information optimal summary of $X$, for a sufficiently rich “set of games”, then $M$ is isomorphic to the Bayesian posterior distribution on $S$ given $X$, i.e. $(s \mapsto P[S=s|X])$. That’s the sense in which $M$ is a “model”.

As in the previous section, we can independently optimize for each $X$-value. For each $x$, the regulator solves:

$$\max_{(y \mapsto P[R|X=x,Y=y])} E[u(Z) \mid X=x]$$

Conceptually, our regulator sees the $X$-value, then chooses a strategy $(y \mapsto P[R|X,Y=y])$, i.e. it chooses the distribution from which $R$ will be drawn for each $Y$ value.

We’ll start with a simplifying assumption: there is a unique optimal regulator $P[R|X,Y]$. (Note that we’re assuming the full black-box optimal function of the regulator is unique; there can still be internally-different optimal regulators with the same optimal black-box function, e.g. using different maps from $X$ to $M$.) This assumption is mainly to simplify the proof; the conclusion survives without it, but we would need to track sets of optimal strategies everywhere rather than just “the optimal strategy”, and the minimal-information assumption would ultimately substitute for uniqueness of the optimal regulator.

If two $X$-values yield the same Bayesian posterior $(s \mapsto P[S=s|X])$, then they must yield the same optimal strategy $(y \mapsto P[R|X,Y=y])$. Proof: the optimization problems are the same, and the optimum is unique, so the strategy is the same. (In the non-unique case, picking different strategies would force $M$ to contain strictly more information, since $M$ would have to take different values on those two $X$-values, so the minimal-information optimal regulator will pick identical strategies whenever it can do so. Making this reasoning fully work when several strategies are optimal takes a bit of effort and doesn’t produce much useful insight, but it works.)

The next step is more interesting: given a sufficiently rich set of games, not only is the strategy a function of the posterior, the posterior is a function of the strategy. If two $X$-values yield the same strategy $(y \mapsto P[R|X,Y=y])$, then they must yield the same Bayesian posterior $(s \mapsto P[S=s|X])$. What do we mean by “sufficiently rich set of games”? Well, given two different posterior distributions $(s \mapsto P[S=s|X=x_1])$ and $(s \mapsto P[S=s|X=x_2])$, there must be some particular $Y$-value $y$ for which the optimal strategy $P[R|X=x_1,Y=y]$ is different from $P[R|X=x_2,Y=y]$. The key is that we only need one $Y$-value for which the optimal strategies differ between $x_1$ and $x_2$.

So: by “sufficiently rich set of games”, we mean that for every pair of $X$-values with different Bayesian posteriors $(s \mapsto P[S=s|X])$, there exists some $Y$-value for which the optimal strategy $P[R|X,Y]$ differs. Conceptually: “sufficiently rich set of games” means that for each pair of two different possible posteriors $(s \mapsto P[S=s|X])$, $Y$ can pick at least one “game” (i.e. optimization problem) for which the optimal policy is different under the two posteriors.

From there, the proof is easy. The posterior is a function of the strategy, the strategy is a function of $M$, therefore the posterior is a function of $M$: two different posteriors $(s \mapsto P[S=s|X=x_1])$ and $(s \mapsto P[S=s|X=x_2])$ must have two different “models” $M(x_1)$ and $M(x_2)$. On the other hand, we already know that the optimal strategy is a function of the posterior $(s \mapsto P[S=s|X])$, so in order for $M$ to be information-minimal it must not distinguish between $X$-values with the same posterior. Thus: $M(x_1) = M(x_2)$ if-and-only-if $(s \mapsto P[S=s|X=x_1]) = (s \mapsto P[S=s|X=x_2])$. The “model” $M$ is isomorphic to the Bayesian posterior $(s \mapsto P[S=s|X])$.
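The two directions of the argument can be illustrated with a toy “set of games”. This is a sketch under made-up assumptions: $S$ is binary, each $X$-value is summarized by a hypothetical posterior $P[S=1|X=x]$, and each $Y$-value is a betting game with a threshold. Grouping $X$-values by optimal strategy then gives the same partition as grouping them by posterior, and that partition is what a minimal $M$ has to encode.

```python
from fractions import Fraction as F

# Hypothetical posteriors P[S=1 | X=x] for four X-values; x0 and x1 share a posterior.
p1_given_x = {"x0": F(1, 5), "x1": F(1, 5), "x2": F(1, 2), "x3": F(4, 5)}

# Hypothetical games, indexed by Y. In game y the regulator may bet (r=1) or pass (r=0).
# Betting pays 1 - t_y if S=1 and -t_y if S=0, so its expected utility is P[S=1|x] - t_y.
thresholds = {"y0": F(7, 20), "y1": F(13, 20)}

def strategy(x):
    """Optimal strategy (y -> r) for input x: bet exactly when the posterior beats t_y."""
    return tuple(int(p1_given_x[x] > t) for t in thresholds.values())

def partition(label):
    """Group X-values that share the same label; return the groups in sorted order."""
    groups = {}
    for x in p1_given_x:
        groups.setdefault(label(x), set()).add(x)
    return sorted(sorted(g) for g in groups.values())

by_posterior = partition(lambda x: p1_given_x[x])
by_strategy = partition(strategy)
print(by_posterior, by_strategy)
```

The thresholds were chosen to fall between every pair of distinct posteriors, which is what makes this set of games “sufficiently rich”. With only the `"y1"` game, `x2` would get the same strategy as `x0` and `x1`, and a minimal $M$ would no longer need to separate their posteriors.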


When should a regulator use a model internally? We have four key conditions:

  • The regulator needs to make optimal decisions (in an expected utility sense)

  • Information arrives in more than one timestep/chunk ($X$, then $Y$), and needs to be kept around until decision time

  • Keeping/passing information is costly: the amount of information stored/passed needs to be minimized (while still achieving optimal control)

  • Later information can “choose many different games”: specifically, whenever the posterior distribution of system-state $S$ given two possible $X$ values is different, there must be at least one $Y$ value under which optimal play differs for the two $X$ values.

Conceptually, because we don’t know what game we’re going to play, we need to keep around all the information potentially relevant to any possible game. The minimum information which can be kept, while still keeping all the information potentially relevant to any possible game, is the Bayesian posterior on system state $S$. There’s still a degree of freedom in how we encode the posterior on $S$ (that’s the “isomorphism” part), but the “model” $M$ definitely has to store exactly the posterior.