Optimization Concepts in the Game of Life

Abstract: We define robustness and retargetability (two of Flint’s measures of optimization) in Conway’s Game of Life and apply the definitions to a few examples. The same approach likely works in most embedded settings, and provides a frame for conceptualizing and quantifying these aspects of agency. We speculate on the relationship between robustness and retargetability, and identify various directions for future work.


We would like to better understand the fundamental principles of agency (and related phenomena including optimization and goal-directedness). We focus on agency because we believe agency is a core source of risk from AI systems, especially in worlds with one (or few) most-capable systems. The goals of the most competent consequence-driven systems are more likely to be achieved, because trying outperforms not trying or less competent trying. We do not want to create a world where such systems are working against us. By better understanding agency, we hope to improve our ability to avoid mistakenly building systems working capably against us, and to correct course if we do.

A rich source of confusions about agency comes from attending to the fact that goal-directed systems are part of – embedded in – the environment that their goals are about. Most practical work on AI avoids the confusions of embedded agency by constructing and enforcing a Cartesian boundary between agent and environment, using frameworks such as reinforcement learning (RL) that define an interaction protocol. We focus on embedded agency because we expect not to be able to enforce a Cartesian boundary for highly capable agents in general domains, and, as a particularly strong instance of this, because agents may emerge unexpectedly in systems where we did not design how they interface with the rest of the world.

Our approach to deconfusion in this post is to identify concepts that seem relevant to embedded agency but do not have technical definitions, to propose some definitions, and see how they fare on some examples. More generally, we are interested in analyzing small examples of agency-related phenomena in the hope that some examples will be simple enough to yield insight while retaining essential features of the phenomenon.

Optimization in the Game of Life


We draw two concepts from Alex Flint’s essay The Ground of Optimization. Flint defines an optimizing system as a system that evolves towards a small set of target configurations from a broad basin of attraction, despite perturbations. The essay introduces measures for quantifying optimization systems. One is robustness: how robust to perturbations is the process of reaching the target set, e.g. the number of dimensions on which perturbations can be made or the magnitude of the perturbations. Another measure is retargetability: whether the system can be transformed into another optimizing system with a different target configuration set via a small change.

Here, we develop more precise definitions of these concepts by concentrating on a particular concrete domain: Conway’s Game of Life. This is a natural setting for studying embedded agency because it is a deterministic environment with no pre-specified Cartesian boundaries, which is rich enough to support emergent goal-directed behavior, yet simple enough to define the concepts above explicitly.


Before getting to the definitions, let’s look at how we might draw analogies between some of the examples of systems (including optimizing systems) from the Ground of Optimization post and structures in the Game of Life.

The Ground of Optimization    | Game of Life                 | Optimizing system?
Bottle cap                    | Block                        | No
Satellite in orbit            | Glider                       | No
Ball in a valley              | Eater                        | Yes
Ball in a valley with robot   | Mobile eater (hypothetical)  | Yes

A block is like a bottle cap in that it has been designed (or selected) to stay in place and not spontaneously disintegrate, but it does not robustly produce more specific outcomes than simply existing, and can easily be perturbed away from this state.

A glider is like a satellite in orbit: it can be redirected but does not recover its original trajectory on perturbation.

An eater is like a ball in a valley in the sense that it ends up in the same state from a variety of starting configurations. This is the state with the eater alone on the board, analogous to the state with the ball at the bottom of the valley.

We can imagine a hypothetical “mobile eater” that walks around looking for other patterns to consume. This would be more robust than the regular eater, similarly to a ball in a valley with a robot, which is more robust than just a ball in a valley.

EDIT: Note that any finite pattern in Life (such as the empty board) is robust to introducing non-viable collections of cells in the empty areas of the pattern. We originally thought that this would make the empty board an optimizing system, but by this criterion any finite pattern is an optimizing system, which is not very interesting.

Preliminary Definitions

Like any embedded setting, Life does not come with a privileged Cartesian boundary. Instead we will define an operation, instantiation, that combines an agent with an environment, and thereby substantiates counterfactual questions such as “What would this agent do in a different context?” that are otherwise meaningless in a deterministic non-Cartesian world.

What kinds of things are agents and environments? We start with a very general mathematical object, a pattern, which we define as simply a state of the Game of Life world. That is, a pattern is an infinite two-dimensional Boolean grid, or equivalently a function of type ℤ×ℤ → {true, false}, indicating which cells are alive and which are dead. A pattern is finite if it has only finitely many cells alive.
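As a concrete illustration (our own sketch, not part of the original definitions), a finite pattern can be represented by the set of its live cells, with the Boolean-grid view recovered as a membership test:

```python
from typing import FrozenSet, Tuple

Cell = Tuple[int, int]
# A finite pattern, represented by its (finitely many) live cells.
Pattern = FrozenSet[Cell]

def alive(p: Pattern, x: int, y: int) -> bool:
    """View a pattern as its characteristic function of type ZxZ -> bool."""
    return (x, y) in p

# Example: a 2x2 block, one of the simplest still lifes.
block: Pattern = frozenset({(0, 0), (0, 1), (1, 0), (1, 1)})
```

This representation only covers finite patterns; infinite patterns would need a different encoding (e.g., an arbitrary predicate on cells).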

We represent an agent as a finite pattern and an environment as a context (formally defined as a pattern). Thus, agents and environments have the same type signature, since they are made of the same “stuff” in an embedded setting.

To put the two together, we make use of a third concept, also formally represented by a pattern, which we call a mask, which specifies which parts of the context are the “holes” the agent is supposed to fit into (and replace whatever else was there). As mentioned above, the operation that combines agent and environment is instantiation:

Definition. The instantiation of p in context c using mask m is the pattern c(p) given by c(p)(x) = p(x) wherever m(x) is true, and c(p)(x) = c(x) otherwise,
where c is a context, m is a mask, and p is a pattern (the “agent”). (We suppress the mask in the notation c(p) when it is clear from context.)

Instantiating p in c results in the pattern that is the same as p wherever the mask is true, and the same as c everywhere else. By default we take the mask to be the padding mask of one cell around all the agent’s live cells: m(x) is true iff x is within distance 1, in each coordinate, of a live cell of p.
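Under the set-of-live-cells representation, instantiation is a short function. This is our own illustrative sketch; `padding_mask` and `instantiate` are hypothetical helper names, not from the post:

```python
def padding_mask(agent):
    """Default mask: every cell within one step (including diagonals)
    of a live cell of the agent."""
    return {(x + dx, y + dy)
            for (x, y) in agent
            for dx in (-1, 0, 1)
            for dy in (-1, 0, 1)}

def instantiate(context, agent, mask=None):
    """c(p): the pattern that agrees with the agent wherever the mask is
    true, and with the context everywhere else."""
    if mask is None:
        mask = padding_mask(agent)
    return ({cell for cell in context if cell not in mask}
            | {cell for cell in agent if cell in mask})

block = {(0, 0), (0, 1), (1, 0), (1, 1)}
# (2, 2) lies inside the block's padding mask and is overwritten;
# the far-away cell (5, 5) survives instantiation.
board = instantiate({(5, 5), (2, 2)}, block)
```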

In any deterministic discrete dynamical system, if we have an operation like instantiation that can combine two states of the system to produce another, then we can similarly represent potential agents and their surroundings by system states. This might allow these definitions to be generalized to other settings besides the Game of Life.

We’ll use the following notation for computations and properties in discrete dynamical systems:

  • Given a state p (we use p because our Life states are patterns), step(p) is one step of evolution according to the system’s dynamics.

  • The sequence s = (p, step(p), step(step(p)), …), i.e., sₙ = stepⁿ(p), is the computation seeded at p (or a “trajectory” in dynamical systems terminology).

  • A property is a set of states (patterns).

  • A property P is achieved by a computation s if there exists some number of steps n such that sₙ ∈ P. A property P is fixed by a computation s if sₙ ∈ P for all n above some bound.
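These notions are directly executable on finite patterns represented as sets of live cells. Below is a standard sparse implementation of the Life step, plus a bounded check of “P is achieved” (a sketch: a true check would require an unbounded search, and the helper names are our own):

```python
from collections import Counter
from itertools import product

def step(live):
    """One generation of Life: a cell is alive next step iff it has exactly
    3 live neighbours, or is alive now and has exactly 2."""
    counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live
        for dx, dy in product((-1, 0, 1), repeat=2)
        if (dx, dy) != (0, 0)
    )
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}

def achieves(seed, prop, max_steps):
    """Bounded check: does the computation seeded at `seed` reach a state
    satisfying `prop` within `max_steps` generations?"""
    state = set(seed)
    for _ in range(max_steps + 1):
        if prop(state):
            return True
        state = step(state)
    return False

blinker = {(0, 0), (1, 0), (2, 0)}          # period-2 oscillator
glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}  # moves (+1, +1) per 4 steps
```

For example, a blinker returns to itself after two steps, and a glider is translated by one cell diagonally after four.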


Defining robustness

We have defined patterns very generally. Which patterns are optimizing systems? As Flint noted, an optimizing system has a measure of robustness to perturbations. We can characterize this formally by considering the optimization target as a set of states P (target configurations), and the set C of possible contexts in which a pattern p might be placed.

Definition (robustness):

A pattern p is robust for P within C iff for all contexts c ∈ C, the computation seeded at c(p) achieves P.

In this way, the variation within C represents perturbations the system faces, and can recover from, when optimizing for the target configuration represented by P.


  • Eater. An eater p is robust for P = {the board empty apart from the eater} within any context that contains spaceships traveling in the direction of the eater (and nothing else on the board). In these contexts, the eater eventually achieves a board empty apart from itself.

  • Periodic patterns. An oscillator or spaceship p with period k is robust for Pₙ = {q | q ~ stepⁿ(p)} (for any n) within the empty board (where ~ is equivalence up to translation). This includes still lifes (k = 1), blinkers (k = 2), gliders (k = 4), etc.
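Robustness for a property within a (finite) context set can be checked by brute force up to a step bound. As a toy instance of the definition, the sketch below verifies that a block is robust, within contexts that add distant non-viable debris (cf. the EDIT note above), for the property of the board containing exactly the block. The helper names are ours, and instantiation is approximated by set union, which agrees with c(p) when the context stays clear of the agent’s mask:

```python
from collections import Counter
from itertools import product

def step(live):
    """One generation of Life on a set of live-cell coordinates."""
    counts = Counter((x + dx, y + dy)
                     for (x, y) in live
                     for dx, dy in product((-1, 0, 1), repeat=2)
                     if (dx, dy) != (0, 0))
    return {c for c, n in counts.items() if n == 3 or (n == 2 and c in live)}

def achieves(seed, prop, max_steps):
    """Bounded check that the computation seeded at `seed` achieves `prop`."""
    state = set(seed)
    for _ in range(max_steps + 1):
        if prop(state):
            return True
        state = step(state)
    return False

def robust_for(agent, contexts, prop, max_steps):
    """Bounded robustness check over an explicit, finite set of contexts."""
    return all(achieves(set(agent) | set(c), prop, max_steps)
               for c in contexts)

block = {(0, 0), (0, 1), (1, 0), (1, 1)}
# Distant non-viable debris: isolated cells and diagonal pairs die out.
contexts = [{(10, 10)}, {(10, 10), (11, 11)}, {(-9, 4), (-9, 6)}]
target = lambda state: state == block  # P: the board holds exactly the block
```

A viable context breaks this robustness: adding a distant blinker, for instance, yields a computation that never reaches the block-only board.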

Our use of contexts to represent perturbations is a little different from the intuitive notion. In particular, we do not directly consider perturbations that happen during the computation, that is, interventions on the state of the board at some step after the initial state c(p). One could consider this kind of external perturbation in an alternative definition, which may also be illuminating. An advantage of our approach is that it recognises that many perturbations can be achieved within the Game of Life computation itself – one might call these embedded perturbations. Specifically, one can include in C a context that contains a pattern that is “going to perturb p after n timesteps” (e.g., a glider that is going to collide with p after n timesteps).

The more robust a system is, and the more restrictive its target is, the more it seems like an optimizing system. These two axes correspond to the “size” of the two components of our formal robustness definition: the context set C and the target P. If C is “larger”, the system is robust to more variation, and if P is “smaller”, the target is more restrictive. We will leave quantification of size unspecified for now, since there are various candidate definitions but we haven’t found a clearly correct one yet.

Definitions building on robustness

Definition (basin of attraction):

The basin of attraction B(p, P) for a pattern p and a property P is the largest context set C such that p is robust for P within C.


  • Eater. Let p be an eater and P = {the board empty apart from the eater}. B(p, P) is the context set containing spaceships moving in the direction of the eater and nothing else (in any other context, the contents of the board don’t get consumed by the eater).

  • Any pattern. Let p be an arbitrary pattern and P the set of all patterns. Then B(p, P) is the set of all contexts: P is achieved immediately by c(p) for any context c.

Definition (minimal property):

If we keep C fixed and vary P instead, we can define the minimal property of a pattern p within a context set C as the “smallest” property P such that p is robust for P within C.

We will discuss some options for quantifying the size of a property in the next section. For now, we consider some examples of minimal properties using set cardinality as the measure of size.


  • Still life. Let p be a still life and C the set of contexts that are still lifes not overlapping with p. Then P = {still life q | q = c(p) for some c ∈ C} (since c(p) is a still life that is different for every context).

  • Eater. Let p be an eater and C be the context set containing spaceships moving in the direction of the eater. Then P = {the board empty apart from the eater}.

The concept of a minimal property is related to the idea of a behavioral objective: a goal that a system appears to be optimizing for given its behavior in a set of situations. Given a pattern p and a context set C, the set of properties that p is robust for within C corresponds to the set of possible behavioral objectives for p within the set of situations C. We may be interested in the simplest behavioral objectives, corresponding to the set of minimal properties that p is robust for within C.

Options for robustness definitions

How might we quantify size in our definitions above? Primarily, we seek a notion of size for a property, which is a set of patterns. Set cardinality is one option, which ignores the contents of the patterns, counting each of them equally. Another option would be to combine (e.g., take an average of) the sizes of the constituent patterns. Natural possibilities in Life for the size of a pattern include the number of live cells or the size of the smallest rectangle bounding the live cells. A different option, which may better capture the sense needed, is a complexity-based definition such as Kolmogorov complexity (of the whole property) or Levin complexity. It remains to be worked out whether any of these notions of size give our definitions above a natural semantics, or whether we need a different notion of size.
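The two pattern-level size measures mentioned above are straightforward to write down for finite patterns (complexity-based measures, by contrast, are uncomputable in general). A sketch with our own helper names:

```python
def live_cell_count(p):
    """Size of a finite pattern as its number of live cells."""
    return len(p)

def bounding_box_size(p):
    """Size as the area of the smallest axis-aligned rectangle
    containing all live cells (0 for the empty pattern)."""
    if not p:
        return 0
    xs = [x for (x, _) in p]
    ys = [y for (_, y) in p]
    return (max(xs) - min(xs) + 1) * (max(ys) - min(ys) + 1)

block = {(0, 0), (0, 1), (1, 0), (1, 1)}
glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}
```

Note that the two measures can disagree about which of two patterns is larger: a glider has more live cells than a block but is also much sparser within its bounding box.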

We defined robustness in terms of achieving a property. We could have defined it instead in terms of fixing a property, which is a stronger condition (any computation that fixes a property also achieves it, but not vice versa). However, the two definitions are equivalent if we restrict attention to stable properties P that satisfy step(q) ∈ P whenever q ∈ P. We can stabilize a property by unioning it with all the states its elements produce after any number of steps.
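Stabilization is just closure under the step function. A minimal sketch on a toy finite dynamical system (a hypothetical mod-4 counter, not Life, so that the closure is computable with a small bound):

```python
def stabilize(prop, step, max_steps):
    """Close a finite property (set of states) under up to `max_steps`
    applications of `step`. For an eventually periodic system, a large
    enough bound yields a genuinely stable property."""
    out = set(prop)
    frontier = set(prop)
    for _ in range(max_steps):
        frontier = {step(s) for s in frontier} - out
        if not frontier:
            break
        out |= frontier
    return out
```

For Life itself the closure is generally not computable this way, since states are unbounded in number; the sketch only illustrates the set-theoretic operation.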


Defining retargetability

The orthogonality thesis states that “more or less any level of intelligence could in principle be combined with more or less any final goal”, suggesting that the capability to achieve goals in general (intelligence) is separate from the particular goal being pursued. Not all optimizing systems exhibit this separation, as Flint’s examples show, but those that do should score more highly on his measures of duality and retargetability. We find duality and retargetability hard to distinguish, and will focus on the latter.

To get more precise about retargetability, let’s use the definition of robustness above for the aspect of retargetability that requires a notion of goal pursuit.

Definition (retargetability):

A pattern p is retargetable for a set of properties 𝒫 (the “possible goals”) if there exists a context set C such that, for any property P in 𝒫, there is a pattern p′ that is a “small” change from p, such that p′ is robust for P within C.

The degree of retargetability depends on the size of the set of possible goals 𝒫 (more, or more interesting, possible goals are better), the size of the changes (smaller, or less complex, changes required for retargeting are better), and the size of the context set C (larger is better).

This definition is again dependent on a way to measure sizes, for example, the size of the change between p and p′. Some candidates include: Kolmogorov complexity of the change, the number of cells changed, and the size of the area in which changes are made.
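The latter two candidates are easy to make precise for finite patterns (a sketch of our own; the helper names are hypothetical):

```python
def cells_changed(p, q):
    """Number of cells changed between two finite patterns:
    the size of the symmetric difference of their live-cell sets."""
    return len(set(p) ^ set(q))

def change_area(p, q):
    """Area of the bounding box of the changed cells (0 if nothing changed)."""
    diff = set(p) ^ set(q)
    if not diff:
        return 0
    xs = [x for (x, _) in diff]
    ys = [y for (_, y) in diff]
    return (max(xs) - min(xs) + 1) * (max(ys) - min(ys) + 1)
```

The two measures capture different intuitions: flipping two far-apart cells is a small change by cell count but a large one by area.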


  • Glider gun. Let p be a glider gun positioned at the origin of the board. Let the target set of goals be 𝒫 = {P₁, P₂, P₃, P₄}, where Pᵢ is the property that there is a glider in the i-th quadrant of the board. Namely, Pᵢ = {q | q contains a glider g within Qᵢ}, where g is a glider and Qᵢ is a pattern covering the i-th quadrant of the board.
    Then for any Pᵢ ∈ 𝒫, we obtain pᵢ by rotating the glider gun to fire into the quadrant Qᵢ, which is a small change by the complexity definition. Let C be the set of still life contexts that don’t overlap with the glider gun or its firing path for any of the four rotations of the glider gun. Then pᵢ is robust for Pᵢ within C, so p is retargetable for the target set 𝒫.

  • Turing machine. Let p be a pattern implementing a Turing machine computing some function f, e.g., “given input x, compute f(x)”. For any input x, let Pₓ be the set of board states where the output tape of the Turing machine contains f(x). Let the target set of goals be 𝒫 = {Pₓ | x is a possible input}.
    Then for any Pₓ ∈ 𝒫 we can obtain pₓ by placing x on the input tape of the Turing machine, which is a small change by all the definitions of size we considered (number of cells, size of area, and complexity). Let C be the set of still life contexts that don’t overlap with the Turing machine. Then pₓ is robust for Pₓ within C, so p is retargetable for the target set 𝒫.

Our setup suggests a possible relationship between robustness and retargetability: it may be difficult for a pattern to be both robust and retargetable. A retargetable pattern needs to be “close to” robustly achieving many targets, but this may be in tension with robustly achieving a single target property. The reason is that a context may cause retargeting via an embedded perturbation, and the new target property may not overlap with the original target property. For example, since the Turing machine is retargetable by changing the input, it’s not robust to contexts that change its input.

Conclusions and open questions

We have proposed some definitions for robustness and retargetability in Conway’s Game of Life, and shown examples of how they work. Our definitions are not fully specified—they lack a good specification of how to quantify sizes of patterns and sets of patterns. We hope they nevertheless illustrate an interesting way of looking at optimizing systems in a concrete deterministic setting.

Here are some open questions that we would be excited to get your input on:

  • To what extent is there a tradeoff between robustness and retargetability?

  • Is robustness or retargetability of a system a greater concern from the alignment perspective?

  • It seems feasible to extend our definitions from the Game of Life to other environments where instantiation can be defined. We’d be interested in your suggestions of interesting environments to consider.

  • More examples of interesting robust patterns. What could they tell us about the properties that C should have in the definition of robustness?

  • Possible theorems restricting the size or content of robust patterns. E.g., for some classes of contexts, does a pattern need to be “agent-like” in some way, such as doing something like perception, in order to be robust?

  • No-free-lunch type theorems on what kinds of combinations of context set C and property P are impossible for any robust pattern.