# Introduction to Cartesian Frames

This is the first post in a sequence on **Cartesian frames**, a new way of modeling agency that has recently shaped my thinking a lot.

Traditional models of agency have some problems, like:

- They treat the “agent” and “environment” as primitives with a simple, stable input-output relation. (See “Embedded Agency.”)
- They assume a particular way of carving up the world into variables, and don’t allow for switching between different carvings or different levels of description.

Cartesian frames are a way to add a first-person perspective (with choices, uncertainty, etc.) on top of a third-person “here is the set of all possible worlds,” in such a way that many of these problems either disappear or become easier to address.

The idea of Cartesian frames is that we take as our basic building block a binary function which combines a choice from the agent with a choice from the environment to produce a world history.

We don’t think of the agent as having inputs and outputs, and we don’t assume that the agent is an object persisting over time. Instead, we only think about a set of possible choices of the agent, a set of possible environments, and a function that encodes what happens when we combine these two.

This basic object is called a Cartesian frame. As with dualistic agents, we are given a way to separate out an “agent” from an “environment.” But rather than being a basic feature of the world, this is a “frame” — a particular way of conceptually carving up the world.

We will use the combinatorial properties of a given Cartesian frame to derive versions of inputs, outputs and time. One goal here is that by making these notions derived rather than basic, we can make them more amenable to approximation and thus less dependent on exactly how one draws the Cartesian boundary. Cartesian frames also make it much more natural to think about the world at multiple levels of description, and to model agents as having subagents.

Mathematically, Cartesian frames are exactly Chu spaces. I give them a new name because of my specific interpretation about agency, which also highlights different mathematical questions.

Using Chu spaces, we can express many different relationships between Cartesian frames. For example, given two agents, we could talk about their sum ($\oplus$), which can choose from any of the choices available to either agent, or we could talk about their tensor ($\otimes$), which can accomplish anything that the two agents could accomplish together as a team.

Cartesian frames also have duals ($-^*$) which you can get by swapping the agent with the environment, and $\oplus$ and $\otimes$ have De Morgan duals ($\&$ and ⅋ respectively), which represent taking a sum or tensor of the environments. The category also has an internal hom, $\multimap$, where $C \multimap D$ can be thought of as “$D$ with a $C$-shaped hole in it.” These operations are very directly analogous to those used in linear logic.

## 1. Definition

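Let $W$ be a set of possible worlds. A Cartesian frame $C$ over $W$ is a triple $C = (A, E, \cdot)$, where $A$ represents a set of possible ways the agent can be, $E$ represents a set of possible ways the environment can be, and $\cdot : A \times E \to W$ is an evaluation function that returns a possible world given an element of $A$ and an element of $E$.

We will refer to $A$ as the agent, the elements of $A$ as possible agents, $E$ as the environment, the elements of $E$ as possible environments, $W$ as the world, and elements of $W$ as possible worlds.

**Definition:** A Cartesian frame $C$ over a set $W$ is a triple $(A, E, \cdot)$, where $A$ and $E$ are sets and $\cdot : A \times E \to W$. If $C = (A, E, \cdot)$ is a Cartesian frame over $W$, we say $\mathrm{Agent}(C) = A$, $\mathrm{Env}(C) = E$, $\mathrm{World}(C) = W$, and $\mathrm{Eval}(C) = \cdot$.

A finite Cartesian frame is easily visualized as a matrix, where the rows of the matrix represent possible agents, the columns of the matrix represent possible environments, and the entries of the matrix are possible worlds:

$$
\begin{matrix} & \begin{matrix} e_1 & e_2 & e_3 \end{matrix} \\ \begin{matrix} a_1 \\ a_2 \\ a_3 \end{matrix} & \begin{pmatrix} w_1 & w_2 & w_3 \\ w_4 & w_5 & w_6 \\ w_7 & w_8 & w_9 \end{pmatrix} \end{matrix}.
$$

E.g., this matrix tells us that if the agent selects $a_3$ and the environment selects $e_1$, then we will end up in the possible world $w_7$.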
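As a rough illustration, a finite Cartesian frame can be stored as a lookup table. The sketch below assumes a 3×3 frame whose agents `a1` to `a3`, environments `e1` to `e3`, and worlds `w1` to `w9` are laid out row by row; the names `agents`, `envs`, `matrix`, and `evaluate` are hypothetical choices for this example, not notation from the sequence.

```python
# Hypothetical encoding of a finite Cartesian frame (illustrative names only).
agents = ["a1", "a2", "a3"]   # rows: possible agents
envs = ["e1", "e2", "e3"]     # columns: possible environments
matrix = [
    ["w1", "w2", "w3"],
    ["w4", "w5", "w6"],
    ["w7", "w8", "w9"],
]

# The evaluation function as a dict: evaluate[(a, e)] stands in for "a . e".
evaluate = {
    (a, e): matrix[i][j]
    for i, a in enumerate(agents)
    for j, e in enumerate(envs)
}

# Agent choice a3 combined with environment choice e1.
print(evaluate[("a3", "e1")])  # w7
```

Nothing here treats the row labels as “actions”; they are just names for the possible ways the agent can be.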

Because we’re discussing an agent that has the freedom to choose between multiple possibilities, the language in the definition above is a bit overloaded. You can think of $A$ as representing the agent before it chooses, while a particular $a \in A$ represents the agent’s state after making a choice.

Note that I’m specifically *not* referring to the elements of $A$ as “actions” or “outputs”; rather, the elements of $A$ are possible ways the agent can choose to be.

Since we’re interpreting Cartesian frames as first-person perspectives tacked onto sets of possible worlds, we’ll also often phrase things in ways that identify a Cartesian frame with its agent. E.g., we will say “$C$ is a subagent of $D$” as a shorthand for “$C$’s agent is a subagent of $D$’s agent.”

We can think of the environment as representing the agent’s uncertainty about the set of counterfactuals, or about the game that it’s playing, or about “what the world is as a function of my behavior.”

A Cartesian frame is effectively a way of factoring the space of possible world histories into an agent and an environment. Many different Cartesian frames can be put on the same set of possible worlds, representing different ways of doing this factoring. Sometimes, a Cartesian frame will look like a subagent of another Cartesian frame. Other times, the Cartesian frames may look more like independent agents playing a game with each other, or like agents in more complicated relationships.

## 2. Normal-Form Games

When viewed as a matrix, a Cartesian frame looks much like the normal form of a game, but with possible worlds rather than pairs of utilities as entries.

In fact, given a Cartesian frame over $W$, and a function from $W$ to a set $V$, we can construct a Cartesian frame over $V$ by composing them in the obvious way. Thus, if we had a Cartesian frame $(A, E, \cdot)$ and a pair of utility functions $u_A : W \to \mathbb{R}$ and $u_E : W \to \mathbb{R}$, we could construct a Cartesian frame over $\mathbb{R}^2$, given by $(A, E, \star)$, where $a \star e = (u_A(a \cdot e), u_E(a \cdot e))$. This Cartesian frame will look exactly like the normal form of a game. (Although it is a bit weird to think of the environment set as having a utility function.)

We can use this connection with normal-form games to illustrate three features of the ways in which we will use Cartesian frames.

**2.1. Coarse World Models**

First, note that we can talk about a Cartesian frame over $\mathbb{R}^2$, even though one would not normally think of $\mathbb{R}^2$ as a set of possible worlds.

In general, we will often want to talk about Cartesian frames over “coarse” models of the world, models that leave out some details. We might have a world model $W$ that fully specifies the universe at the subatomic level, while also wanting to talk about Cartesian frames over a set $V$ of high-level descriptions of the world.

We will construct Cartesian frames over $V$ by composing Cartesian frames over $W$ with the function from $W$ to $V$ that sends more refined, detailed descriptions of the universe to coarser descriptions of the same universe.

In this way, we can think of an element $(r_1, r_2)$ of $\mathbb{R}^2$ as the coarse, high-level possible world given by “Those possible worlds for which $u_A = r_1$ and $u_E = r_2$.”

**Definition:** Given a Cartesian frame $C = (A, E, \cdot)$ over $W$, and a function $f : W \to V$, let $f^\circ(C)$ denote the Cartesian frame over $V$, $f^\circ(C) = (A, E, \star)$, where $a \star e = f(a \cdot e)$.
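To make the composition concrete, here is a minimal sketch in which a frame is a triple `(agents, envs, ev)` and coarsening is just mapping every matrix entry through a function. The frame `fine`, the function `coarse`, and the labels `"wet"` and `"dry"` are all hypothetical, chosen only to illustrate moving from a detailed world model to a coarser one.

```python
def compose(f, frame):
    """Return the frame with the same agents and environments, worlds mapped through f."""
    agents, envs, ev = frame
    return agents, envs, {pair: f(world) for pair, world in ev.items()}

# Hypothetical fine-grained frame: umbrella (u) or not (n) against rain (r) or sun (s).
fine = (
    ["u", "n"],
    ["r", "s"],
    {("u", "r"): "ur", ("u", "s"): "us", ("n", "r"): "nr", ("n", "s"): "ns"},
)

# Hypothetical coarse description: only record whether the agent gets wet.
def coarse(world):
    return "wet" if world == "nr" else "dry"

agents, envs, coarse_ev = compose(coarse, fine)
print(coarse_ev[("n", "r")])  # wet
```

The agent and environment sets are untouched; only the set of worlds the matrix takes values in changes.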

**2.2. Symmetry**

Second, normal-form games highlight the symmetry between the players.

We do not normally think about this symmetry in agent-environment interactions, but this symmetry will be a key aspect of Cartesian frames. Every Cartesian frame $C = (A, E, \cdot)$ has a dual which swaps $A$ and $E$ and transposes the matrix.
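A small sketch of the dual, assuming a frame is represented as a triple `(agents, envs, ev)` with `ev` a dict keyed by `(agent, environment)` pairs. This representation and the example frame are assumptions made for illustration.

```python
def dual(frame):
    """Swap the agent and the environment, transposing the matrix."""
    agents, envs, ev = frame
    return envs, agents, {(e, a): w for (a, e), w in ev.items()}

# Hypothetical 2x3 frame.
frame = (
    ["a1", "a2"],
    ["e1", "e2", "e3"],
    {
        ("a1", "e1"): "w1", ("a1", "e2"): "w2", ("a1", "e3"): "w3",
        ("a2", "e1"): "w4", ("a2", "e2"): "w5", ("a2", "e3"): "w6",
    },
)

flipped = dual(frame)
print(flipped[2][("e3", "a1")])  # w3
```

Taking the dual twice returns the original frame, which is one way to see that the agent and environment play formally symmetric roles.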

**2.3. Relation to Extensive-Form Games**

Third, much of what we’ll be doing with Cartesian frames in this sequence can be summarized as “trying to infer extensive-form games from normal-form games” (ignoring the “games” interpretation and just looking at what this would entail formally).

Consider the ultimatum game. We can represent this game in extensive form:

Given any game in extensive form, we can then convert it to a game in normal form. In this case:

The strategies in the normal-form game are the policies in the extensive-form game.

If we then delete the labels, so now we just have a bunch of combinatorial structure about which things send you to the same place, I want to know when we can infer properties of the original extensive-form game, like time and information states.

Although we’ve used games to note some features of Cartesian frames, we should be clear that Cartesian frames aren’t about utilities or game-theoretic rationality. We are not trying to talk about what the agent does, or what the agent should do. In fact, we are effectively taking as our most fundamental building block that an agent can freely choose from a set of available actions.

The theory of Cartesian frames is trying to understand what agents’ options are. Utility functions and facts about what the agent actually does can possibly later be placed on top of the Cartesian frame framework, but for now we will be focusing on building up a calculus of what the agent *could* do.

## 3. Controllables

We would like to use Cartesian frames to reconstruct ideas like “an agent persisting over time,” inputs (or “what the agent can learn”), and outputs (or “what the agent can do”), by taking as basic:

- an agent’s ability to “freely choose” between options;
- a collection of possible ways those options can correspond to world histories; and
- a notion of when world histories are considered the same in some coarse world model.

In this way, we hope to find new ways of thinking about partial and approximate versions of these concepts.

Instead of thinking of the agent as an object with outputs, I expect a more embedded view to think of all the facts about the world that the agent can force to be true or false.

This includes facts of the form “I output foo,” but it also includes facts that are downstream from immediate outputs. Since we’re working with “what can I make happen?” rather than “what is my output?”, the theory becomes less dependent on precisely answering questions like “Is my output the way I move my mouth, or is it the words that I say?”

We will call the analogue of outputs in Cartesian frames **controllables**. The types of our versions of “outputs” and “inputs” are going to be subsets of $W$, which we can think of as properties of the world. E.g., $S$ might be the set of worlds in which woolly mammoths exist; we could then think of “controlling $S$” as “controlling whether or not mammoths exist.”

We’ll define what an agent can control as follows. First, given a Cartesian frame $C = (A, E, \cdot)$ over $W$, and a subset $S$ of $W$, we say that $S$ is *ensurable* in $C$ if there exists an $a \in A$ such that for all $e \in E$, we have $a \cdot e \in S$. Equivalently, we say that $S$ is ensurable in $C$ if at least one of the rows in the matrix only contains elements of $S$.

**Definition:** $\mathrm{Ensure}(C) = \{S \subseteq W \mid \exists a \in A, \forall e \in E, a \cdot e \in S\}$.

If an agent can ensure $S$, then regardless of what the environment does (and even if the agent doesn’t know what the environment does, or its behavior isn’t a function of what the environment does), the agent has some strategy which makes sure that the world ends up in $S$. (In the degenerate case where the agent is empty, the set of ensurables is empty.)

Similarly, we say that $S$ is *preventable* in $C$ if at least one of the rows in the matrix contains *no* elements of $S$.

**Definition:** $\mathrm{Prevent}(C) = \{S \subseteq W \mid \exists a \in A, \forall e \in E, a \cdot e \notin S\}$.

If $S$ is both ensurable and preventable in $C$, we say that $S$ is controllable in $C$.

**Definition:** $\mathrm{Ctrl}(C) = \mathrm{Ensure}(C) \cap \mathrm{Prevent}(C)$.
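The three definitions translate almost word for word into row checks. This is a minimal sketch assuming a frame is a triple `(agents, envs, ev)`; the function names and the example frame are hypothetical.

```python
def is_ensurable(frame, S):
    """Some row of the matrix contains only elements of S."""
    agents, envs, ev = frame
    return any(all(ev[(a, e)] in S for e in envs) for a in agents)

def is_preventable(frame, S):
    """Some row of the matrix contains no elements of S."""
    agents, envs, ev = frame
    return any(all(ev[(a, e)] not in S for e in envs) for a in agents)

def is_controllable(frame, S):
    return is_ensurable(frame, S) and is_preventable(frame, S)

# Hypothetical 2x2 frame whose two rows share no worlds.
frame = (
    ["a1", "a2"],
    ["e1", "e2"],
    {("a1", "e1"): "w1", ("a1", "e2"): "w2",
     ("a2", "e1"): "w3", ("a2", "e2"): "w4"},
)

print(is_controllable(frame, {"w1", "w2"}))        # True
print(is_controllable(frame, {"w1", "w2", "w3"}))  # False: ensurable, not preventable

# Degenerate case: with no possible agents, nothing is ensurable.
empty_agent = ([], ["e1"], {})
print(is_ensurable(empty_agent, {"w1"}))  # False
```

Because `any` over an empty collection is false, the empty-agent case falls out of the quantifier structure without special handling.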

**3.1. Closure Properties**

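Ensurability and preventability, and therefore also controllability, are closed under adding possible agents to $A$ and removing possible environments from $E$.

**Claim:** If $A' \supseteq A$ and $E' \subseteq E$, and if for all $a \in A$ and $e \in E'$ we have $a \star e = a \cdot e$, then $\mathrm{Ensure}(A', E', \star) \supseteq \mathrm{Ensure}(A, E, \cdot)$.

**Proof:** Trivial.

Ensurables are also trivially closed under supersets. If I can ensure some set of worlds, then I can ensure some larger set of worlds representing a weaker property (like “mammoths exist *or* cave bears exist”).

**Claim:** If $S_1 \subseteq S_2 \subseteq W$, and $S_1 \in \mathrm{Ensure}(C)$, then $S_2 \in \mathrm{Ensure}(C)$.

**Proof:** Trivial.

$\mathrm{Prevent}(C)$ is similarly closed under subsets. $\mathrm{Ctrl}(C)$ need not be closed under subsets or supersets.

Since $\mathrm{Ensure}(C)$ and $\mathrm{Prevent}(C)$ will often be large, we will sometimes write them using a minimal set of generators.

**Definition:** Let $\langle S \rangle_{\supset}$ denote the closure of $S$ under supersets. Let $\langle S \rangle_{\subset}$ denote the closure of $S$ under subsets.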
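The generator notation can be sketched by brute force over all subsets of a small world set. The helper names below and the four-element `W` are assumptions for illustration; the closures are taken within subsets of `W`.

```python
from itertools import combinations

def powerset(W):
    """All subsets of W, as frozensets."""
    items = sorted(W)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def close_supersets(generators, W):
    """Every subset of W that contains at least one generator."""
    return {S for S in powerset(W) if any(g <= S for g in generators)}

def close_subsets(generators, W):
    """Every subset of W that is contained in at least one generator."""
    return {S for S in powerset(W) if any(S <= g for g in generators)}

# Hypothetical world set with two disjoint generators.
W = {"w1", "w2", "w3", "w4"}
gens = [frozenset({"w1", "w2"}), frozenset({"w3", "w4"})]

up = close_supersets(gens, W)
down = close_subsets(gens, W)
print(len(up), len(down))       # 7 7
print(up & down == set(gens))   # True
```

Two generators stand in for seven sets in each direction here, which is why the compact notation is convenient.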

**3.2. Examples of Controllables**

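Let us look at some simple examples. Consider the case where there are two possible environments, $r$ for rain, and $s$ for sun. The agent independently chooses between two options, $u$ for umbrella, and $n$ for no umbrella. So $A = \{u, n\}$ and $E = \{r, s\}$. There are four possible worlds, $W = \{ur, us, nr, ns\}$. We interpret $ur$ as the world where the agent has an umbrella and it is raining, and similarly for the other worlds. The Cartesian frame, $C_1$, looks like this:

$$
C_1 = \begin{matrix} & \begin{matrix} r & s \end{matrix} \\ \begin{matrix} u \\ n \end{matrix} & \begin{pmatrix} ur & us \\ nr & ns \end{pmatrix} \end{matrix}.
$$

$\mathrm{Ensure}(C_1) = \langle \{\{ur, us\}, \{nr, ns\}\} \rangle_{\supset}$, or

$$
\{\{ur, us\}, \{nr, ns\}, \{ur, us, nr\}, \{ur, us, ns\}, \{nr, ns, ur\}, \{nr, ns, us\}, W\},
$$

and $\mathrm{Prevent}(C_1) = \langle \{\{ur, us\}, \{nr, ns\}\} \rangle_{\subset}$, or

$$
\{\{\}, \{ur\}, \{us\}, \{nr\}, \{ns\}, \{ur, us\}, \{nr, ns\}\}.
$$

Therefore $\mathrm{Ctrl}(C_1) = \{\{ur, us\}, \{nr, ns\}\}$.

The elements of $\mathrm{Ctrl}(C_1)$ are not actions, but subsets of $W$: rather than assuming a distinction between “actions” and other events, we just say that the agent can guarantee that the actual world is drawn from the set of possible worlds in which it has an umbrella ($\{ur, us\}$), and it can guarantee that the actual world is drawn from the set of possible worlds in which it doesn’t have an umbrella ($\{nr, ns\}$).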

Next, let’s modify the example to let the agent see whether or not it is raining before choosing whether or not to carry an umbrella. The Cartesian frame will now look like this:

$$
C_2 = \begin{matrix} & \begin{matrix} r & s \end{matrix} \\ \begin{matrix} u \\ n \\ u \leftrightarrow r \\ u \leftrightarrow s \end{matrix} & \begin{pmatrix} ur & us \\ nr & ns \\ ur & ns \\ nr & us \end{pmatrix} \end{matrix}.
$$

The agent is now larger, as there are two new possibilities: it can carry the umbrella if and only if it rains, or it can carry the umbrella if and only if it is sunny. $\mathrm{Ctrl}(C_2)$ will also be larger than $\mathrm{Ctrl}(C_1)$: $\mathrm{Ctrl}(C_2) = \{\{ur, us\}, \{nr, ns\}, \{ur, ns\}, \{nr, us\}\}$.

Under one interpretation, the new options $u \leftrightarrow r$ and $u \leftrightarrow s$ feel different from the old ones $u$ and $n$. It feels like the agent’s basic options are to either carry an umbrella or not, and the new options are just incorporating $u$ and $n$ into more complicated policies.

However, we could instead view the agent’s “basic options” as a choice between “I want my umbrella-carrying to match when it rains” and “I want my umbrella-carrying to match when it’s sunny.” This makes $u$ and $n$ feel like the conditional policies, while $u \leftrightarrow r$ and $u \leftrightarrow s$ feel like the more basic outputs. Part of the point of the Cartesian frame framework is that we are not privileging either interpretation.

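Consider now a third example, where there is a third possible environment, $m$, for meteor. In this case, a meteor hits the earth before the agent is even born, and there isn’t a question about whether the agent has an umbrella. There is a new possible world, which we will also call $m$, in which the meteor strikes. The Cartesian frame will look like this:

$$
C_3 = \begin{matrix} & \begin{matrix} r & s & m \end{matrix} \\ \begin{matrix} u \\ n \\ u \leftrightarrow r \\ u \leftrightarrow s \end{matrix} & \begin{pmatrix} ur & us & m \\ nr & ns & m \\ ur & ns & m \\ nr & us & m \end{pmatrix} \end{matrix}.
$$

$\mathrm{Ensure}(C_3) = \langle \{\{ur, us, m\}, \{nr, ns, m\}, \{ur, ns, m\}, \{nr, us, m\}\} \rangle_{\supset}$, and $\mathrm{Prevent}(C_3) = \langle \{\{ur, us\}, \{nr, ns\}, \{ur, ns\}, \{nr, us\}\} \rangle_{\subset}$. As a consequence, $\mathrm{Ctrl}(C_3) = \{\}$.

This example illustrates that nontrivial agents may be unable to control the world’s state. Because the agent can’t prevent the meteor, the agent in this case has no controllables.

This example also illustrates that agents may be able to ensure or prevent some things, even if there are possible worlds in which the agent was never born. While the agent of $C_3$ cannot ensure that it exists, the agent can ensure that *if* there is no meteor, then it carries an umbrella ($\{ur, us, m\}$).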
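The umbrella examples are small enough to check by enumerating every subset of worlds. In this sketch each frame is a dict from a possible agent to its row of worlds (one entry per environment, in the order rain, sun, and then meteor for the third frame); the labels `"u<->r"` and `"u<->s"` and all function names are hypothetical stand-ins for the conditional policies.

```python
from itertools import combinations

def powerset(W):
    items = sorted(W)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def ensure(rows, W):
    """Subsets S of W such that some row lies entirely inside S."""
    return {S for S in powerset(W) if any(all(w in S for w in row) for row in rows.values())}

def prevent(rows, W):
    """Subsets S of W such that some row lies entirely outside S."""
    return {S for S in powerset(W) if any(all(w not in S for w in row) for row in rows.values())}

def ctrl(rows, W):
    return ensure(rows, W) & prevent(rows, W)

W12 = {"ur", "us", "nr", "ns"}
W3 = W12 | {"m"}

C1 = {"u": ("ur", "us"), "n": ("nr", "ns")}
C2 = {**C1, "u<->r": ("ur", "ns"), "u<->s": ("nr", "us")}
C3 = {a: row + ("m",) for a, row in C2.items()}  # add the meteor column

print(sorted(sorted(S) for S in ctrl(C1, W12)))  # [['nr', 'ns'], ['ur', 'us']]
print(len(ctrl(C2, W12)))                        # 4
print(len(ctrl(C3, W3)))                         # 0
print(frozenset({"ur", "us", "m"}) in ensure(C3, W3))  # True
```

Brute force over the power set is exponential in the number of worlds, so this is only meant for toy frames like these.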

If we wanted to, we could instead consider the agent’s ensurables (or its ensurables and preventables) its “outputs.” This lets us avoid the counter-intuitive result that agents have no outputs in worlds where their existence is contingent.

I put the emphasis on controllables because they have other nice features; and as we’ll see later, there is an operation called “assume” which we can use to say: “The agent, *under the assumption that there’s no meteor*, has controllables.”

## 4. Observables

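The analogue of inputs in the Cartesian frame model is **observables**. Observables can be thought of as a closure property on the agent. If an agent is able to observe $S$, then the agent can take policies that have different effects depending on $S$.

Formally, let $S$ be a subset of $W$. We say that the agent of a Cartesian frame $C = (A, E, \cdot)$ is able to observe whether $S$ if for every pair $a_0, a_1 \in A$, there exists a single element $a \in A$ which implements the conditional policy that copies $a_0$ in possible worlds in $S$ (i.e., for every $e \in E$, if $a \cdot e \in S$, then $a \cdot e = a_0 \cdot e$) and copies $a_1$ in possible worlds outside of $S$.

When $a$ implements the conditional policy “if $S$ then do $a_0$, and if not $S$ then do $a_1$” in this way, we will say that $a$ is in the set $\mathrm{if}(S, a_0, a_1)$.

**Definition:** Given $C = (A, E, \cdot)$, a Cartesian frame over $W$, $S$ a subset of $W$, and $a_0, a_1 \in A$, let $\mathrm{if}(S, a_0, a_1)$ denote the set of all $a \in A$ such that for all $e \in E$, $(a \cdot e \in S) \rightarrow (a \cdot e = a_0 \cdot e)$ and $(a \cdot e \notin S) \rightarrow (a \cdot e = a_1 \cdot e)$.

Agents in this setting observe events, which are true or false, not variables in full generality. We will say that $C$’s observables, $\mathrm{Obs}(C)$, are the set of all $S$ such that $C$’s agent can observe whether $S$.

**Definition:** $\mathrm{Obs}(C) = \{S \subseteq W \mid \forall a_0, a_1 \in A, \exists a \in A, a \in \mathrm{if}(S, a_0, a_1)\}$.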
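The conditional-policy condition is easy to get backwards, so here is a minimal sketch of it, assuming a frame is a triple `(agents, envs, ev)`. The two example frames and all names are hypothetical; they differ only in whether the fourth “mixed” row is available.

```python
def in_if(frame, S, a0, a1, a):
    """True when a copies a0 wherever a lands in S and copies a1 wherever it does not."""
    agents, envs, ev = frame
    return all(
        ev[(a, e)] == (ev[(a0, e)] if ev[(a, e)] in S else ev[(a1, e)])
        for e in envs
    )

def can_observe(frame, S):
    """Every ordered pair (a0, a1) has some a implementing 'if S then a0 else a1'."""
    agents, envs, ev = frame
    return all(
        any(in_if(frame, S, a0, a1, a) for a in agents)
        for a0 in agents
        for a1 in agents
    )

envs = ["e1", "e2"]
rows = {"a0": ("x1", "y1"), "a1": ("x2", "y2"), "a01": ("x1", "y2")}

def make_frame(rows):
    return list(rows), envs, {(a, e): row[j] for a, row in rows.items() for j, e in enumerate(envs)}

three_rows = make_frame(rows)
four_rows = make_frame({**rows, "a10": ("x2", "y1")})

S = {"x1", "x2"}  # the event "the environment is e1"
print(in_if(three_rows, S, "a0", "a1", "a01"))  # True
print(can_observe(three_rows, S))               # False: no row for the pair (a1, a0)
print(can_observe(four_rows, S))                # True
```

This illustrates the sense in which observability is a closure property on the agent: it asks whether the set of possible agents already contains every mixture of its rows along `S`.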

Another option for talking about what the agent can observe would be to talk about when $C$’s agent can distinguish between two disjoint subsets $S$ and $T$. Here, we would say that the agent of $C = (A, E, \cdot)$ can distinguish between $S$ and $T$ if for all $a_0, a_1 \in A$, there exists an $a \in A$ such that for all $e \in E$, either $a \cdot e = a_0 \cdot e$ or $a \cdot e = a_1 \cdot e$, and whenever $a \cdot e \in S$, $a \cdot e = a_0 \cdot e$, and whenever $a \cdot e \in T$, $a \cdot e = a_1 \cdot e$. This more general definition would treat our observables as the special case $T = W \setminus S$. Perhaps at some point we will want to use this more general notion, but in this sequence, we will stick with the simpler version.

**4.1. Closure Properties**

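**Claim:** Observability is closed under Boolean combinations, so if $S, T \in \mathrm{Obs}(C)$ then $W \setminus S$, $S \cup T$, and $S \cap T$ are also in $\mathrm{Obs}(C)$.

**Proof:** Assume $S, T \in \mathrm{Obs}(C)$. We can see easily that $W \setminus S \in \mathrm{Obs}(C)$ by swapping $a_0$ and $a_1$. It suffices to show that $S \cup T \in \mathrm{Obs}(C)$, since an intersection can be constructed with complements and union.

Given $a_0$ and $a_1$, since $S \in \mathrm{Obs}(C)$, there exists an $a_2 \in A$ such that $a_2 \in \mathrm{if}(S, a_0, a_1)$. Then, since $T \in \mathrm{Obs}(C)$, there exists an $a_3 \in A$ such that $a_3 \in \mathrm{if}(T, a_0, a_2)$. Unpacking and combining these, we get $a_3 \in \mathrm{if}(S \cup T, a_0, a_1)$: wherever $a_3$ lands in $T$ it copies $a_0$, and wherever it lands outside $T$ it copies $a_2$, which in turn copies $a_0$ inside $S$ and $a_1$ outside $S$. Since we could construct such an $a_3$ from an arbitrary $a_0, a_1 \in A$, we know that $S \cup T \in \mathrm{Obs}(C)$.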

This highlights a key difference between our version of “inputs” and the standard version. Agent models typically draw a strong distinction between the agent’s immediate sensory data, and other things the agent might know. Observables, on the other hand, include all of the information that *logically follows* from the agent’s observations.

Similarly, agent models typically draw a strong distinction between the agent’s immediate motor outputs, and everything else the agent can control. In contrast, if an agent can ensure an event $S$, it can also ensure everything that logically follows from $S$.

Since $\mathrm{Obs}(C)$ will often be large, we will sometimes write it using a minimal set of generators under union. Since $\mathrm{Obs}(C)$ is closed under Boolean combinations, such a minimal set of generators will be a partition of $W$ (assuming $W$ is finite).

**Definition:** Let $\langle S \rangle_{\vee}$ denote the closure of $S$ under union (including $\{\}$, the empty union).

Just like what’s controllable, what’s observable is closed under removing possible environments.

**Claim:** If $E' \subseteq E$, and if for all $a \in A$ and $e \in E'$ we have $a \star e = a \cdot e$, then $\mathrm{Obs}(A, E', \star) \supseteq \mathrm{Obs}(A, E, \cdot)$.

**Proof:** Trivial.

It is interesting to note, however, that what’s observable is not closed under adding possible agents to $A$.

**4.2. Examples of Observables**

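Let’s look back at our three examples from earlier. The first example, $C_1$, looked like this:

$$
C_1 = \begin{matrix} & \begin{matrix} r & s \end{matrix} \\ \begin{matrix} u \\ n \end{matrix} & \begin{pmatrix} ur & us \\ nr & ns \end{pmatrix} \end{matrix}.
$$

$\mathrm{Obs}(C_1) = \{\{\}, W\}$. This is the smallest set of observables possible. The agent can act, but it can’t change its behavior based on knowledge about the world.

The second example looked like:

$$
C_2 = \begin{matrix} & \begin{matrix} r & s \end{matrix} \\ \begin{matrix} u \\ n \\ u \leftrightarrow r \\ u \leftrightarrow s \end{matrix} & \begin{pmatrix} ur & us \\ nr & ns \\ ur & ns \\ nr & us \end{pmatrix} \end{matrix}.
$$

Here, $\mathrm{Obs}(C_2) = \{\{\}, \{ur, nr\}, \{us, ns\}, W\}$. The agent can observe whether or not it’s raining. One can verify that for any pair of rows, there is a third row (possibly equal to one or both of the first two) that equals the first if it is $ur$ or $nr$, and equals the second otherwise.

The third example looked like:

$$
C_3 = \begin{matrix} & \begin{matrix} r & s & m \end{matrix} \\ \begin{matrix} u \\ n \\ u \leftrightarrow r \\ u \leftrightarrow s \end{matrix} & \begin{pmatrix} ur & us & m \\ nr & ns & m \\ ur & ns & m \\ nr & us & m \end{pmatrix} \end{matrix}.
$$

Here, $\mathrm{Obs}(C_3) = \langle \{\{ur, nr\}, \{us, ns\}, \{m\}\} \rangle_{\vee}$, which is

$$
\{\{\}, \{ur, nr\}, \{us, ns\}, \{m\}, \{ur, nr, us, ns\}, \{ur, nr, m\}, \{us, ns, m\}, W\}.
$$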

This example has an odd feature: the agent is said to be able to “observe” whether the meteor strikes, even though the agent is never instantiated in worlds in which it strikes. Since the agent has no control when the meteor strikes, the agent can vacuously implement conditional policies.
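The observables of the three umbrella frames can be recomputed by brute force. As before, this sketch encodes each frame as a dict from a possible agent to its row of worlds (columns in the order rain, sun, meteor); the encoding, the labels `"u<->r"` and `"u<->s"`, and the helper names are hypothetical.

```python
from itertools import combinations

def powerset(W):
    items = sorted(W)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def obs(rows, W):
    """Subsets S of W such that every pair of rows has a row mixing them along S."""
    table = list(rows.values())

    def in_if(S, r0, r1, r):
        # r copies r0 in columns where r lands in S, and r1 elsewhere.
        return all(w == (w0 if w in S else w1) for w, w0, w1 in zip(r, r0, r1))

    return {
        S for S in powerset(W)
        if all(any(in_if(S, r0, r1, r) for r in table) for r0 in table for r1 in table)
    }

W12 = {"ur", "us", "nr", "ns"}
W3 = W12 | {"m"}

C1 = {"u": ("ur", "us"), "n": ("nr", "ns")}
C2 = {**C1, "u<->r": ("ur", "ns"), "u<->s": ("nr", "us")}
C3 = {a: row + ("m",) for a, row in C2.items()}

print(len(obs(C1, W12)), len(obs(C2, W12)), len(obs(C3, W3)))  # 2 4 8
print(frozenset({"m"}) in obs(C3, W3))                          # True
```

The meteor event shows up as observable because every row agrees in the meteor column, so any row already implements the conditional policy there.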

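Let’s look at two more examples. First, let’s modify $C_1$ to represent the point of view of a powerless bystander:

$$
C_4 = \begin{matrix} & \begin{matrix} ur & us & nr & ns \end{matrix} \\ \begin{matrix} 1 \end{matrix} & \begin{pmatrix} ur & us & nr & ns \end{pmatrix} \end{matrix}.
$$

Here, the agent has no decisions, and everything is in the hands of the environment.

Alternatively, we can modify $C_1$ to represent the point of view of the agent from $C_1$ and environment from $C_1$ together. The resulting frame looks like this:

$$
C_5 = \begin{matrix} & \begin{matrix} 1 \end{matrix} \\ \begin{matrix} ur \\ us \\ nr \\ ns \end{matrix} & \begin{pmatrix} ur \\ us \\ nr \\ ns \end{pmatrix} \end{matrix}.
$$

$\mathrm{Ensure}(C_4) = \{W\}$ and $\mathrm{Prevent}(C_4) = \{\{\}\}$, so $\mathrm{Ctrl}(C_4) = \{\}$. Meanwhile, $\mathrm{Obs}(C_4) = \mathcal{P}(W)$, the set of all subsets of $W$.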

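On the other hand, $\mathrm{Obs}(C_5) = \{\{\}, W\}$, while $\mathrm{Ensure}(C_5)$ is the set of all nonempty subsets of $W$ and $\mathrm{Prevent}(C_5)$ is the set of all proper subsets of $W$, and $\mathrm{Ctrl}(C_5) = \mathcal{P}(W) \setminus \{\{\}, W\}$.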

In the first case, the agent’s ability to observe the world is maximal and its ability to control the world is minimal; while in the second case, observability is minimal and controllability is maximal. An agent with full control over what happens will not be able to observe anything, while an agent that can observe everything can change nothing.
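The trade-off between the bystander frame and the all-powerful frame can be checked directly. This sketch assumes each frame is a list of rows (one tuple of worlds per possible agent); the variable names `C4` and `C5` and the helpers are illustrative.

```python
from itertools import combinations

def powerset(W):
    items = sorted(W)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def ctrl(rows, W):
    """Sets that some row lies entirely inside and some row lies entirely outside."""
    ens = {S for S in powerset(W) if any(all(w in S for w in row) for row in rows)}
    prev = {S for S in powerset(W) if any(all(w not in S for w in row) for row in rows)}
    return ens & prev

def obs(rows, W):
    def in_if(S, r0, r1, r):
        return all(w == (w0 if w in S else w1) for w, w0, w1 in zip(r, r0, r1))
    return {
        S for S in powerset(W)
        if all(any(in_if(S, r0, r1, r) for r in rows) for r0 in rows for r1 in rows)
    }

W = {"ur", "us", "nr", "ns"}
C4 = [("ur", "us", "nr", "ns")]               # one possible agent, four environments
C5 = [("ur",), ("us",), ("nr",), ("ns",)]     # four possible agents, one environment

print(len(ctrl(C4, W)), len(obs(C4, W)))  # 0 16
print(len(ctrl(C5, W)), len(obs(C5, W)))  # 14 2
```

With four worlds there are 16 subsets in total, so the bystander observes everything and controls nothing, while the all-powerful agent controls every subset except the empty set and `W` and observes only those two.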

This is perhaps counter-intuitive. If $S \in \mathrm{Obs}(C)$ meant “I can go look at something to check whether we’re in an $S$ world,” then one might look at $C_5$ and say: “This agent is all-powerful. It can do *anything*. Shouldn’t we then think of it as all-seeing and all-knowing, rather than saying it ‘can’t observe anything’?” Similarly, one might look at $C_4$ and say: “This agent’s choices can’t change the world at all. But then it seems bizarre to say that everything is ‘observable’ to the agent. Shouldn’t we rather say that this agent is powerless *and* blind?”

The short answer is that, when working with Cartesian frames, we are in a very “What choices can you make?” paradigm, and in that kind of paradigm, the thing closest to an “input” is “What can I condition my choices on?” (Which is a closure property on the agent, rather than a discrete action like “turning on the Weather Channel.”)

In that context, an agent with only one option automatically has maximal “inputs” or “knowledge,” since it can vacuously implement every conditional policy. At the same time, an agent with too many options can’t have any “inputs,” since it could then use its high level of control to diagonalize against the observables it is conditioning on and make them false.

## 5. Controllables and Observables Are Disjoint

A maximally observable frame has minimal controllables, and vice versa. This turns out to be a special case of our first interesting result about Cartesian frames: an agent can’t observe what it controls, and can’t control what it observes.

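To see this, first consider the following frame:

$$
C = \begin{matrix} & \begin{matrix} e \end{matrix} \\ \begin{matrix} a_0 \\ a_1 \end{matrix} & \begin{pmatrix} w_0 \\ w_1 \end{pmatrix} \end{matrix}.
$$

Here, if $S = \{w_1\}$, then $a \in \mathrm{if}(S, a_0, a_1)$ would not be able to be either $a_0$ or $a_1$. If it were $a_0$, then it would have to copy $a_1$, and $a_0 \cdot e \neq a_1 \cdot e$. But if it were $a_1$, then it would have to copy $a_0$, and $a_1 \cdot e \neq a_0 \cdot e$. So $\mathrm{if}(S, a_0, a_1)$ is empty in this case.

Notice that in this example, $\mathrm{if}(S, a_0, a_1)$ isn’t empty merely because our $A$ lacks the right $a$ to implement the conditional policy. Rather, the conditional policy is impossible to implement even in principle.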

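Fortunately, before checking whether $C$’s agent can observe $S$, we can perform a simpler check to rule out these problematic cases. It turns out that if $S \in \mathrm{Obs}(C)$, then every column in $C$ consists either entirely of elements of $S$ or entirely of elements outside of $S$. (This is a *necessary* condition for being observable, not a sufficient one.)

**Definition:** Given a Cartesian frame $C = (A, E, \cdot)$ over $W$, and a subset $S$ of $W$, let $E_S$ denote the subset $\{e \in E \mid \forall a \in A, a \cdot e \in S\}$.

**Lemma:** If $S \in \mathrm{Obs}(C)$, then for all $e \in E$, it is either the case that $e \in E_S$ or $e \in E_{W \setminus S}$.

**Proof:** Take $S \in \mathrm{Obs}(C)$, and assume for contradiction that there exists an $e \in E$ in neither $E_S$ nor $E_{W \setminus S}$. Thus, there exists an $a_0 \in A$ such that $a_0 \cdot e \notin S$ and an $a_1 \in A$ such that $a_1 \cdot e \in S$. Since $S \in \mathrm{Obs}(C)$, there must exist an $a \in A$ such that $a \in \mathrm{if}(S, a_0, a_1)$. Consider whether or not $a \cdot e \in S$. If $a \cdot e \in S$, then $a \cdot e = a_0 \cdot e \notin S$. However, if $a \cdot e \notin S$, then $a \cdot e = a_1 \cdot e \in S$. Either way, this is a contradiction.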

This lemma immediately gives us the following theorem showing that in nontrivial Cartesian frames, observables and controllables are disjoint.

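**Theorem:** Let $C$ be a Cartesian frame over $W$, with $\mathrm{Env}(C)$ nonempty. Then, $\mathrm{Ctrl}(C) \cap \mathrm{Obs}(C) = \{\}$.

**Proof:** Let $e \in \mathrm{Env}(C)$, and suppose for contradiction that $S \in \mathrm{Ctrl}(C) \cap \mathrm{Obs}(C)$. Since $S \in \mathrm{Ensure}(C)$, there exists an $a_0 \in \mathrm{Agent}(C)$ such that $a_0 \cdot e \in S$. Since $S \in \mathrm{Prevent}(C)$, there exists an $a_1 \in \mathrm{Agent}(C)$ such that $a_1 \cdot e \notin S$. This contradicts our lemma above.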
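As a sanity check rather than a proof, the theorem can be tested exhaustively on tiny frames. The sketch below enumerates every frame over a hypothetical three-element world set with up to three possible agents and one or two environments, and then shows the degenerate empty-environment case that the nonemptiness hypothesis excludes. Frames are lists of rows; all names are illustrative.

```python
from itertools import combinations, product

W = [0, 1, 2]  # hypothetical three-element set of worlds
SUBSETS = [frozenset(c) for r in range(len(W) + 1) for c in combinations(W, r)]

def ctrl(rows):
    ens = {S for S in SUBSETS if any(all(w in S for w in row) for row in rows)}
    prev = {S for S in SUBSETS if any(all(w not in S for w in row) for row in rows)}
    return ens & prev

def obs(rows):
    def in_if(S, r0, r1, r):
        return all(w == (w0 if w in S else w1) for w, w0, w1 in zip(r, r0, r1))
    return {
        S for S in SUBSETS
        if all(any(in_if(S, r0, r1, r) for r in rows) for r0 in rows for r1 in rows)
    }

def frames(n_agents, n_envs):
    """Every n_agents-by-n_envs matrix with entries in W."""
    for flat in product(W, repeat=n_agents * n_envs):
        yield [flat[i * n_envs:(i + 1) * n_envs] for i in range(n_agents)]

violations = [
    rows
    for n_agents in (1, 2, 3)
    for n_envs in (1, 2)
    for rows in frames(n_agents, n_envs)
    if ctrl(rows) & obs(rows)
]
print(len(violations))  # 0

# With one possible agent and no environments, every set is both controllable and observable.
degenerate = ctrl([()]) & obs([()])
print(len(degenerate))  # 8
```

The second check is why the theorem requires a nonempty environment: with no columns, every quantifier over environments is vacuously satisfied.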

**5.1. Properties That Are Both Observable and Ensurable Are Inevitable**

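We also have a one-sided result showing that if $S$ is both observable and ensurable in $C$, then $S$ must be inevitable; i.e., the entire matrix must be contained in $S$.

We’ll first define a Cartesian frame’s image, which is the subset of $W$ containing every possible world that is actually hit by the evaluation function: the set of worlds that show up in the matrix.

**Definition:** $\mathrm{Image}(C) = \{w \in W \mid \exists a \in A, \exists e \in E \text{ s.t. } a \cdot e = w\}$.

$\mathrm{Image}(C) \subseteq S$ can be thought of as a degenerate form of either $S \in \mathrm{Ensure}(C)$ or $S \in \mathrm{Obs}(C)$, where in the first case, the agent must make it the case that $S$, and in the second case the agent can do conditional policies because the condition is never realized.^{1} Conversely, if an agent can both observe and ensure $S$, then the observability and the ensurability must both be degenerate.

**Theorem:** $S \in \mathrm{Ensure}(C) \cap \mathrm{Obs}(C)$ if and only if $S \supseteq \mathrm{Image}(C)$ and $\mathrm{Agent}(C)$ is nonempty.

**Proof:** Let $C = (A, E, \cdot)$ be a Cartesian frame over $W$. First, if $S \supseteq \mathrm{Image}(C)$, then $S \in \mathrm{Obs}(C)$, since $a_0 \in \mathrm{if}(S, a_0, a_1)$ for all $a_0, a_1 \in A$. If $A$ is also nonempty, then $S \in \mathrm{Ensure}(C)$, since there exists an $a \in A$, and for all $e \in E$, $a \cdot e \in \mathrm{Image}(C) \subseteq S$.

Conversely, if $A$ is empty, $\mathrm{Ensure}(C)$ is empty, so $S \notin \mathrm{Ensure}(C) \cap \mathrm{Obs}(C)$. If $S \not\supseteq \mathrm{Image}(C)$, then there exist $a_0 \in A$ and $e \in E$ such that $a_0 \cdot e \notin S$. Then $S \notin \mathrm{Ensure}(C) \cap \mathrm{Obs}(C)$, since if $S \in \mathrm{Ensure}(C)$, there exists an $a_1 \in A$ such that in particular $a_1 \cdot e \in S$, so $e$ is in neither $E_S$ nor $E_{W \setminus S}$, which implies $S \notin \mathrm{Obs}(C)$.

**Corollary:** If $\mathrm{Agent}(C)$ is nonempty, $\mathrm{Ensure}(C) \cap \mathrm{Obs}(C) = \langle \{\mathrm{Image}(C)\} \rangle_{\supset}$.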

**Proof:** Trivial.

**5.2. Controllables and Observables in Degenerate Frames**

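All of the results so far have shown that an agent’s observables and controllables cannot simultaneously be too large. We also have some results that in some extreme cases, $\mathrm{Ctrl}(C)$ and $\mathrm{Obs}(C)$ cannot be too small. In particular, if there are few possible agents, observables must be large, and if there are few possible environments, controllables must be large.

**Claim:** If $|\mathrm{Agent}(C)| \leq 1$, then $\mathrm{Obs}(C) = \mathcal{P}(W)$.

**Proof:** If $\mathrm{Agent}(C)$ is empty, $S \in \mathrm{Obs}(C)$ for all $S \subseteq W$ vacuously. If $\mathrm{Agent}(C) = \{a\}$ is a singleton, then $S \in \mathrm{Obs}(C)$ for all $S \subseteq W$, because $a \in \mathrm{if}(S, a, a)$.

**Claim:** If $\mathrm{Agent}(C)$ is nonempty and $\mathrm{Env}(C)$ is empty, then $\mathrm{Ensure}(C) = \mathrm{Prevent}(C) = \mathrm{Ctrl}(C) = \mathcal{P}(W)$. If $\mathrm{Agent}(C)$ is nonempty and $\mathrm{Env}(C)$ is a singleton, $\mathrm{Ensure}(C) = \{S \subseteq W \mid S \cap \mathrm{Image}(C) \neq \{\}\}$ and $\mathrm{Ctrl}(C) = \{S \subseteq W \mid S \cap \mathrm{Image}(C) \neq \{\} \text{ and } \mathrm{Image}(C) \not\subseteq S\}$.

**Proof:** If $\mathrm{Agent}(C)$ is nonempty and $\mathrm{Env}(C)$ is empty, then $S \in \mathrm{Ensure}(C)$ and $S \in \mathrm{Prevent}(C)$ for all $S \subseteq W$ vacuously.

If $\mathrm{Agent}(C)$ is nonempty and $\mathrm{Env}(C) = \{e\}$ is a singleton, every $S$ that intersects $\mathrm{Image}(C)$ nontrivially is in $\mathrm{Ensure}(C)$, since if $w \in S \cap \mathrm{Image}(C)$, there must be some $a \in \mathrm{Agent}(C)$ such that $a \cdot e = w$; this $a$ satisfies $a \cdot e' \in S$ for all $e' \in \mathrm{Env}(C)$. Conversely, if $S$ and $\mathrm{Image}(C)$ are disjoint, no $a \in \mathrm{Agent}(C)$ can satisfy this property. The result for $\mathrm{Ctrl}(C)$ then follows trivially from the result for $\mathrm{Ensure}(C)$.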

**5.3. A Suggestion of Time**

Cartesian frames as we’ve been discussing them are agnostic about time. Possible agents, environments, and worlds could represent snapshots of a particular moment in time, or they could represent lengthy processes.

The fact that an agent’s controllables and observables are disjoint, however, suggests a sort of arrow of time, where facts an agent can observe must be “before” the facts that agent has control over. This hints that we may be able to use Cartesian frames to formally represent temporal relationships.

One reason it would be nice to represent time is that we could model agents that repeatedly learn things, expanding their set of observables. Suppose that in some frame $C$, $\mathrm{Agent}(C)$ includes choices the agent makes over an entire calendar year. $C$’s observables would only include the facts the agent can condition on at the start of the year, when it’s first able to act; we haven’t defined a way to formally represent the agent learning new facts over the course of the year.

It turns out that this additional temporal structure *can* be elegantly expressed using Cartesian frames. We will return to this topic in the very last post in this sequence. For now, however, we only have this hint that particular Cartesian frames have something like a “before” and “after.”

## 6. Why Cartesian Frames?

The goal of this sequence will be to set up the language for talking about problems using Cartesian frames.

Concretely, I’m writing this sequence because:

- I’ve recently found that I have a new perspective to bring to a lot of other MIRI researchers’ work. This perspective seems to me to be captured in the mathematical structure of Cartesian frames, but it’s the new perspective rather than the mathematical structure per se that seems important to me. I want to try sharing this mathematical object and the accompanying philosophical interpretation, to see if it successfully communicates the perspective.
- I want collaborators to work with on Cartesian frames. If you’re a math person who finds the things in this sequence exciting, I’d be interested in talking about it more. You can comment, PM, or email me.
- I want help with paradigm-building, but I also want there to be an ecosystem where people do normal science within this paradigm. I would consider it a good outcome if there existed a decent-sized group of people on the AI Alignment Forum and LessWrong for whom it makes just as much sense to pull out the Cartesian frames paradigm as it makes to pull out the cybernetic agent paradigm.

Below, I will say more about the cybernetic agent model and other ideas that helped motivate Cartesian frames, and I will provide an overview of upcoming posts in the sequence.

**6.1. Cybernetic Agent Model**

The cybernetic agent model describes an agent and an environment interacting over time:

In “Embedded Agency,” Abram Demski and I noted that cybernetic agents like Marcus Hutter’s AIXI are dualistic, whereas real-world agents will be embedded in their environment. Like a Cartesian soul, AIXI is crisply separated from its environment.

The dualistic model is often useful, but it’s clearly a simplification that works better in some contexts than in others. One thing it would be nice to have is a way to capture the useful things about this simplification, while treating it as a high-level approximation with known limitations — rather than treating it as ground truth.

Cartesian frames carve up the world into a separate “agent” and “environment,” and thereby adopt the basic conceit of dualistic Hutter-style agents. However, they treat this as a “frame” imposed on a more embedded, naturalistic world.^{2}

Cartesian frames serve the same sort of intellectual function as the cybernetic agent model, and are intended to supersede this model. Our hope is that a less discrete version of ideas like “agent,” “action,” and “observation” will be better able to tolerate edge cases. E.g., we want to be able to model weirder, loopier versions of “inputs” that operate across multiple levels of description.

We will also devote special attention in this sequence to subagents, which are very difficult to represent in traditional dualistic models. In game theory, for example, we carve the world into different “agent” and “non-agent” parts, but we can’t represent nontrivial agents that intersect other agents. A large part of the theory in this sequence will be giving us a language for talking about subagents.

**6.2. Deriving Functional Structure**

Another way of summarizing this sequence is that we’ll be applying *reasoning* like Pearl’s to *objects* like game theory’s, with a *motivation* like Hutter’s.

In Judea Pearl’s causal models, you are given a bunch of variables, and an enormous joint distribution over the variables. The joint distribution is a large object that has a relational structure as opposed to a functional structure.

You then deduce something that looks like time and causality out of the combinatorial properties of the joint distribution. This takes the form of causal diagrams, which give you functions and counterfactuals.

This has some similarities to how we’ll be using Cartesian frames, even though the formal objects we’ll be working with are very different from Pearl’s. We want a model that can replace the cybernetic agent model with something more naturalistic, and our plan for doing this will involve deriving things like time from the combinatorial properties of possible worlds.

We can imagine the real world as an enormous static object, and we can imagine zooming in on different levels of the physical world and sometimes seeing things that look like local functions. (“Ah, no matter what the rest of the world looks like, I can compute the state of $Y$ from the state of $X$, relative to my uncertainty.”) Switching which part of the world we’re looking at, or switching which things we’re lumping together versus splitting, can then change which things look like functions.

Agency itself, as we normally think about it, is functional in this way: there are multiple “possible” inputs, and whichever “option” we pick yields a deterministic result.

We want an approach to agency that treats this functional behavior less like a unique or fundamental feature of the world, and more like a special case of the world’s structure in general — and one that may depend on what we’re splitting or lumping together.

“We want to apply Pearl-like methods to Cartesian frames” is also another way of saying “we want to do the formal equivalent of inferring extensive-form games from normal-form games,” our summary from before. The analogy is:

| | base information | derived information |
|---|---|---|
| causality | joint probability distribution | causal diagram |
| games | normal-form game | extensive-form game |
| agency | Cartesian frame | control, observation, subagents, time, etc. |

The game theory analogy is more relevant formally, while the Pearl analogy better explains why we’re interested in this derivation.

Just as notions of time and information state are basic in causal diagrams and extensive-form games, so are they basic in the cybernetic agent model; and we want to make these aspects of the cybernetic agent model derived rather than basic, where it’s possible to derive them. We also want to be able to represent things like subagents that are entirely missing from the cybernetic agent model.

Because we aren’t treating high-level categories like “action” or “observation” as primitives, we can hope to end up with an agent model that will let us model more edge cases and odd states of the system. A more derived and decomposable notion of time, for example, might let us better handle settings where two agents are both trying to reach a decision based on their model of the other agent’s future behavior.

We can also hope to distinguish features of agency that are more description-invariant from features that depend strongly on how we carve up the world.

One philosophical difference between our approach and Pearl’s is that we will avoid the assumption that the space of possible worlds factors nicely into variables that are given to the agent. We want to instead just work with a space of possible worlds, and derive the variables for ourselves; or we may want to work in an ontology that lets us reason with multiple incompatible factorizations into variables.^{3}

**6.3. Contents**

The rest of the sequence will cover these topics:

**2. Additive Operations on Cartesian Frames**—We talk about the category of Chu spaces, and introduce two additive operations one can do on Cartesian frames: sum , and product . We talk about how to interpret these operations philosophically, in the context of agents making choices to affect the world. We also introduce the small Cartesian frame , and its dual .

**3. Biextensional Equivalence**—We define homotopy equivalence for Cartesian frames, and introduce the small Cartesian frames , , and .

**4. ****Controllables and Observables, Revisited**—We use our new language to redefine controllables and observables.

**5. ****Functors and Coarse Worlds**—We show how to compare frames over a detailed world model and frames over a coarse version of that world model . We demonstrate that observability is a function not only of the observer and the observed, but of the level of description of the world.

**6. Subagents of Cartesian Frames**—We introduce the notion of a frame whose agent is the subagent of a frame , written . A subagent is an agent playing a game whose stakes are another agent’s possible choices. This notion turns out to yield elegant descriptions of a variety of properties of agents.

**7. Multiplicative Operations on Cartesian Frames**—We introduce three new binary operations on Cartesian frames: tensor , par , and lollipop .

**8. Sub-Sums and Sub-Tensors**—We discuss spurious environments, and introduce variants of sum, , and tensor, , that can remove some (but not too many) spurious environments.

**9. Additive and Multiplicative Subagents**—We discuss the difference between additive subagents, which are like future versions of the agent after making some commitment; and multiplicative subagents, which are like agents acting within a larger agent.

**10. Committing, Assuming, Externalizing, and Internalizing**—We discuss the additive notion of producing subagents and sub-environments by *committing* or *assuming*, and the multiplicative notion of *externalizing* (moving part of the agent into the environment) and *internalizing* (moving part of the environment into the agent).

**11. Eight Definitions of Observability**—We use our new tools to provide additional definitions and interpretations of observables. We talk philosophically about the difference between defining what’s observable using product and defining what’s observable using tensor, which corresponds to the difference between updateful and updateless observations.

**12. Time in Cartesian Frames**—We show how to formalize temporal relations with Cartesian frames.

I’ll be releasing new posts most non-weekend days between now and November 11.

As Ben noted in his announcement post, I’ll be giving talks and holding office hours this Sunday at 12-2pm PT and the following three Sundays at 2-4pm PT, to answer questions and discuss Cartesian frames. Everyone is welcome.

The online talks, covering much of the content of this sequence, will take place **this Sunday at 12pm PT** (~~Zoom link~~ added: recording of the talk) and **next Sunday at 2pm PT**.

This sequence is communicating ideas I have been developing slowly over the last year. Along the way, I have gotten a lot of help from conversations with many people. Thanks to Alex Appel, Rob Bensinger, Tsvi Benson-Tilsen, Andrew Critch, Abram Demski, Sam Eisenstat, David Girardo, Evan Hubinger, Edward Kmett, Alexander Gietelink Oldenziel, Steve Rayhawk, Nate Soares, and many others.

## Footnotes

1. This assumes a non-empty . Otherwise, could be empty and therefore a subset of , even though is not ensurable (because you need an element of in order to ensure anything). ↩

2. This is one reason for the name “Cartesian frames.” Another reason for the name is to note the connection to Cartesian products. In linear algebra, a frame of an inner product space is a generalization of a basis of a vector space to sets that may be linearly dependent. With Cartesian frames, then, we have a Cartesian product that projects onto the world, not necessarily injectively. (Cartesian frames aren’t actually “frames” in the linear-algebra sense, so this is only an analogy.) ↩

3. This, for example, might let us talk about a high-level description of a computation being “earlier” in some sort of logical time than the exact details of that same computation.

Problems like agent simulates predictor make me think that we shouldn’t treat the world as factorizing into a single “true” set of variables at all, though I won’t attempt to justify that claim here. ↩


Planned summary (of the full sequence) for the Alignment Newsletter:

Planned opinion:

Looks like a pretty good summary to me.

Should that say B instead of A’, or have I misunderstood? (I haven’t read most of the sequence.)

It should, good catch, thanks!

The use of Chu spaces is very interesting. This is also a great introduction to Chu spaces.

I was able to formalize the example in the research automated theorem prover Avalog: https://github.com/advancedresearch/avalog/blob/master/source/chu_space.txt

It is still very basic, but shows potential. Perhaps Avalog might be used to check some proofs about Cartesian frames.

This is very exciting. Looking forward to the rest of the sequence.

As I was reading, I found myself reframing a lot of things in terms of the rows and columns of the matrix. Here’s my loose attempt to rederive most of the properties under this view.

The world is a set of states. One way to think about these states is by putting them in a matrix, which we call “cartesian frame.” In this frame, the rows of the matrix are possible “agents” and the columns are possible “environments”.

Note that you don’t have to put all the states in the matrix.

Ensurables are the parts of the world that the agent can always ensure we end up in. Ensurables are the rows of the matrix, closed under supersets.

Preventables are the parts of the world that the agent can always ensure we don’t end up in. Preventables are the complements of the rows, closed under subsets.

Controllables are parts of the world that are both ensurable and preventable. Controllables are rows (or sets of rows) for which there exist rows that are disjoint from them. [edit: previous definition of “contains elements not found in other rows” was wrong, see comment by crabman]
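The row-based reading above can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: a frame is represented as a list of rows of world-state labels, the function names are hypothetical, and the four-row `example` frame is made up for illustration (its labels borrow the `ur`/`us` style used elsewhere in this thread, but it is not claimed to be any specific frame from the post).

```python
from typing import Hashable, Sequence, Set

# frame[i][j] is the world that results when the agent picks row i
# and the environment picks column j (assumed representation).
Frame = Sequence[Sequence[Hashable]]


def ensurable(frame: Frame, s: Set[Hashable]) -> bool:
    """True if some row lies entirely inside s (s is a superset of a row)."""
    return any(all(w in s for w in row) for row in frame)


def preventable(frame: Frame, s: Set[Hashable]) -> bool:
    """True if some row lies entirely outside s (s is disjoint from a row)."""
    return any(all(w not in s for w in row) for row in frame)


def controllable(frame: Frame, s: Set[Hashable]) -> bool:
    """True if s is both ensurable and preventable."""
    return ensurable(frame, s) and preventable(frame, s)


# Hypothetical frame, assumed for illustration only.
example = [
    ["ur", "us"],
    ["nr", "ns"],
    ["ur", "ns"],
    ["nr", "us"],
]

row_set = {"ur", "us"}
row_set_controllable = controllable(example, row_set)
whole_image_controllable = controllable(example, {"ur", "us", "nr", "ns"})
```

Here `{"ur", "us"}` comes out controllable because the row `["nr", "ns"]` is disjoint from it, even though `"ur"` also appears in a third row. The whole image is ensurable but not preventable, so it is not controllable.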

Observables are parts of the environment that the agent can observe and act conditionally according to. Observables are columns such that for every pair of rows there is a third row that equals the 1st row if the environment is in that column and the 2nd row otherwise. This means that for every two rows, there’s a third row that’s made by taking the first row and swapping elements with the 2nd row where it intersects with the column.

Observables have to be sets of columns because if they weren’t, you can find a column that is partially observable and partially not. This means you can build an action that says something like “if I am observable, then I am not observable. If I am not observable, I am observable” because the swapping doesn’t work properly.
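A brute-force check of this condition might look like the sketch below. It assumes the same list-of-rows representation of a frame, uses hypothetical function names, and reuses a made-up four-row frame. The `if_set` helper follows the conditional-policy reading: a row qualifies if, wherever it lands in the set, it agrees with the first row, and wherever it lands outside the set, it agrees with the second.

```python
from itertools import product
from typing import Hashable, List, Sequence, Set

Frame = Sequence[Sequence[Hashable]]


def if_set(frame: Frame, s: Set[Hashable], i0: int, i1: int) -> List[int]:
    """Indices of rows a such that, in every column j:
    a[j] in s implies a[j] == frame[i0][j], and
    a[j] not in s implies a[j] == frame[i1][j]."""
    matches = []
    for i, row in enumerate(frame):
        ok = all(
            (w == frame[i0][j]) if w in s else (w == frame[i1][j])
            for j, w in enumerate(row)
        )
        if ok:
            matches.append(i)
    return matches


def observable(frame: Frame, s: Set[Hashable]) -> bool:
    """True if every ordered pair of rows has at least one conditional row."""
    n = len(frame)
    return all(if_set(frame, s, i0, i1) for i0, i1 in product(range(n), repeat=2))


# Hypothetical frames, assumed for illustration only.
four_rows = [
    ["ur", "us"],
    ["nr", "ns"],
    ["ur", "ns"],
    ["nr", "us"],
]
two_rows = four_rows[:2]

first_column = {"ur", "nr"}
column_observable = observable(four_rows, first_column)
column_observable_small = observable(two_rows, first_column)
partial_column_observable = observable(four_rows, {"ur"})
```

In the four-row frame the first column is observable, because every mix of a first-column entry with a second-column entry exists as a row. With only two rows the mixed rows are missing, so the same column is not observable. A set such as `{"ur"}`, which covers only part of a column, fails because some pair of rows has an empty `if_set`.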

Observables are closed under boolean combination (note it’s sufficient to show closure under complement and unions):

Since swapping index 1 of a row is the same as swapping all non-1 indexes, observables are closed under complements.

Since you can swap indexes 1 and 2 by first swapping index 1, then swapping index 2, observables are closed under union.

This is equivalent to saying “If A or B, then a0, else a2” is logically equivalent to “if A, then a0, else (if B, then a0, else a2)”
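The logical equivalence behind closure under union can be checked exhaustively over truth values. The sketch below is only a sanity check of that propositional step, with hypothetical function names; it does not by itself verify closure for any particular frame.

```python
from itertools import product


def flat_rule(a: bool, b: bool) -> str:
    # "If A or B, then a0, else a2"
    return "a0" if (a or b) else "a2"


def nested_rule(a: bool, b: bool) -> str:
    # "if A, then a0, else (if B, then a0, else a2)"
    return "a0" if a else ("a0" if b else "a2")


# Compare the two rules on all four combinations of truth values.
rules_agree = all(
    flat_rule(a, b) == nested_rule(a, b)
    for a, b in product([False, True], repeat=2)
)
```

Because the two rules agree on every input, a conditional on the union of two observables can be built by nesting conditionals on each one separately.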

Since controllables are rows with specific properties and observables are columns with specific properties, then nothing can be both controllable and observable. (The only possibility is the entire matrix, which is trivially not controllable because it’s not preventable)

This assumes that the matrix has at least one column

The image of a cartesian frame is the actual matrix part.

Since an ensurable is a row (or superset) and an observable is a column (or set of columns), then if something is ensurable and observable, then it must contain every column, so it must be the whole matrix (image).

If the matrix has 1 or 0 rows, then the observable constraint is trivially satisfied, so the observables are all possible sets of (possible) environment states (since columns of length 0 or 1 are the same as states).

“0 rows” doesn’t quite make sense, but just pretend that you can have a 0 row matrix which is just a set of world states.

If the matrix has 0 columns, then the ensurable/preventable constraint is trivially satisfied, so the ensurables are the same as the preventables are the same as the controllables, which are all possible sets of (possible) environment states (since “length 0” rows are the same as states).

“0 columns” doesn’t make that much sense either, but pretend that you can have a 0 column matrix which is just a set of world states.

If the matrix has exactly 1 column, then the ensurable/preventable constraint is trivially satisfied for states in the image (matrix), so the ensurables are all non-empty sets of states in the matrix (since length 1 columns are the same as states), closed under union with states outside the matrix. It should be easy to see that controllables are all possible sets of states that intersect the matrix non-trivially, closed under union with states outside the matrix.

Constructing this more explicitly: Suppose that $a_1 e \in S$ and $a_2 e \in W \setminus S$. Then $\text{if}(S, a_2, a_1)$ must be empty. This is because for any action $a_3$ in the set $\text{if}(S, a_2, a_1)$, if $a_3 e$ was in $S$ then it would have to equal $a_2 e$, which is not in $S$, and if $a_3 e$ was not in $S$ it would have to equal $a_1 e$, which is in $S$.

Since if(S,a2,a1) is empty, S is not observable.

According to your interpretation of controllables, in C2, {ur,us} isn’t controllable, because it contains ur, which can be found in another row. By the original definition, it’s controllable.

Good point—I think the correct definition is something like “rows (or sets of rows) for which there exists a row which is disjoint”

I feel like this analogy should make it possible to compress the definition of some agents; for example the agent that consists of the intersection of two agents, I would expect to be able to be represented as some combination of the two rows representing those two agents. It’s not clear to me how to do that, in particular because the elements of the matrix are “outcomes” which don’t have any arithmetic structure.

I will be hosting a readthrough of this sequence on MIRIxDiscord again, PM for a link.

Do we lose much by restricting attention to finite Cartesian frames (i.e., with finite agent and environment)? I ask because I’m formalising these results in higher-order logic (HOL), and the category Chu(w) is too big to represent if it really must contain frames with infinite agents and also for any pair of frames the frame whose agents are the morphisms between them. The root problem is probably that I require any category’s class of objects to be a set, but it’s hard to avoid this requirement in HOL in a nice way. Everything should work out for finite frames though. (I haven’t come across any compelling examples of infinite frames, but I haven’t tried hard to think of them.)

I don’t think you lose much by focusing on finite Cartesian frames. I have mostly only been imagining finite cases.

I think there is some potential for later extending the theory to encompass game theory and probabilistic strategies, and then we might want to think of the infinite space of mixed strategies as the agent, but it wouldn’t surprise me if in doing this, we also put continuity into the system and want to assume compactness.

To see that *some* restriction is required here (not imposed by HOL), consider that if $\text{Chu}(w)$ may contain arbitrary Cartesian frames over $w$ then we would have an injection $2^{\text{Chu}(w)} \to \text{Chu}(w)$ that, for example, encodes a set $S \subseteq \text{Chu}(w)$ as the Cartesian frame $C_S$ with $\text{Agent}(C_S) = S$ (the environment and evaluation function are unimportant), which runs afoul of Cantor’s theorem regarding the cardinality of $\text{Chu}(w)$.

I wouldn’t be surprised if a similar encoding/injection could be made using just the operations used to construct Cartesian frames that appear in this sequence, though I have not found one explicitly myself yet.

Curated.

I’m exceedingly excited about this sequence. The Embedded Agency sequence laid out a core set of confusions, and it seems like this is a formal system that deals with those issues far better than the current alternatives e.g. the cybernetics model. This post lays out the basics of Cartesian Frames clearly and communicates key parts of the overall approach (“*reasoning* like Pearl’s to *objects* like game theory’s, with a *motivation* like Hutter’s”). I’ve also never seen math explained with as much helpful philosophical justification (e.g. “Part of the point of the Cartesian frame framework is that we are not privileging either interpretation”), and I appreciate all of that quite a bit.

It seems likely that by the end of this sequence it will be on a list of my all-time favorite things posted to LessWrong 2.0. I’m looking forward to getting to grips with Cartesian Frames, understanding how they work, and to start applying those intuitions to my other discussions of agency.

I’m also curating it a little quickly to let people know that Scott is giving a talk on this sequence this Sunday at 12:00PM PT. Furthermore, Scott is holding weekly office hours (see the same link for more info) for people to ask questions, and Diffractor is running a reading group in the MIRIx Discord, which I recommend people PM him to get an invite to (I just did so myself, it’s a nice Discord server).

If every pair (a,e) led to a different world-state, this would be the boring case of complete factorizability, right? As in, you couldn’t distinguish this from the world having no dynamics at all, just a recording of the choices of a and e. Therefore it seems important that your dynamics send some pairs of choices to identical states.

But that’s not necessarily how the micro-scale laws of physics work. You can’t squish state space irreversibly like that. And so W can’t be the actual microphysical world, it has to be some macro-level abstract model of it, or else it’s boring.

So I’m a little confused about what you have in mind when you talk about putting different bases A and E onto the same W. What’s so great about keeping the same W, if it’s an abstraction of the microphysical world, tailor-made to help us model exactly this agent? I suspect that the answer is that you’re using this to model an agent that also has subagents, so I’m excited for that post :)

Your suspected answer is right.

In 4.1:

I think there’s a typo here. Should be a3∈if(T,a0,a2), not a3∈if(S,a0,a2).

(also not sure how to copy latex properly).

Yep. Fixed. Thanks.

Printable PDF of whole sequence with comments

https://drive.google.com/file/d/1gW6btBWvk1mMCPItLt9wPYApc8caiUCa/view?usp=drivesdk

Did posts on generalised models as a category and how one can see Cartesian frames as generalised models.

I like it. I’ll think about how it fits with my ways of thinking (eg model splintering).

Great framework—feels like this is touching on something fundamental.

I’m curious: is the controllable / observable terminology intentionally borrowed from control theory? Or is that a coincidence?

coincidence

“Because we’re discussing an agent that has the freedom to choose between multiple possibilities”

Where is this freedom, exactly? Freedom meaning a lack of being utterly and totally constrained by prior causes.

First: https://www.lesswrong.com/posts/Mc6QcrsbH5NRXbCRX/dissolving-the-question and then: http://lesswrong.com/lw/r0/thou_art_physics/