Cartesian Boundary as Abstraction Boundary

Meta note: this post in­volves re­defin­ing sev­eral con­cepts in ways al­most, but not quite, the same as their usual defi­ni­tions—risk­ing quite a bit of con­fu­sion. I’m open to sug­ges­tions for bet­ter names for things.

In­tu­ition: a model of hu­mans (or AIs, or other agenty things) as “agents” sep­a­rate from their “en­vi­ron­ment” only makes sense at a zoomed-out, ab­stract level. When we look at the close-up de­tails, the agent-en­vi­ron­ment dis­tinc­tion doesn’t ex­ist; there is no “Carte­sian bound­ary” with in­ter­ac­tions be­tween the two sides me­di­ated by well-defined in­put/​out­put chan­nels.

Mean­while, back in the lab, some guy has been talk­ing about a gen­eral for­mal­iza­tion of ab­strac­tion. Ele­va­tor pitch ver­sion: far-apart com­po­nents of a low-level model are in­de­pen­dent given some high-level sum­mary data. For ex­am­ple, the dy­nam­ics of the plas­mas in far-apart stars are ap­prox­i­mately in­de­pen­dent given some high-level sum­mary of each star—speci­fi­cally its to­tal mass, mo­men­tum and cen­ter-of-mass po­si­tion. The low-level model in­cludes full plasma dy­nam­ics; the high-level ab­stract model just in­cludes one point mass rep­re­sent­ing each star.
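To make the star example a bit more concrete, here is a minimal numerical sketch. It is a static, deterministic stand-in for the probabilistic claim, not the formalization itself: a blob of particles plays the role of a star, and we check that its pull on a far-away point is reproduced almost exactly by a single point mass. The function names, the unit system (`G = 1`), and the particle counts are all made up for illustration. Momentum is left out because nothing moves in this toy version.

```python
import math
import random

G = 1.0  # gravitational constant in arbitrary units (illustrative assumption)

def make_star(center, radius, n, rng):
    """Return n (mass, position) particles scattered in a cube around center."""
    particles = []
    for _ in range(n):
        pos = tuple(c + rng.uniform(-radius, radius) for c in center)
        particles.append((rng.uniform(0.5, 1.5), pos))
    return particles

def summarize(particles):
    """High-level summary: total mass and center-of-mass position."""
    total = sum(m for m, _ in particles)
    com = tuple(sum(m * p[i] for m, p in particles) / total for i in range(3))
    return total, com

def pull(mass, source, target):
    """Newtonian acceleration at target due to a point mass at source."""
    d = [s - t for s, t in zip(source, target)]
    r = math.sqrt(sum(x * x for x in d))
    return [G * mass * x / r**3 for x in d]

def low_level_pull(particles, target):
    """Low-level model: add up the pull of every particle individually."""
    acc = [0.0, 0.0, 0.0]
    for m, p in particles:
        for i, a in enumerate(pull(m, p, target)):
            acc[i] += a
    return acc

rng = random.Random(0)
star = make_star(center=(0.0, 0.0, 0.0), radius=1.0, n=2000, rng=rng)
far_point = (1000.0, 0.0, 0.0)  # "far away" relative to the star's size

exact = low_level_pull(star, far_point)
total_mass, com = summarize(star)
approx = pull(total_mass, com, far_point)

rel_err = math.dist(exact, approx) / math.hypot(*exact)
print(f"relative error of point-mass summary: {rel_err:.2e}")
```

The relative error should come out tiny, and it grows as `far_point` moves closer to the blob. That is the sense in which "far away" is doing real work in the abstraction.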

This no­tion of ab­strac­tion sounds like it should fit our in­tu­ition about agency: we model hu­mans/​AIs/​etc as agents in some high-level model, but not nec­es­sar­ily in the un­der­ly­ing low-level model.

This post will flesh out that idea a bit. It will illus­trate the gen­eral pic­ture of an ab­strac­tion bound­ary serv­ing as a Carte­sian bound­ary for a high-level agent model.


To avoid pre­ma­turely drag­ging in con­fus­ing in­tu­itions about agency, we’ll start by talk­ing about a pro­gram run­ning on a lap­top.

The source code of a pro­gram speci­fies ex­actly what it should do at a high level; it speci­fies the ab­stract model. It does not spec­ify the low-level be­hav­ior, e.g. ma­chine code and reg­ister ad­dresses, or pre­cise voltages in each tran­sis­tor at each time, or the phys­i­cal lay­out of cir­cuit com­po­nents on the CPU. The job of com­piler/​op­er­at­ing sys­tem writ­ers and elec­tri­cal en­g­ineers is to de­sign and im­ple­ment a low-level sys­tem which cor­rectly ab­stracts into the high-level model speci­fied by the source code. We’ll call the whole sys­tem for im­ple­ment­ing pro­grams out of atoms a “com­piler stack”; it in­cludes not just com­pilers, but also the phys­i­cal hard­ware on which a pro­gram runs. It’s an end-to-end tool for turn­ing source code into phys­i­cal be­hav­ior.

What does it mean, math­e­mat­i­cally, to de­sign such a com­piler stack? How do we for­mal­ize that prob­lem?

Compiler stack design problem: make the low-level physical system abstract correctly into a high-level model that matches the specification given by some source code.

Well, we have a model of the whole world, and we’d like to fac­tor out two high-level com­po­nents: the sys­tem (i.e. pro­gram run­ning on a lap­top) and ev­ery­thing “far away” from the sys­tem. In be­tween will be lots of noisy vari­ables which me­di­ate in­ter­ac­tions be­tween these two com­po­nents—air molecules, com­mu­ni­ca­tion in­fras­truc­ture, etc. Those noisy in­ter­me­di­ates will wipe out most in­for­ma­tion about low-level de­tails: things far away from the lap­top won’t be able to de­tect pre­cise voltages in in­di­vi­d­ual tran­sis­tors, and the lap­top it­self won’t be able to de­tect po­si­tions of in­di­vi­d­ual molecules in far-away ob­jects (though note that our no­tion of “far away” can be ad­justed to ac­count for e.g. fancy sen­sors; it need not be phys­i­cal dis­tance). With the noisy in­ter­me­di­ates wiping out all that low-level de­tail, the lap­top’s be­hav­ior will de­pend only on some high-level sum­mary data about things far away, and the be­hav­ior of things far away will de­pend only on some high-level sum­mary data about the lap­top.

The high-level sum­maries typ­i­cally go by spe­cial names:

  • The high-level sum­mary con­tain­ing in­for­ma­tion about the lap­top which is rele­vant to the rest of the world is called “out­put”.

  • The high-level sum­mary con­tain­ing in­for­ma­tion about the rest of the world which is rele­vant to the lap­top is called “in­put”.

The goal of the com­piler stack is to make the out­put match the speci­fi­ca­tion given by the source code, on any in­put, while min­i­miz­ing side-chan­nel in­ter­ac­tion with the rest of the world (i.e. main­tain­ing the ab­strac­tion bound­ary).

For instance, let’s say I write and run a simple Python script:

x = 12
print(5*x + 3)

This script specifies a high-level behavior: specifically, the number “63” appearing on my screen. That’s the part visible from “far away”, i.e. visible to me, the user. The rest is generally invisible: I don’t observe voltages in specific transistors, or values in registers, or machine code, and so on. Of course I could open up the box and observe those things, but then I’d be breaking the abstraction. In general, when some aspect of the executing program is externally visible besides the source-code-specified high-level behavior, we say that the abstraction is leaking. The point of a compiler stack is to minimize that leakage: to produce a system which yields the specified high-level input-output behavior, with minimal side-channel interaction with the world otherwise.
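A toy illustration of what leakage means here: two hypothetical implementations of the same script agree on the specified output but differ in a low-level detail. The step count below is just a stand-in for things like timing, power draw, or voltages. Neither function corresponds to anything a real compiler stack does.

```python
def run_direct(x):
    """Implementation A: one multiply and one add."""
    steps = 2  # low-level detail: number of arithmetic operations
    return str(5 * x + 3), steps

def run_by_repeated_addition(x):
    """Implementation B: builds 5*x by adding x five times, then adds 3."""
    total, steps = 0, 0
    for _ in range(5):
        total += x
        steps += 1
    total += 3
    steps += 1
    return str(total), steps

out_a, steps_a = run_direct(12)
out_b, steps_b = run_by_repeated_addition(12)

print(out_a, out_b)      # the specified, externally visible behavior
print(steps_a, steps_b)  # low-level detail that should stay hidden
```

Both implementations print 63, so from “far away” they are the same program. The step counts differ (2 versus 6). If an outside observer could measure them, say through timing, that would be a side channel, i.e. a leak in the abstraction.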

Carte­sian Boundary

Ap­ply­ing this to agents is triv­ial: just imag­ine the source code speci­fies an agenty AI.

More gen­er­ally, to model a real-world sys­tem as an agent with a Carte­sian bound­ary, we can break the world into:

  • The sys­tem it­self (the “agent”)

  • Things “far away” from the sys­tem (the “en­vi­ron­ment”)

  • Noisy vari­ables in­ter­me­di­at­ing in­ter­ac­tions be­tween the two, which we ig­nore (i.e. in­te­grate out of the model)

The noisy in­ter­me­di­ates wipe out most of the low-level in­for­ma­tion about the two com­po­nents, so only some high-level sum­mary data about the agent is rele­vant to the en­vi­ron­ment, and only some high-level sum­mary data about the en­vi­ron­ment is rele­vant to the agent. We call the agent’s high-level sum­mary data “ac­tions”, and we call the en­vi­ron­ment’s high-level sum­mary data “ob­ser­va­tions”.

Be­cause of the noisy in­ter­me­di­ates, the agent’s in­com­ing data may not perfectly con­vey the en­vi­ron­ment’s high-level sum­mary.
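Here is a rough sketch of noisy intermediates wiping out low-level detail. Everything in it is an assumption made for illustration: the environment is a list of bits, its high-level summary is the fraction of ones, and the “channel” delivers a random sample of bits with some of them flipped. Two environments with the same summary but different low-level layouts should look the same to the agent, and in neither case does the incoming data pin down the summary exactly.

```python
import random

def noisy_channel(micro_state, k, flip_prob, rng):
    """Incoming data: k randomly sampled micro-variables, each flipped with flip_prob."""
    data = []
    for _ in range(k):
        bit = rng.choice(micro_state)
        if rng.random() < flip_prob:
            bit = 1 - bit
        data.append(bit)
    return data

def estimate_summary(data, flip_prob):
    """Agent's estimate of the fraction of ones, correcting for the known flip rate."""
    raw = sum(data) / len(data)
    return (raw - flip_prob) / (1 - 2 * flip_prob)

n = 10_000
env_a = [1] * 3000 + [0] * 7000   # ones bunched at the front
env_b = env_a[:]
random.Random(1).shuffle(env_b)   # same summary, different low-level layout

summary = sum(env_a) / n          # true high-level summary, identical for both

est_a = estimate_summary(noisy_channel(env_a, 5000, 0.1, random.Random(2)), 0.1)
est_b = estimate_summary(noisy_channel(env_b, 5000, 0.1, random.Random(3)), 0.1)
print(summary, round(est_a, 3), round(est_b, 3))
```

The two estimates should land near the true summary and near each other, but not exactly on it. The layout of the bits never reaches the agent at all; only something close to the summary does.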


This is not identical to the usual notion of agency used in decision theory/game theory. I’ll talk about a couple of differences, but these ideas are still relatively fresh, so there are probably many more.

Im­perfect Knowl­edge of Inputs

One po­ten­tially-sur­pris­ing im­pli­ca­tion of this model: the “in­puts” are not the raw data re­ceived by the agent; they’re per­haps more prop­erly thought of as “out­puts of the en­vi­ron­ment” than “in­puts of the agent”. In the tra­di­tional Carte­sian bound­ary model, these co­in­cide—i.e. the tra­di­tional model is an ex­act ab­strac­tion, with­out any noisy in­ter­me­di­ates be­tween agent and en­vi­ron­ment. But in this model, the “chan­nel” be­tween en­vi­ron­ment-out­puts and agent-in­com­ing-data is noisy and po­ten­tially quite com­pli­cated in its own right.

The up­shot is that, un­der this model, agents may not have perfect knowl­edge of their own in­puts.
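As a minimal sketch of what “imperfect knowledge of inputs” could look like, suppose the environment summary is a single bit (say, “the light is on”) and each piece of incoming data reports it wrongly with some probability. The agent then holds only a posterior over its own “input”. This is plain Bayesian updating with made-up numbers, used as an illustration. It is not a claim about how any particular agent works.

```python
def posterior_on(readings, prior_on=0.5, flip_prob=0.2):
    """P(summary = 1 | readings), where each reading is wrong with probability flip_prob."""
    like_on, like_off = prior_on, 1.0 - prior_on
    for r in readings:
        like_on *= (1 - flip_prob) if r == 1 else flip_prob
        like_off *= flip_prob if r == 1 else (1 - flip_prob)
    return like_on / (like_on + like_off)

p_one_reading = posterior_on([1])
p_four_readings = posterior_on([1, 1, 0, 1])
print(p_one_reading, p_four_readings)
```

With these numbers, a single reading of 1 gives a posterior of 0.8, and the four readings give 16/17, roughly 0.94. More incoming data sharpens the agent’s belief about the input, but for any finite number of readings the belief stays short of certainty.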

This idea isn’t en­tirely novel. Abram has writ­ten be­fore about agents be­ing un­cer­tain about their own in­puts, and cited some pre­vi­ous liter­a­ture on the topic, un­der the name “rad­i­cal prob­a­bil­ism”.

Note that we could, in prin­ci­ple, still define “in­puts” in the more tra­di­tional way, as the agent’s in­com­ing data. But then we don’t have as clean a fac­tor­iza­tion of the model; there is no nat­u­ral high-level ob­ject cor­re­spond­ing to the agent’s in­com­ing data. Alter­na­tively, we could treat the noisy in­ter­me­di­ates as them­selves part of the en­vi­ron­ment, but this con­flicts with the ab­sence of hard bound­aries around real-world agenty sys­tems.

Flex­ible Boundaries

I’ve said be­fore:

I ex­pect the “bound­ary” of an agent to be fuzzy on the “in­puts” side, and less fuzzy but still flex­ible on the “out­puts” side. On the in­puts side, there’s a whole chain of cause-and-effect which feeds data into my brain, and there’s some free­dom in whether to con­sider “me” to be­gin at e.g. the eye, or the pho­tore­cep­tor, or the op­tic nerve, or… On the out­puts side, there’s a clearer crite­rion for what’s “me”: it’s what­ever things I’m “choos­ing” when I op­ti­mize, i.e. any­thing I as­sume I con­trol for plan­ning pur­poses. That’s a sharper crite­rion, but it still leaves a lot of flex­i­bil­ity—e.g. I can con­sider my car a part of “me” while I’m driv­ing it.

The Carte­sian-bound­ary-as-ab­strac­tion-bound­ary model fits these ideas quite neatly.

(Po­ten­tial point of con­fu­sion: that quote uses “in­put/​out­put” in the usual Carte­sian bound­ary sense of the words, not in the man­ner used in this post. I’ll use the phrase “in­com­ing data” to dis­t­in­guish what-data-reaches-the-agent from the en­vi­ron­ment-sum­mary-data.)

We model the agent as us­ing the agent-en­vi­ron­ment ab­strac­tion to model it­self. It has prefer­ences over the en­vi­ron­ment-sum­mary (the “in­puts”), and op­ti­mizes the agent-sum­mary (the “out­puts”). But ex­actly what chunk-of-the-world is called “the agent” is flex­ible—if I draw the bound­ary right out­side of my eye/​my pho­tore­cep­tors/​my op­tic nerve, all three of those carry mostly-similar in­for­ma­tion about far-away things, and all three will likely im­ply very similar high-level sum­maries of the en­vi­ron­ment. So the over­all model is rel­a­tively in­sen­si­tive to where the bound­ary is drawn on the in­com­ing-data side.
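A small sketch of that insensitivity, under heavily simplified assumptions: the environment summary is a single brightness value, and the eye, photoreceptors and optic nerve form a chain in which each stage passes along the previous stage’s signal with a little extra noise. The stage names, noise levels and sample counts are illustrative only.

```python
import random

rng = random.Random(0)
brightness = 0.7  # environment summary (hypothetical units)

# Each stage is a slightly noisier copy of the one before it.
eye = [brightness + rng.gauss(0, 0.05) for _ in range(2000)]
photoreceptor = [v + rng.gauss(0, 0.01) for v in eye]
optic_nerve = [v + rng.gauss(0, 0.01) for v in photoreceptor]

def estimate(signal):
    """Estimate the environment summary from the data crossing a given boundary."""
    return sum(signal) / len(signal)

estimates = {
    "eye": estimate(eye),
    "photoreceptor": estimate(photoreceptor),
    "optic_nerve": estimate(optic_nerve),
}
for name, value in estimates.items():
    print(f"{name}: {value:.4f}")

spread = max(estimates.values()) - min(estimates.values())
```

Wherever the boundary is drawn along this chain, the implied environment summary should come out nearly the same, even though the raw incoming data differs from stage to stage.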

But the out­put side is differ­ent. That’s not be­cause the model is more sen­si­tive to the bound­ary on the out­put side, but be­cause the out­put it­self is an ob­ject in the high-level model, while in­com­ing data is not; the high-level model con­tains the en­vi­ron­ment-sum­mary and the out­puts, but not the agent’s in­com­ing data.

In other words: the high-level agent-en­vi­ron­ment model is rel­a­tively in­sen­si­tive to cer­tain changes in where the bound­ary is drawn. As a re­sult, ob­jects which live in the low-level model (e.g. the agent’s in­com­ing data) lack a sharp nat­u­ral defi­ni­tion, whereas ob­jects which live in the high-level model (e.g. the “in­puts”=en­vi­ron­ment sum­mary and “out­puts”=agent sum­mary) are more sharply defined.

This goes back to the ba­sic prob­lem of defin­ing “in­puts”: there is no high-level ob­ject cor­re­spond­ing to the agent’s in­com­ing data. That in­com­ing data lives only in the low-level model. The high-level model just con­tains the en­vi­ron­ment-sum­mary.