Botworld: a cellular automaton for studying self-modifying agents embedded in their environment

On April 1, I started working full-time for MIRI. In the weeks prior, while I was winding down my job and packing up my things, Benja and I built Botworld, a cellular automaton that we’ve been using to help us study self-modifying agents. Today, we’re publicly releasing Botworld on the new MIRI github page. To give you a feel for Botworld, I’ve reproduced the beginning of the technical report below.


This report introduces Botworld, a cellular automaton that provides a toy environment for studying self-modifying agents.

The traditional agent framework, used for example in Markov Decision Processes and in Marcus Hutter’s universal agent AIXI, splits the universe into an agent and an environment, which interact only via discrete input and output channels.

Such formalisms are perhaps ill-suited for real self-modifying agents, which are embedded within their environments. Indeed, the agent/environment separation is somewhat reminiscent of Cartesian dualism: any agent using this framework to reason about the world does not model itself as part of its environment. For example, such an agent would be unable to understand the concept of the environment interfering with its internal computations, e.g. by inducing errors in the agent’s RAM through heat.

Intuitively, this separation does not seem to be a fatal flaw, but merely a tool for simplifying the discussion. We should be able to remove this “Cartesian” assumption from formal models of intelligence. However, the concrete non-Cartesian models that have been proposed (such as Orseau and Ring’s formalism for space-time embedded intelligence, Vladimir Slepnev’s models of updateless decision theory, and Yudkowsky and Herreshoff’s tiling agents) depart significantly from their Cartesian counterparts.

Botworld is a toy example of the type of universe that these formalisms are designed to reason about: it provides a concrete world containing agents (“robots”) whose internal computations are a part of the environment, and allows us to study what happens when the Cartesian barrier between an agent and its environment breaks down. Botworld allows us to write decision problems where the Cartesian barrier is relevant, program actual agents, and run the system.

As it turns out, many interesting problems arise when agents are embedded in their environment. For example, agents whose source code is readable may be subjected to Newcomb-like problems by entities that simulate the agent and choose their actions accordingly.

Furthermore, certain obstacles to self-reference arise when non-Cartesian agents attempt to achieve confidence in their future actions. Some of these issues are raised by Yudkowsky and Herreshoff; Botworld gives us a concrete environment in which we can examine them.

One of the primary benefits of Botworld is concreteness: when working with abstract problems of self-reference, it is often very useful to see a concrete decision problem (“game”) in a fully specified world that directly exhibits the obstacle under consideration. Botworld makes it easier to visualize these obstacles.

Conversely, Botworld also makes it easier to visualize suggested agent architectures, which in turn makes it easier to visualize potential problems and probe the architecture for edge cases.

Finally, Botworld is a tool for communicating. It is our hope that Botworld will help others understand the varying formalisms for self-modifying agents by giving them a concrete way to visualize such architectures being implemented. Furthermore, Botworld gives us a concrete way to illustrate various obstacles, by implementing Botworld games in which the obstacles arise.

Botworld has helped us gain a deeper understanding of varying formalisms for self-modifying agents and the obstacles they face. It is our hope that Botworld will help others more concretely understand these issues as well.

Overview

Botworld is a high-level cellular automaton: the contents of each cell can be quite complex. Indeed, cells may house robots with register machines, which are run for a fixed amount of time in each cellular automaton step. A brief overview of the cellular automaton follows. Afterwards, we will present the details along with a full implementation in Haskell.

Botworld consists of a grid of cells, each of which is either a square or an impassable wall. Each square may contain an arbitrary number of robots and items. Robots can navigate the grid and possess tools for manipulating items. Some items are quite useful: for example, shields can protect robots from attacks by other robots. Other items are intrinsically valuable, though the values of various items depend upon the game being played.

Among the items are robot parts, which the robots can use to construct other robots. Robots may also be broken down into their component parts (hence the necessity for shields). Thus, robots in Botworld are quite versatile: a well-programmed robot can reassemble its enemies into allies or construct a robot horde.
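To make this concrete, here is a minimal sketch of how such a world state might be represented in Haskell. The type and field names below are our own illustrative assumptions, not the definitions from the actual implementation (which appears later in the report).

```haskell
-- Illustrative sketch only: these type and field names are assumptions,
-- not the definitions used in the actual Botworld implementation.
type Position = (Int, Int)

-- A cell is either an impassable wall or a square holding robots and items.
data Cell = Wall | Square [Robot] [Item]

-- Items include shields, intrinsically valuable objects, and robot parts.
data Item = Shield | Treasure Int | Part RobotPart
  deriving (Show, Eq)

data RobotPart = ProcessorPart | FramePart
  deriving (Show, Eq)

-- A robot carries items and houses a register machine.
data Robot = Robot
  { inventory :: [Item]
  , machine   :: RegisterMachine
  }

-- Placeholder for a robot's internal register machine state.
data RegisterMachine = RegisterMachine

-- The world maps positions to cells (Nothing for positions off the grid).
type Grid = Position -> Maybe Cell
```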

Because robots are transient objects, it is important to note that players are not robots. Many games begin by allowing each player to specify the initial state of a single robot, but clever players will write programs that soon distribute themselves across many robots or construct fleets of allied robots. Thus, Botworld games are not scored according to the actions of any particular robot. Instead, each player is assigned a home square (or squares), and Botworld games are scored according to the items carried by all robots that are in the player’s home square at the end of the game. (We may imagine these robots being airlifted and the items in their possession being given to the player.)
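Continuing the illustrative types above, end-of-game scoring might be sketched like this, with itemValue standing in for a game-specific valuation; all names here are assumed, not part of the real implementation.

```haskell
-- Sketch of end-of-game scoring, continuing the illustrative types above.
-- 'itemValue' stands in for a game-specific valuation; all names are assumed.
type Score = Int

itemValue :: Item -> Score
itemValue (Treasure v) = v   -- here, treasures simply carry their own value
itemValue _            = 0   -- shields and parts score nothing in this toy valuation

-- A player's score: the total value carried by all robots sitting in the
-- player's home square when the game ends.
scorePlayer :: Grid -> Position -> Score
scorePlayer grid home =
  case grid home of
    Just (Square robots _) -> sum [ itemValue i | r <- robots, i <- inventory r ]
    _                      -> 0
```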

Robots cannot see the contents of other robots’ register machines by default, though a robot can execute an inspection to see the precise state of another robot’s register machine. This is one way in which the Cartesian boundary can break down: it may not be enough to choose an optimal action if the way in which that action is computed can matter.

For example, imagine a robot which tries to execute an action that it can prove will achieve a certain minimum expected utility u_min. In the traditional agent framework, this can imply an optimality property: if there is any program p our robot could have run such that our robot can prove that p would have received expected utility ≥ u_min, then our robot will receive expected utility ≥ u_min (because it can always do what that other program would have done). But suppose that this robot is placed into an environment where another robot reads the contents of the first robot’s register machine, and gives the first robot a reward if and only if the first robot runs the program “do nothing ever”. Then, since this is not the program our robot runs, it will not receive the reward.
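Stated a little more formally (in our own notation, not the report’s), with U the robot’s utility and Prov provability in the robot’s proof system, the Cartesian optimality property is roughly

$$\exists p.\ \mathrm{Prov}\big(\mathbb{E}[U \mid \text{run } p] \ge u_{\min}\big) \;\Longrightarrow\; \mathbb{E}[U \mid \text{run our robot}] \ge u_{\min},$$

since our robot can always fall back on doing whatever p would have done. The register-reading robot above breaks exactly this implication: the reward depends on which program is being run, not merely on which actions it outputs.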

It is important to note that there are two different notions of time in Botworld. The cellular automaton evolution proceeds in discrete steps according to the rules described below. During each cellular automaton step, the machines inside the robots are run for some finite number of ticks.

Like any cellular automaton, Botworld updates in discrete steps which apply to every cell. Each cell is updated using only information from the cell and its immediate neighbors. Roughly speaking, the step function proceeds in the following manner for each individual square (a sketch in Haskell follows the list):

1. The output register of the register machine of each robot in the square is read to determine the robot’s command. Note that robots are expected to be initialized with their first command in the output register.
2. The commands are used in aggregate to determine the robot actions. This involves checking for conflicts and invalid commands.
3. The list of items lying around in the square is updated according to the robot actions. Items that have been lifted or used to create robots are removed, and items that have been dropped are added.
4. Robots incoming from neighboring squares are added to the robot list.
5. Newly created robots are added to the robot list.
6. The input registers are set on all robots. Robot input includes a list of all robots in the square (including exiting, entering, destroyed, and created robots), the actions that each robot took, and the updated item list.
7. Robots that have exited the square or that have been destroyed are removed from the robot list.
8. All remaining robots have their register machines executed (and are expected to leave a command in the output register).
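Continuing with the same illustrative types, the step function for a single square might be sketched as follows. Every helper here is an assumed stub, and the sketch glosses over conflict resolution and the exact handling of destruction, but it shows the ordering described above.

```haskell
-- Sketch of the per-square step, continuing the illustrative types above.
-- Every helper below is an assumed stub (left undefined): the point is only
-- to show the order of operations, not Botworld's actual API.
data Command = Command   -- placeholder for a robot command
data Action  = Action    -- placeholder for a resolved robot action

stepSquare :: [Robot] -> [Item] -> [Robot] -> ([Robot], [Item])
stepSquare robots items incoming = (stepped, items')
  where
    commands  = map readOutput robots                  -- 1. read output registers
    actions   = resolve commands robots items          -- 2. commands -> actions (conflicts, validity)
    items'    = updateItems actions items              -- 3. update the item list
    created   = createdRobots actions                  -- 5. newly built robots
    everyone  = robots ++ incoming ++ created          -- 4-5. add incoming and created robots
    informed  = map (setInput everyone actions items') everyone  -- 6. set input registers
    remaining = filter (stillPresent actions) informed -- 7. drop exited and destroyed robots
    stepped   = map runMachine remaining               -- 8. run the register machines

    -- Assumed stubs:
    readOutput    :: Robot -> Command
    readOutput    = undefined
    resolve       :: [Command] -> [Robot] -> [Item] -> [Action]
    resolve       = undefined
    updateItems   :: [Action] -> [Item] -> [Item]
    updateItems   = undefined
    createdRobots :: [Action] -> [Robot]
    createdRobots = undefined
    setInput      :: [Robot] -> [Action] -> [Item] -> Robot -> Robot
    setInput      = undefined
    stillPresent  :: [Action] -> Robot -> Bool
    stillPresent  = undefined
    runMachine    :: Robot -> Robot
    runMachine    = undefined
```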

These rules allow for a wide variety of games, from NP-hard knapsack packing games to difficult Newcomb-like games such as a variant of Parfit’s hitchhiker problem (wherein a robot will drop a valuable item only if it, after simulating your robot, concludes that your robot will give it a less valuable item in return).
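As a very rough sketch of the hitchhiker-style game (continuing the illustrative types above), the “driver” robot’s decision might look like this; simulate and value are assumed helpers, not part of Botworld.

```haskell
-- Very rough sketch of the hitchhiker-style "driver" robot's predicate,
-- continuing the illustrative types above.  'simulate' and 'value' are
-- assumed helpers, not part of Botworld.
shouldDropCargo :: Robot -> Item -> Bool
shouldDropCargo rider cargo =
  case simulate (machine rider) of
    Just payment -> value payment < value cargo  -- rider predicted to hand over something cheaper
    Nothing      -> False                        -- rider predicted to hand over nothing
  where
    simulate :: RegisterMachine -> Maybe Item    -- assumed: predict the rider's payment by simulation
    simulate = undefined
    value :: Item -> Int                         -- assumed: game-specific item valuation
    value = undefined
```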

Cartesianism in Botworld

Though we have stated that we mean to study non-Cartesian formalizations of intelligence, Botworld does in fact have a “Cartesian” boundary: the robot parts are fundamental objects, and the machine registers are not reducible. The important property of Botworld is not that it lacks a Cartesian boundary, but that the boundary is breakable.

In the real world, the execution of a computer program is unaffected by the environment most of the time (except via the normal input channels). While the contents of a computer’s RAM can be changed by heating it up with a desk lamp, they are usually not. An Artificial General Intelligence (AGI) would presumably make use of this fact. Thus, an AGI may commonly wish to ensure that its Cartesian boundary is not violated in this way over some time period (during which it can make use of the nice properties of Cartesian frameworks). Botworld attempts to model this in a simple way by requiring agents to contend with the possibility that they may be destroyed by other robots.

More problematically, in the real world, the internals of a computer program will always affect the environment, for example through waste heat emitted by the computer, but it seems likely that these effects are usually unpredictable enough that an AGI will not be able to improve its performance by carefully choosing e.g. the pattern of waste heat it emits. However, an AGI will need to ensure that these unavoidable violations of its Cartesian boundary will in fact not make an expected difference to its goals. Botworld sidesteps this issue and only requires robots to deal with a more tractable one: contending with the possibility that their source code might be read by another agent.

Our model is not realistic, but it is simple to reason about. For all that the robot machines are not reducible, the robots are still embedded in their environment, and they can still be read or destroyed by other agents. We hope that this captures some of the complexity of naturalistic agents, and that it will serve as a useful test bed for formalisms designed to deal with this complexity. Although being able to deal with the challenges of Botworld is presumably not a good indicator that a formalism will be able to deal with all of the challenges of naturalistic agents, it allows us to see in concrete terms how the formalism deals with some of them.

In creating Botworld we tried to build something implementable by a lower-level system, such as Conway’s Game of Life. It is useful to imagine such an implementation when considering Botworld games.

Future versions of Botworld may treat the robot bodies as less fundamental objects. In the meantime, we hope that it is possible to picture an implementation where the Cartesian boundary is much less fundamental, and to use Botworld to gain useful insights about agents embedded within their environment. Our intent is that when we apply a formalism for naturalistic agents to the current implementation of Botworld, there will be a straightforward translation to an application of the same formalism to an implementation of Botworld in (say) the Game of Life.


The full technical report goes on to provide an implementation of Botworld in Haskell. You can find the source code on the MIRI Botworld repository. Sample games are forthcoming.

Benja and I will be writing up some of the results we’ve achieved. In the meantime, you’re encouraged to play around with it and build something cool.