# Distributed Cooperation

Reflec­tive or­a­cles can be ap­prox­i­mated by com­put­ing Nash equil­ibria. But is there some pro­ce­dure that pro­duces a Pareto-op­ti­mal equil­ibrium in a game, aka, a point pro­duced by a Co­op­er­a­tive or­a­cle? It turns out there is. There are some in­ter­est­ing philo­soph­i­cal as­pects to it, which will be typed up in the next post.

The re­sult is not origi­nal to me, it’s been float­ing around MIRI for a while. I think Scott, Sam, and Abram worked on it, but there might have been oth­ers. All I did was for­mal­ize it a bit, and gen­er­al­ize from the 2-player 2-move case to the n-player n-move case. With the for­mal­ism here, it’s a bit hard to in­tu­itively un­der­stand what’s go­ing on, so I’ll in­di­cate where to vi­su­al­ize an ap­pro­pri­ate 3-di­men­sional ob­ject.

## Defi­ni­tions:

First, we will define cell sets and cell prod­ucts.

# Cell Set and Cell Product:

A fam­ily of sets is a cell set of some con­vex poly­tope in finite-di­men­sional eu­clidean space, iff all el­e­ments of the fam­ily are com­pact con­vex poly­topes, the union of the fam­ily equals the origi­nal con­vex poly­tope, and for all pairs of el­e­ments of the fam­ily, the in­ter­sec­tion has Lebesgue mea­sure 0.

More in­tu­itively, a cell set is just a set of cells where they perfectly cover some space you want, the cells only over­lap at bound­aries, and the cells are all finitely big, con­vex, and in­clude their bound­aries (so a point may be part of mul­ti­ple cells if it’s on a bound­ary, that’s in­ten­tional). If you make a big cube out of a bunch of lit­tle cubes ar­ranged in a space-filling hon­ey­comb pat­tern, the set of all those lit­tle cubes is a cell set of the big cube.

Now the cell product will be defined. Fix two cell sets, and . In­dex their el­e­ments with the in­dex sets and . Let be the cell in with the in­dex . Define the cell product as fol­lows:

This fam­ily of sets has the fol­low­ing prop­er­ties:

1: All el­e­ments are com­pact con­vex poly­topes, be­cause the product of two con­vex com­pact poly­topes is a con­vex and com­pact poly­tope.

2: Given a point in , the pro­jec­tion of that point to lies in­side some cell in , and the same goes for . There­fore, the point lies in the cell that is the product of the cells in and it was pro­jected into. Similarly, given some ar­bi­trary point in an ar­bi­trary cell of the product, that cell is as­so­ci­ated with a pair of in­dices, so the pro­jec­tion of the point to the space con­tain­ing lies in the cell as­so­ci­ated with the first in­dex, so it lies in . A similar ar­gu­ment ap­plies for . There­fore, the union of all cells in the cell product of and equals the space .

3: The in­ter­sec­tion of two cells with in­dices and is the product of the in­ter­sec­tion of cell and cell , and the in­ter­sec­tion of cell and cell . All these in­ter­sec­tions have Lebesgue mea­sure 0, so the in­ter­sec­tion of those two cells must also have Lebesgue mea­sure 0.

There­fore, the cell product of any two cell sets and is a cell set of .

Now, let’s vi­su­al­ize a con­crete ex­am­ple of a cell product. Let be the unit side length tri­an­gle, and be the set of -length tri­an­gles in the tri­an­gu­lar tes­sel­la­tion of . Let be the unit length line seg­ment, and be the set of -length line seg­ments that go end-to-end to cover the unit length line seg­ment. The set would be a unit side length tri­an­gu­lar prism.

The cell product would be the set of lit­tle tri­an­gu­lar prisms of side length . Th­ese lit­tle cells com­pletely fill the big tri­an­gu­lar prism, each one of them is the product of a cell from and a cell from , and all pairs of in­dices from the in­dex sets cor­re­spond to a unique lit­tle tri­an­gu­lar prism.

Given a se­quence of cell sets, let be the cell product of the se­quence.

# Game The­ory Defi­ni­tions, With Cells:

is the prob­a­bil­ity dis­tri­bu­tion over pos­si­ble moves for player . is the prob­a­bil­ity of player play­ing move

is the space of strate­gies for the finite game, a product of prob­a­bil­ity dis­tri­bu­tions over finitely many moves for each player.

is some small finite num­ber of the form where x is a large in­te­ger.

Given some prob­a­bil­ity dis­tri­bu­tion , di­vide it into cells as fol­lows:

Slice it along the lines/​planes/​spaces where , and the lines/​planes/​spaces where , and so on. As a sim­ple ex­am­ple, in the 2-move case, this is split­ting a line seg­ment into smaller line seg­ments, in the 3-move case, this is split­ting a tri­an­gle into smaller tri­an­gles via a tri­an­gu­lar tiling, and in the 4-move case, this is split­ting a tetra­he­dron into tetra­he­dra and oc­ta­he­dra via the tetra­he­dral-oc­ta­he­dral hon­ey­comb.

is the re­sult­ing cell set of . Let be the in­dex set for some in­dex­ing of the cells in .

is defined in the ob­vi­ous way, as the cell product of the cell sets for each player, which is also a canon­i­cal cell set for .

Let . This is the in­dex set for . An el­e­ment of this set, , speci­fies a par­tic­u­lar cell . Let be the ’th el­e­ment of the se­quence . This is an el­e­ment of and speci­fies a par­tic­u­lar cell .

Given a point , let be the ’th com­po­nent of the point.

A Kaku­tani strat­egy for player is a func­tion that is nonempty, con­vex, and has closed graph, for all in­puts.

If all play­ers are us­ing a Kaku­tani strat­egy, then the func­tion where is nonempty, con­vex, and has closed graph, for all in­puts, and by Kaku­tani’s fixed point the­o­rem, there are some fixed points in .

Given a Kaku­tani strat­egy for player , let the func­tion be the func­tion such that:

This could be thought of as the func­tion where ev­ery­one else has locked in their move, and only player may re­spond with some other move.

A cell is ruled out by player if there is no point in the cell such that . Pretty much, no mat­ter the point in the cell, if ev­ery­one else tries to play it, player will go “nope, not play­ing this point”. If a cell is ruled out by some player, the fixed point of can­not lie within that cell.

Given some util­ity func­tion for each player over the out­come of the game, is the util­ity vec­tor made of the util­ity each player at­tains, as­sum­ing the cen­ter point of the cell is played in the game. Yes, not all points in at­tain the same util­ity, but for a suffi­ciently small , the cell is re­ally small, and for any player, the util­ity over the cell is well-ap­prox­i­mated by the util­ity at­tained at the mid­dle point in the cell. is the util­ity that the ’th player at­tains.

Every player also has a rank­ing of cells, that works as fol­lows:

In short, the rank­ing is or­dered by player ’s util­ity func­tion first and fore­most.

Also, if is a Pareto im­prove­ment on , then

.

This can be ex­tended to a rank­ing of a sub­set of cells in the ob­vi­ous way, by just re­mov­ing all cells out­side of the sub­set.

## The Pro­ce­dure:

The fol­low­ing game is played: There is a set of ruled out cells in . A ran­dom num­ber gen­er­a­tor is con­sulted, and used to as­sign ev­ery player a set of ruled out cells that they are re­spon­si­ble for. The play­ers may mod­ify their strat­egy to what­ever they want, sub­ject to three con­straints.

1: Their new strat­egy must be a Kaku­tani func­tion.

2: Their new strat­egy must rule out all cells that they are re­spon­si­ble for.

3: The set of cells that haven’t been ruled out yet, that their new strat­egy rules out, must be an ini­tial seg­ment of the rank­ing made by ap­ply­ing to the set of cells that aren’t ruled out.

If there is a player who can­not mod­ify their strat­egy ac­cord­ingly, or if no ad­di­tional cells can be re­moved, then the game re­peats with a new ran­dom as­sign­ment, and only ends if one cell is left.

The start­ing set of ruled out cells is .

In­tu­itively, the game is some­thing like “ev­ery­one sorts the list of out­comes by their own util­ity first, and by ev­ery­one else’s sec­ond, and then you loop through game rounds where ev­ery­one has to im­ple­ment a re­al­is­tic strat­egy that rules out some strate­gies that some­one else has vowed to rule out first, and also rules out per­son­ally ab­hor­rent points if it’s pos­si­ble, un­til it’s nar­rowed down and ev­ery­one agrees which cell to play”.

This isn’t quite as co­op­er­a­tive as it seems. Sort­ing equiv­a­lent out­comes by Pareto-op­ti­mal­ity for ev­ery­one else isn’t mo­ti­vated by purely self­ish in­ter­est, but it’s at least com­pat­i­ble with it. Every­one can simu­late what strat­egy ev­ery­one else picked, so ev­ery­one knows which cells have got­ten ruled out. And tak­ing re­spon­si­bil­ity for rul­ing out some­one else’s cells isn’t re­ally an im­po­si­tion, be­cause if the game ended right there, those cells couldn’t pos­si­bly have been played any­ways.

If this pro­cess even­tu­ally con­verges to one cell re­main­ing, then the cell must be on the Pareto fron­tier of the set of cells. The ar­gu­ment is very sim­ple. If there was a Pareto im­prove­ment on the fi­nal cell, then it would have had a higher rank than the fi­nal cell in ev­ery­body’s rank­ing, so it can’t pos­si­bly have been elimi­nated.

The hard part is show­ing that this pro­cess even­tu­ally con­verges to one cell re­main­ing and doesn’t get stuck. It will be proved by as­sum­ing there are at least 2 cells that have not been ruled out, and show­ing that there ex­ists an as­sign­ment of cells to play­ers that re­sult in at least one more cell get­ting ruled out, so the game can always progress.

## Proof of Above-Men­tioned Hard Part:

Let be the set of ruled-out cells.

As­sum­ing there is some cell be­ing used as a backup cell:

Let the do­main of player , be the set of cells

By the way the cell product was defined, both of these sets are sub­sets of . To vi­su­al­ize it, con­sult this image of a cube in Minecraft. The glow­stone is the backup cell, the green is the do­main of player 3, the black is the do­main of player 2, and the pur­ple is the do­main of player 1. This can be ver­ified by plug­ging into the defi­ni­tion of do­main. The first half of the do­main speci­fies all the cells in the cube, and the sec­ond half speci­fies the plane of cells where their third co­or­di­nates equal the third co­or­di­nate of the cube, . Thus, all the cells that don’t share the same plane as the backup point are green/​in the do­main of player 3. For , this equa­tion speci­fies the plane of black cells, minus the column of pur­ple cells. And so on.

Let the ruled out cells for player be , the ruled out cells for player be , and so on.

Fix some . Con­sider the func­tion that takes a point as in­put, mea­sures the dis­tance to the near­est cell in , gets the point , and plays the point that is dis­tance along the vec­tor be­tween and the cen­ter point of .

This func­tion is nonempty for all in­puts. It is con­vex for all in­puts be­cause it plays a set con­sist­ing of a sin­gle point. It has closed graph be­cause it is con­tin­u­ous. There­fore, it is a Kaku­tani func­tion. The cen­ter point of picks out the se­ries of cells where the ’th cell of the product that defines them is , so the point that this func­tion maps to­wards is out­side . Fur­ther­more, for all points in the set of cells , , so the point is moved 1 unit away from its start­ing point, so all these cells are ruled out. Fi­nally, for all cells out­side of , se­lect a point on the in­te­rior of the cell that is more than away from any cells that must be ruled out. Then this func­tion doesn’t change any co­or­di­nates of that point, so has a fixed point in that cell, so this cell has not been ruled out. This ar­gu­ment ap­plies for all .

There­fore, con­di­tions 1 and 2 are fulfilled by this as­sign­ment of cells. To go back to the Minecraft cube, if you pick out some sub­set of the green glass cubes as ones to be pre­vented, all points in or re­ally close to that sub­set would get mapped to­wards the plane of black glass cubes by player 3. Similarly, if you pick out some sub­set of the black glass cubes as ones to be pre­vented by player 2, player 2 would map all points on or re­ally close to those cubes to­wards the column of pur­ple glass cubes. Of course, the play­ers don’t have to use this ex­act Kaku­tani func­tion, it just proves that for ev­ery player, there is some Kaku­tani func­tion that lets them fulfill their obli­ga­tions.

Fi­nally, we will show that if there are at least 2 cells re­main­ing, there is always a pair of cells and such that is min­i­mally ranked for some player , and when is fixed as a backup cell, (which means it can be elimi­nated by player in the same Kaku­tani way as de­tailed be­fore, finish­ing up the proof by show­ing that a cell gets elimi­nated)

Take player , and let be their least-fa­vored cell. If there ex­ists some other cell that isn’t ruled out, and (ie, the de­tested cell and the backup cell have differ­ent fi­nal in­dices), then it can be the backup cell. Be­cause both these cells have differ­ent fi­nal in­dices, .

The quick vi­sual ver­sion of this is that if you delete some blocks out of the cube in the Minecraft image (these block holes cor­re­spond to cells that aren’t ruled out yet), then if the most-hated block hole has a differ­ent player 3 co­or­di­nate than some other block hole, then you can plug the other block hole with the glow­ing block, and the re­sult­ing black plane won’t con­tain the most-hated block hole, and the most-hated block hole will be in the green (which means it can be safely elimi­nated, just adopt a kaku­tani func­tion that maps ev­ery­thing in or near that cell to the black plane).

But what if all the cells re­main­ing/​holes in the Minecraft cube have the same fi­nal in­dex in /​lie in the same plane? This makes it so player doesn’t have a backup cell that lets them elimi­nate their least fa­vorite one.

Con­sider player . By the same ar­gu­ment as be­fore, player can elimi­nate their most de­tested cell, un­less all the non-elimi­nated cells have both the same player in­dex, and the same player in­dex.

Con­sider player , and so on. Either there is some player that can get a crack at elimi­nat­ing their least fa­vorite cell if the ran­dom num­ber gen­er­a­tor as­signs things right, or the se­quence of in­dices that in­dexes both cells is the ex­act same. That would mean these two dis­tinct cells are the same cell, which is im­pos­si­ble.

• Cool! I’m happy to see this writ­ten up fi­nally. It’s been a big source of in­tu­itions for me, so it’s good to see that the proof checks out.

A pos­si­ble next step to all this is to try to spec­ify proof-based DT agents which could play this game (or some­thing similar) based on Löbian hand­shakes. (In fact, part of the origi­nal mo­ti­va­tion was to try to bring the co­op­er­a­tive-or­a­cle model closer to the Löb-based co­op­er­a­tion you can get in pris­oner’s dilemma with visi­ble source code.)

It’s un­for­tu­nate that you had to add the pareto-im­prove­ment con­di­tion to the cell rank. That part seems es­pe­cially un­likely to drop out of a more gen­eral de­ci­sion the­ory.

I think I see an­other se­ri­ous com­pli­ca­tion:

Yes, not all points in at­tain the same util­ity, but for a suffi­ciently small ϵ, the cell is re­ally small, and for any player, the util­ity over the cell is well-ap­prox­i­mated by the util­ity at­tained at the mid­dle point in the cell.

So, for any de­sired util­ity-ap­prox­i­ma­tion ac­cu­racy , you can choose suffi­ciently small to achieve it. But, a pareto-op­tima in the set of mid­dle points can be ar­bi­trar­ily worse for some player than any pareto-op­tima of the full game; IE, tak­ing the mid­points can hide ar­bi­trar­ily large pareto im­prove­ments.

For ex­am­ple, sup­pose that =0.001. A pareto op­tima of the mid­points might give the util­ity vec­tor (2, 2, 3) for the three play­ers. There could be an­other mid­point (100, 100, 2.9999999), very near where the true game con­tains a point (100, 100, 3).

So, it seems the pareto op­ti­mum of the game on mid­points which is found by the pro­cess in the post can be ar­bi­trar­ily sub-op­ti­mal for all but one player, with no guaran­tee that this gets bet­ter as shrinks.

• If you drop the Pareto-im­prove­ment con­di­tion from the cell rank, and just have “ev­ery­one sorts things by their own util­ity”, then you won’t nec­es­sar­ily get a Pareto-op­ti­mal out­come (within the set of cell cen­ter-points), but you will at least get a point where there are no strict Pareto im­prove­ments (no points that leave ev­ery­one bet­ter off).

The differ­ence be­tween the two is… let’s say we’ve got a 2-player 2-move game that in util­ity-space, makes some sort of quadrilat­eral. If the top and right edges join at 90 de­grees, the Pareto-fron­tier would be the point on the cor­ner, and the set of “no strict Pareto im­prove­ments” would be the top and the right edges.

If that cor­ner is ob­tuse, then both “Pareto fron­tier” and “no strict Pareto im­prove­ments” agree that both line edges are within the set, and if the cor­ner is acute, then both “Pareto fron­tier” and “no strict Pareto im­prove­ments” agree that only the cor­ner is within the set. It ac­tu­ally isn’t much of a differ­ence, it only man­i­fests when the util­ities for a player are ex­actly equal, and is eas­ily changed by a lit­tle bit of noise.

The util­ity-ap­prox­i­ma­tion is­sue you pointed out seems to be point­ing to­wards the im­pos­si­bil­ity of guaran­tee­ing limit­ing to a point on the Pareto fron­tier (when you make the cell size smaller and smaller), pre­cisely be­cause of that “this set is un­sta­ble un­der ar­bi­trar­ily small noise” is­sue.

But, the “set of all points that have no strict Pareto im­prove­ments by more than for all play­ers”, ie, the -fuzzed ver­sion of “set of points with no strict pareto im­prove­ment”, does seem to be ro­bust against a lit­tle bit of noise, and doesn’t re­quire the Pareto-im­prove­ment con­di­tion on ev­ery­one’s rank­ing of cells.

So I’m think­ing that if that’s all we can at­tain (be­cause of the com­pli­ca­tion you pointed out), then it lets us drop that in­el­e­gant Pareto-im­prove­ment con­di­tion.

I’ll work on the proof that for suffi­ciently small cell size , you can get an out­come within of the set of “no strict Pareto im­prove­ments available”

Nice job spot­ting that flaw.