Nihiland: There is not even a coherent uni-agent theory, not to mention multi-agency. I find this world quite unlikely, but leave it here for the sake of completeness (and for the sake of the number 5). Closely related is antirealism about rationality, which I have criticized in the past. In this world it is not clear whether the alignment problem is well-posed at all.
Discordia: There is a coherent uni-agent theory, but no coherent theory of multi-agency. This world is conceivable, since the current understanding of multi-agency is much worse than the understanding of “solitary” agents. In this world, negative-sum conflicts and coordination failures are probably ubiquitous (even among arbitrarily sophisticated agents), because there is no principle of rationality that rules them out. Acausal trade is probably not a thing, or at least rare and fragile. In the context of value learning, there might be no principled way to deal with uncertainty (which could otherwise be regarded as a bargaining problem). There is also no principled solution to multi-user alignment.
Linguistica: There is a coherent theory of multi-agency, but agents are inevitably divided into “types” s.t. only interactions between agents of the same type have strong guarantees. (The name of the world is because we can metaphorically think of the types as different “languages”.) An example of how this might happen is reflective oracles, where the type corresponds to the choice of fixed point. Acausal trade probably exists[1], but is segregated by type. Alignment is complicated by the need to specify or learn the human type.
Economica: There is a coherent uni-type theory of multi-agency, but this theory involves desiderata that can only be motivated by multi-agency. Explicitly thinking about multi-agency is necessary to construct the full theory of agents[2]. In this world, the Yudkowskian hope for ubiquitous strong cooperation guarantees can be justified, and acausal trade might be very common. Figuring out the multi-agent theory, and not just the uni-agent fragment, is probably important for alignment, or at least necessary in order to avoid leaving huge gains from trade on the table.
Harmonia: There is a coherent uni-type theory of multi-agency, and this theory can be derived entirely from desiderata that can be motivated without invoking multi-agency at all. There is no special “mutli-agent sauce”: any sufficiently rational agents automatically have strong guarantees in the multi-agent setting. Explicitly understanding multi-agency is arguably still important for dealing with uncertainty in value-learning, and dealing with multi-user alignment. (And also in order to know that we are in this world.)
For simplicity, I’m ignoring what is arguably an “orthogonal” axis: to which extent the “correct” multi-agent theory implies acausal cooperation even under favorable conditions. I believe that, outside of Nihiland and Discordia, it probably does, but the alternative hypothesis is also tenable.
On the border between Linguistica and Economica, there are worlds with strong guarantees for agents of the same type and medium-strength guarantees for agents of different type (where “medium-strength” is still stronger than “achieve maximin payoff”: the latter is already guaranteed in infra-Bayesianism). This blurs the boundary, but I would consider this to be Linguistica if even slightly different types have much weaker guarantees (or if there is no useful notion of “slightly different types”) and Economica if there is continuous graceful degradation like in Yudkowsky’s subjective fairness proposal.
TLDR: A new proposal for a prescriptive theory of multi-agent rationality, based on generalizing superrationality using local symmetries.
Hofstadter’s “superrationality” postulates that a rational agent should behave as if other rational agents are similar to itself. In a symmetric game, this implies the selection of a Pareto efficient outcome. However, it’s not clear how to apply this principle to asymmetric games. I propose to solve this by suggesting a notion of “local” symmetry that is much more ubiquitous than global (ordinary) symmetry.
We will focus here entirely on finite deterministic (pure strategy) games. Generalizing this to mixed strategies and infinite games more generally is an important question left for the future.
Quotient Games
Consider a game Γ with a set P of players, for each player i∈P a non-empty set Si or pure strategies and a utility function ui:S∗→R, where S∗:=∏j∈PSj. What does it mean for the game to be “symmetric”? We propose a “soft” notion of symmetry that is only sensitive to the ordinal ordering between payoffs. While the cardinal value of the payoff will be important in the stochastic context, we plan to treat the latter by explicitly expanding the strategy space.
Given two games Γ(1), Γ(2), a premorphism from Γ(1) to Γ(2) is defined to consist of
A mapping q:P(1)→P(2)
For each i∈P(1), a mapping ¯qi:S(2)q(i)→S(1)i
Notice that these mappings induce a mapping ^q:S(2)∗→S(1)∗ via
^q(x)i:=¯qi(xq(i))
A premorphism is said to be a homomorpism if q and ¯q are s.t. that for any i∈P(1) and x,y∈S(2)∗, if uq(i)(x)≤uq(i)(y) then ui(^q(x))≤ui(^q(y)). Homomorphisms makes games into a category in a natural way.
An automorphism of Γ is a homomorphism from Γ to Γ that has an inverse homomorphism. An automorphism q of Γ is said to be flat when for any i∈P, if q(i)=i then ¯qi is the identity mapping[1]. A flat symmetry of Γ is a group G together with a group homomorphism φ:G→Aut(Γ) s.t.φ(g) is flat for all g∈G.
Given a flat symmetry (G,φ), we can apply the superrationality principle to reduce Γ to the “smaller” quotient game Γ/G. The players are P/G i.e. orbits in the original player set. Given α∈P/G, we define the set of strategies for player α in the game Γ/G to be
Observe that there is a natural mapping γ:(S/G)∗→S∗ given by
γ(x)i:=xGi(i)
Finally, the utility function of α is defined by
uα(x):=1|α|∑i∈αui(γ(x))
It is easy to see that the construction of Γ/G is functorial w.r.t. the category structure on games that we defined.
We can generalize this further by replacing game homomorphisms with game quasimorphisms: premorphisms satisfying the following relaxed condition on q and ¯q:
For any i∈P(2) and x,y∈S(1)∗, if uq(i)(x)<uq(i)(y) then ui(^q(x))≤ui(^q(y)).
This is no longer closed under composition, so no longer defines a category structure[2]. However, we can still define flat quasisymmetries and the associated quotient games, and this construction is still functorial w.r.t. the original (not “quasi”) category structure. Moreover, there is a canonical homomorphism (not just a quasimorphism) from Γ to Γ/G, even when G is a mere quasisymmetry.
The significance of quasisymmetries can be understood as follows.
The set of all games on a fixed set of players P with fixed strategy sets {Si}i∈P naturally forms a topological space[3]X. Given a group G acting on the game via invertible premorphisms, the subset of X where G is a symmetry is not closed, in general. However, the subset of X where G is a quasisymmetry is closed. I believe this will be important when generalizing the formalism to infinite games.
Coarse Graining
What if a game doesn’t have even quasisymmetries? Then, we can look for a coarse graining of the game which does have them.
Consider a game Γ. For each i∈P, let S′i be some set and fi:Si→S′i a surjection. Denote S′∗:=∏i∈PS′i. Given any x∈S′∗, we have the game Γx in which:
The set of players is P.
For any i∈P, their set of strategies is Sxi:=f−1i(xi). In particular, it implies that the set of outcomes Sx∗ satisfies Sx∗⊆S∗.
For any i∈P, their utility function is uxi:=ui|Sx∗.
If for any x∈S′∗ we choose some αx∈Sx∗ this allows us to define the coarse-grained game Γ(α) in which:
The set of players is P.
For any i∈P, the set of strategies is S′i.
For any i∈P and x∈S′∗, the utility function is u(α)i(x):=ui(αx).
Essentially, we rephrased Γ as an extensive form game in which first the game Γ(α) is played and then, if the outcome was x, the game Γx is played. This, assuming the expected outcome of game Γx is αx.
It is also possible to generalize this construction by allowing multivaluedf.
Local Symmetry Solutions
Given a game Γ, a local symmetry presolution (LSP) is recursively defined to be one of the following:
Assume there is i∈P s.t. for any j∈P∖i, |Sj|=1. Then, any x∗∈argmaxx∈S∗ui(x) is an LSP of Γ. [Effectively single player game]
Let {fi:Si→S′i}i∈P be a coarse-graining of Γ and for any x∈S′∗, let αx be an LSP of Γx. Let β be an LSP of Γ(α). Define x∗∈S∗ by x∗i:=αβi. Then, x∗ is an LSP of Γ.
Let G be a quasisymmetry of Γ and let y∗ be an LSP of Γ/G. Then, γ(y∗) is an LSP of Γ.
It’s easy to see that an LSP always exists, because for any game we can choose a sequence of coarse-grainings whose effect is making the players choose their strategies one by one (with full knowledge of choices by previous players). However, not all LSP are “born equal”. It seems appealing to prefer LSP which use “more” symmetry. This can be formalized as follows.
The way an LSP is constructed defines the set of coherent outcomesA⊆S∗, which is the set of outcomes compatible with the local symmetries[4]. We define it recursively as follows:
For the LSP of an effectively single-player game, we set A:=S∗.
For a coarse-graining LSP, for any x∈S′∗, let Ax be the set of coherent outcomes of Γx and let B be the set of coherent outcomes of Γ(α). We then define A:={x∈S∗|f(x)∈B and x∈Af(x)}. Here, f:S∗→S′∗ is defined by f(x)i:=fi(xi).
For a quasisymmetry LSP, let B be the set of coherent outcomes of Γ/G. We then define A:=γ(B).
We define a local symmetry solution (LSS) to be an LSP whose set of coherent outcomes is minimal w.r.t. set inclusion.
Examples
In the Prisoner’s Dilemma, the only LSS is Cooperate-Cooperate.
In Stag Hunt, the only LSS is Stag-Stag.
In Chicken, the only LSS is Swerve-Swerve. If we add a finite number of randomized strategies, it opens more possibilities.
In Battle of the Sexes, assuming a small payoff from going to your preferred place alone, the only LSS is: each player goes to their own preferred place. If we give each player a “coinflip” strategy, then I think the only LSS is both players using the coinflip.
Consider a pure cooperation game where both players have strategies {a,b} and the payoff is 1 if they choose the same strategy and 0 if they choose different strategies. Then, any outcome is an LSS. I think that, if we add a “fair coinflip” strategy, any LSS has payoff at least 14. I believe the LSS payoff will approach optimum if we (i) allow randomization (ii) make the game repeated with time long horizon and (iii) add some constraints on the symmetries, but I’ll leave this for a future post.
Consider a pure cooperation game where both players have strategies {a,b}, the payoff is 2 if both choose a, 1 if both choose b and 0 if they choose different strategies. Then, the only LSS is aa.
In any game, an LSS guarantees each player at least their deterministic-maximin payoff. In particular, in a two-player zero-sum game in which a pure Nash equilibrium exists, any LSS is a Nash equilibrium.
I believe this can be interpreted as a category enriched over the category of sets and partial functions, with the monoidal structure given by Cartesian product of sets.
Maybe we can interpret the set of coherent outcomes as a sharp infradistirbution corresponding to the players’ joint belief about the outcome of the game.
Master post for multi-agency. See also.
I propose a taxonomy of 5 possible worlds for multi-agent theory, inspired by Imagliazzo’s 5 possible worlds of complexity theory (and also the Aaronson-Barak 5 worlds of AI):
Nihiland: There is not even a coherent uni-agent theory, not to mention multi-agency. I find this world quite unlikely, but leave it here for the sake of completeness (and for the sake of the number 5). Closely related is antirealism about rationality, which I have criticized in the past. In this world it is not clear whether the alignment problem is well-posed at all.
Discordia: There is a coherent uni-agent theory, but no coherent theory of multi-agency. This world is conceivable, since the current understanding of multi-agency is much worse than the understanding of “solitary” agents. In this world, negative-sum conflicts and coordination failures are probably ubiquitous (even among arbitrarily sophisticated agents), because there is no principle of rationality that rules them out. Acausal trade is probably not a thing, or at least rare and fragile. In the context of value learning, there might be no principled way to deal with uncertainty (which could otherwise be regarded as a bargaining problem). There is also no principled solution to multi-user alignment.
Linguistica: There is a coherent theory of multi-agency, but agents are inevitably divided into “types” s.t. only interactions between agents of the same type have strong guarantees. (The name of the world is because we can metaphorically think of the types as different “languages”.) An example of how this might happen is reflective oracles, where the type corresponds to the choice of fixed point. Acausal trade probably exists[1], but is segregated by type. Alignment is complicated by the need to specify or learn the human type.
Economica: There is a coherent uni-type theory of multi-agency, but this theory involves desiderata that can only be motivated by multi-agency. Explicitly thinking about multi-agency is necessary to construct the full theory of agents[2]. In this world, the Yudkowskian hope for ubiquitous strong cooperation guarantees can be justified, and acausal trade might be very common. Figuring out the multi-agent theory, and not just the uni-agent fragment, is probably important for alignment, or at least necessary in order to avoid leaving huge gains from trade on the table.
Harmonia: There is a coherent uni-type theory of multi-agency, and this theory can be derived entirely from desiderata that can be motivated without invoking multi-agency at all. There is no special “mutli-agent sauce”: any sufficiently rational agents automatically have strong guarantees in the multi-agent setting. Explicitly understanding multi-agency is arguably still important for dealing with uncertainty in value-learning, and dealing with multi-user alignment. (And also in order to know that we are in this world.)
For simplicity, I’m ignoring what is arguably an “orthogonal” axis: to which extent the “correct” multi-agent theory implies acausal cooperation even under favorable conditions. I believe that, outside of Nihiland and Discordia, it probably does, but the alternative hypothesis is also tenable.
On the border between Linguistica and Economica, there are worlds with strong guarantees for agents of the same type and medium-strength guarantees for agents of different type (where “medium-strength” is still stronger than “achieve maximin payoff”: the latter is already guaranteed in infra-Bayesianism). This blurs the boundary, but I would consider this to be Linguistica if even slightly different types have much weaker guarantees (or if there is no useful notion of “slightly different types”) and Economica if there is continuous graceful degradation like in Yudkowsky’s subjective fairness proposal.
TLDR: A new proposal for a prescriptive theory of multi-agent rationality, based on generalizing superrationality using local symmetries.
Hofstadter’s “superrationality” postulates that a rational agent should behave as if other rational agents are similar to itself. In a symmetric game, this implies the selection of a Pareto efficient outcome. However, it’s not clear how to apply this principle to asymmetric games. I propose to solve this by suggesting a notion of “local” symmetry that is much more ubiquitous than global (ordinary) symmetry.
We will focus here entirely on finite deterministic (pure strategy) games. Generalizing this to mixed strategies and infinite games more generally is an important question left for the future.
Quotient Games
Consider a game Γ with a set P of players, for each player i∈P a non-empty set Si or pure strategies and a utility function ui:S∗→R, where S∗:=∏j∈PSj. What does it mean for the game to be “symmetric”? We propose a “soft” notion of symmetry that is only sensitive to the ordinal ordering between payoffs. While the cardinal value of the payoff will be important in the stochastic context, we plan to treat the latter by explicitly expanding the strategy space.
Given two games Γ(1), Γ(2), a premorphism from Γ(1) to Γ(2) is defined to consist of
A mapping q:P(1)→P(2)
For each i∈P(1), a mapping ¯qi:S(2)q(i)→S(1)i
Notice that these mappings induce a mapping ^q:S(2)∗→S(1)∗ via
^q(x)i:=¯qi(xq(i))A premorphism is said to be a homomorpism if q and ¯q are s.t. that for any i∈P(1) and x,y∈S(2)∗, if uq(i)(x)≤uq(i)(y) then ui(^q(x))≤ui(^q(y)). Homomorphisms makes games into a category in a natural way.
An automorphism of Γ is a homomorphism from Γ to Γ that has an inverse homomorphism. An automorphism q of Γ is said to be flat when for any i∈P, if q(i)=i then ¯qi is the identity mapping[1]. A flat symmetry of Γ is a group G together with a group homomorphism φ:G→Aut(Γ) s.t.φ(g) is flat for all g∈G.
Given a flat symmetry (G,φ), we can apply the superrationality principle to reduce Γ to the “smaller” quotient game Γ/G. The players are P/G i.e. orbits in the original player set. Given α∈P/G, we define the set of strategies for player α in the game Γ/G to be
(S/G)α:={σ∈∏i∈αSα∣∣ ∣∣∀i∈α,g∈G:σ(i)=¯¯¯¯¯¯¯¯¯¯φ(g)iσ(φ(g)i)}This is non-empty thanks to flatness.
Observe that there is a natural mapping γ:(S/G)∗→S∗ given by
γ(x)i:=xGi(i)Finally, the utility function of α is defined by
uα(x):=1|α|∑i∈αui(γ(x))It is easy to see that the construction of Γ/G is functorial w.r.t. the category structure on games that we defined.
We can generalize this further by replacing game homomorphisms with game quasimorphisms: premorphisms satisfying the following relaxed condition on q and ¯q:
For any i∈P(2) and x,y∈S(1)∗, if uq(i)(x)<uq(i)(y) then ui(^q(x))≤ui(^q(y)).
This is no longer closed under composition, so no longer defines a category structure[2]. However, we can still define flat quasisymmetries and the associated quotient games, and this construction is still functorial w.r.t. the original (not “quasi”) category structure. Moreover, there is a canonical homomorphism (not just a quasimorphism) from Γ to Γ/G, even when G is a mere quasisymmetry.
The significance of quasisymmetries can be understood as follows.
The set of all games on a fixed set of players P with fixed strategy sets {Si}i∈P naturally forms a topological space[3] X. Given a group G acting on the game via invertible premorphisms, the subset of X where G is a symmetry is not closed, in general. However, the subset of X where G is a quasisymmetry is closed. I believe this will be important when generalizing the formalism to infinite games.
Coarse Graining
What if a game doesn’t have even quasisymmetries? Then, we can look for a coarse graining of the game which does have them.
Consider a game Γ. For each i∈P, let S′i be some set and fi:Si→S′i a surjection. Denote S′∗:=∏i∈PS′i. Given any x∈S′∗, we have the game Γx in which:
The set of players is P.
For any i∈P, their set of strategies is Sxi:=f−1i(xi). In particular, it implies that the set of outcomes Sx∗ satisfies Sx∗⊆S∗.
For any i∈P, their utility function is uxi:=ui|Sx∗.
If for any x∈S′∗ we choose some αx∈Sx∗ this allows us to define the coarse-grained game Γ(α) in which:
The set of players is P.
For any i∈P, the set of strategies is S′i.
For any i∈P and x∈S′∗, the utility function is u(α)i(x):=ui(αx).
Essentially, we rephrased Γ as an extensive form game in which first the game Γ(α) is played and then, if the outcome was x, the game Γx is played. This, assuming the expected outcome of game Γx is αx.
It is also possible to generalize this construction by allowing multivalued f.
Local Symmetry Solutions
Given a game Γ, a local symmetry presolution (LSP) is recursively defined to be one of the following:
Assume there is i∈P s.t. for any j∈P∖i, |Sj|=1. Then, any x∗∈argmaxx∈S∗ui(x) is an LSP of Γ. [Effectively single player game]
Let {fi:Si→S′i}i∈P be a coarse-graining of Γ and for any x∈S′∗, let αx be an LSP of Γx. Let β be an LSP of Γ(α). Define x∗∈S∗ by x∗i:=αβi. Then, x∗ is an LSP of Γ.
Let G be a quasisymmetry of Γ and let y∗ be an LSP of Γ/G. Then, γ(y∗) is an LSP of Γ.
It’s easy to see that an LSP always exists, because for any game we can choose a sequence of coarse-grainings whose effect is making the players choose their strategies one by one (with full knowledge of choices by previous players). However, not all LSP are “born equal”. It seems appealing to prefer LSP which use “more” symmetry. This can be formalized as follows.
The way an LSP is constructed defines the set of coherent outcomes A⊆S∗, which is the set of outcomes compatible with the local symmetries[4]. We define it recursively as follows:
For the LSP of an effectively single-player game, we set A:=S∗.
For a coarse-graining LSP, for any x∈S′∗, let Ax be the set of coherent outcomes of Γx and let B be the set of coherent outcomes of Γ(α). We then define A:={x∈S∗|f(x)∈B and x∈Af(x)}. Here, f:S∗→S′∗ is defined by f(x)i:=fi(xi).
For a quasisymmetry LSP, let B be the set of coherent outcomes of Γ/G. We then define A:=γ(B).
We define a local symmetry solution (LSS) to be an LSP whose set of coherent outcomes is minimal w.r.t. set inclusion.
Examples
In the Prisoner’s Dilemma, the only LSS is Cooperate-Cooperate.
In Stag Hunt, the only LSS is Stag-Stag.
In Chicken, the only LSS is Swerve-Swerve. If we add a finite number of randomized strategies, it opens more possibilities.
In Battle of the Sexes, assuming a small payoff from going to your preferred place alone, the only LSS is: each player goes to their own preferred place. If we give each player a “coinflip” strategy, then I think the only LSS is both players using the coinflip.
Consider a pure cooperation game where both players have strategies {a,b} and the payoff is 1 if they choose the same strategy and 0 if they choose different strategies. Then, any outcome is an LSS. I think that, if we add a “fair coinflip” strategy, any LSS has payoff at least 14. I believe the LSS payoff will approach optimum if we (i) allow randomization (ii) make the game repeated with time long horizon and (iii) add some constraints on the symmetries, but I’ll leave this for a future post.
Consider a pure cooperation game where both players have strategies {a,b}, the payoff is 2 if both choose a, 1 if both choose b and 0 if they choose different strategies. Then, the only LSS is aa.
In any game, an LSS guarantees each player at least their deterministic-maximin payoff. In particular, in a two-player zero-sum game in which a pure Nash equilibrium exists, any LSS is a Nash equilibrium.
Note that flatness is in general not preserved under composition.
I believe this can be interpreted as a category enriched over the category of sets and partial functions, with the monoidal structure given by Cartesian product of sets.
An affine space, even.
Maybe we can interpret the set of coherent outcomes as a sharp infradistirbution corresponding to the players’ joint belief about the outcome of the game.