Fairness and Geometry

This post was prompted by Vladimir Nesov’s comments, Wei Dai’s intro to cooperative games and Eliezer’s decision theory problems. Prerequisite: Re-formalizing PD.

Some people here have expressed interest in how AIs that know each other’s source code should play asymmetrical games, e.g. slightly asymmetrized PD. The problem is twofold: somehow assign everyone a strategy so that the overall outcome is “good and fair”, then somehow force everyone to play the assigned strategies.

For now let’s handwave around the second problem thus: AIs that have access to each other’s code and common random bits can enforce any correlated play by using the quining trick from Re-formalizing PD. If they all agree beforehand that a certain outcome is “good and fair”, the trick allows them to “mutually precommit” to this outcome without at all constraining their ability to aggressively play against those who didn’t precommit. This leaves us with the problem of fairness.
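
A loose sketch of that handwave in code may help (the bare source-equality test, the function names, and the toy payoff labels are my simplifications; the actual construction in Re-formalizing PD is more careful about quining):

```python
import random

# Hypothetical correlated play that both sides have judged "good and fair":
# a list of joint outcomes to randomize over, here for a PD-like game.
AGREED_PLAY = [("C", "C")]

def move(my_source, their_source, my_role, shared_bits):
    """One agent's program; both agents run this same source."""
    if their_source == my_source:
        # Quining check passed: the opponent runs this very code, so the
        # common random bits pick out the same joint outcome on both sides,
        # and each agent plays its own component of it.
        rng = random.Random(shared_bits)
        joint = rng.choice(AGREED_PLAY)
        return joint[my_role]
    # No precommitment detected: play aggressively (here, just defect).
    return "D"

# Both agents hold the same source text and the same random bits:
SOURCE = "<this very program>"  # stand-in for an actual quine
print(move(SOURCE, SOURCE, 0, 42), move(SOURCE, SOURCE, 1, 42))  # C C
```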

(Get ready, math ahead. It sounds massive, but is actually pretty obvious.)

Pure strategy plays of an N-player game are points in the N-dimensional space of utilities. Correlated plays form the convex hull of this point set, an N-polytope. Pareto-optimal outcomes are points on the polytope’s surface where the outward normal vector has all positive components. I want to somehow assign each player a “bargaining power” (by analogy with Nash bargaining solutions); collectively they will determine the slope of a hyperplane that touches the Pareto-optimal surface at a single point which we will dub “fair”. Utilities of different players are classically treated as incomparable, like metres to kilograms, i.e. having different dimensionality; thus we’d like the “fair point” to be invariant under affine recalibrations of utility scales. Coefficients of tangent hyperplanes transform as covectors under such recalibrations; components of a covector should have dimensionality inverse to components of a vector for the application operation to make sense; thus the bargaining power of each player must have dimensionality 1/utility of that player.
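
To spell the covector claim out (my notation; the computation is just a change of variables): recalibrating player i’s utility scale gives

```latex
\[
u_i' = a_i u_i + b_i \quad (a_i > 0), \qquad
\sum_i p_i u_i = c
\;\Longleftrightarrow\;
\sum_i \frac{p_i}{a_i}\, u_i' = c + \sum_i \frac{p_i b_i}{a_i},
\]
```

so the tangent hyperplane’s coefficients transform as p_i → p_i/a_i, and each p_i indeed carries dimension 1/utility of player i.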

(Whew! It’ll get easier from now.)

A little mental visualization involving a sphere and a plane confirms that when a player stretches their utility scale 2x, stretching the sphere along one of the coordinate axes, the player’s power (the coefficient of that coordinate in the tangent hyperplane equation) must indeed go down 2x to keep the fair point from moving. Incidentally, this means that we cannot somehow assign each player “equal power” in a way that’s consistent under recalibration.

Now, there are many ways to process an N-polytope and obtain N values, dimensioned as 1/coordinate each. A natural way would be to take the inverse measure of the polytope’s projection onto each coordinate axis, but this approach fails because irrelevant alternatives can skew the result wildly: adding some far-off dominated outcome stretches the projections without changing anyone’s actual prospects. A better idea would be taking the inverse measures of projections of just the Pareto-optimal surface region onto the coordinate axes; this decision passes the smoke test of bargaining games, so it might be reasonable.

To reiterate the hypothesis: assign each player an amount of bargaining power inversely proportional to the range of their gains possible under Pareto-optimal outcomes. Then pick the point where a hyperplane with those bargaining powers for coefficients touches the polytope’s surface, and call this point “fair”.
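
Here is a minimal sketch of that procedure, assuming the game arrives as a list of pure-outcome utility vectors (the simplified Pareto test and the example payoffs are mine, not anything from the post):

```python
import numpy as np

def pareto_vertices(points):
    """Outcomes not weakly dominated by any other single outcome.

    Simplification: a full treatment would also discard points dominated
    by convex combinations of outcomes (correlated plays), e.g. via an LP.
    """
    pts = np.asarray(points, dtype=float)
    keep = []
    for i, p in enumerate(pts):
        dominated = any(np.all(q >= p) and np.any(q > p)
                        for j, q in enumerate(pts) if j != i)
        if not dominated:
            keep.append(p)
    return np.array(keep)

def fair_point(points):
    """Power = 1 / (range of a player's gains over Pareto outcomes);
    the "fair" point is where the hyperplane with those coefficients
    touches the polytope, i.e. the power-weighted-sum maximizer."""
    pareto = pareto_vertices(points)
    ranges = pareto.max(axis=0) - pareto.min(axis=0)
    powers = 1.0 / ranges  # assumes every player's gains actually vary
    return pareto[np.argmax(pareto @ powers)], powers

# Slightly asymmetrized PD: the row player's payoffs are doubled.
outcomes = [(6, 3), (0, 5), (10, 0), (2, 1)]  # (C,C), (C,D), (D,C), (D,D)
print(fair_point(outcomes))
# -> fair point (6, 3), i.e. mutual cooperation, with powers (0.1, 0.2):
#    doubling the row player's utility scale has halved their power.

# Irrelevant alternatives don't skew anything: a far-off dominated outcome
# changes the whole polytope's projections but not the Pareto set.
print(fair_point(outcomes + [(-100, -100)]))  # same point, same powers
```

Maximizing over the vertex set suffices because a linear functional on a polytope peaks at a vertex; when two vertices tie, the hyperplane touches a whole face, which is exactly the degenerate case flagged next.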

(NB: this idea doesn’t solve cases where the hyperplane touches the polytope at more than one point, e.g. risk-neutral division of the dollar. Some more refined fairness concept is required for those.)

At this point I must admit that I don’t possess a neat little list of “fairness properties” that would make my solution unique and inevitable, Shapley value style. It just… sounds natural. It’s an equilibrium, it’s symmetric, it’s invariant under recalibrations, it often gives a unique answer, it solves asymmetrized PD just fine, and the True PD, and other little games I’ve tried it on, and something like it might someday solve the general problem outlined at the start of the post; but then again, we’ve tossed out quite a lot of information along the way. For example, we didn’t use the row/column structure of strategies at all.

What should be the next step in this direction?

Can we solve fairness?

EDIT: thanks to Wei Dai for the next step! Now I know that any “purely geometric” construction that looks only at the Pareto set will fail to incentivize players to adopt it. The reason: we can, without changing the Pareto set, give any player an additional non-Pareto-optimal strategy that always assigns them higher utility than my proposed solution, thus making them want to defect. (Concretely: in the ordinary PD with payoffs (3,3), (0,5), (5,0), (1,1), the proposal picks (3,3); now give the row player an extra strategy that yields (4,1) against either column. The point (4,1) is dominated by the correlated play ½·(5,0) + ½·(3,3) = (4, 1.5), so the Pareto set is untouched, yet the row player now guarantees 4 > 3 by deviating.) Pretty conclusive! So much for this line of inquiry, I guess.