Three times in the past month, I’ve run across occasions where Shapley values were mentioned or would have been useful. There are a couple of good explainers of Shapley value already on the internet, but what most of them lack is a bunch of worked examples. I find that many times, the limiting factor in my ability to understand a concept is whether or not I’ve been exposed to multiple examples—so this post is where I’ll be posting some of the examples I worked through to understand how Shapley value works.

I’ll start with a brief overview of Shapley value, but first, a caveat. I think it’s fine to skim most of the math in the overview. We won’t really be using it that much, and I try to give an intuitive sense of what the equations mean before I present them. You could probably skip the entire overview and just go straight to the examples and not miss much.

Overview of Shapley Value

Suppose you have a group of people who work together to produce some sort of value (traditionally profit). How should you divide the credit for that value (or the actual profit) up among the individuals involved? One immediate suggestion might be “equally”, but that doesn’t necessarily satisfy intuitive notions of fairness. If a doctor performs life-saving surgery, should they receive equal credit for saving someone’s life as the random person holding the door open for the doctor at the end of the surgery? If you’re running a restaurant, and three servers spend all night running around serving people, and one server spends two hours on a smoke break, do all four deserve an equal amount of pay?

Shapley value provides a different way of computing what a fair division of value should be. Why use Shapley value? Well, like splitting things equally, it (a) divides all of the gains among all of the participants, (b) splits things equally between participants who contribute the same value, and (c) if there are two completely independent value-producing processes, then the assignment of value to each participant is equal to the sum of the value for that participant in each game^[1]. Unlike splitting things equally, it also (d) assigns no value to anyone who always contributes nothing, and in fact, it is the only assignment rule which satisfies all four of those constraints.

The formulas for computing the Shapley values can be found on its Wikipedia page.

The simplest way of computing Shapley value, the one we’ll be using for most of this post, is to consider the synergy of a group. (Here we use synergy in the original sense, not in the buzzword-ified sense. If you’ve never heard it used as anything besides a buzzword, it means “the value of a group that is greater than the sum of its parts”.) We’ll follow Wikipedia’s convention and use v(S) to denote the value produced by a set of participants S, and w(S) to denote the synergy of S. I’ll break with Wikipedia’s convention and use V(i) to indicate the Shapley value assigned to participant i (or subgroup R).

(Slightly obscure mathematical notation explainer footnote: ^[2])

The equation for synergy is

$w(S) = \sum_{R \subseteq S} (-1)^{|S| - |R|} v(R)$

but I think that in practice, for many questions it’s easy to determine the synergy of a group simply by asking, “what value does this group provide that isn’t already accounted for by smaller groups contained within this one?”
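
For readers who prefer code to summation signs, here is a minimal sketch of the synergy formula. It is not the program used later in this post; the helper names (`subsets`, `synergy`, `factory_value`) and the toy value function, an owner whose factory lets each worker earn $500, are assumptions made for illustration.

```python
from itertools import combinations


def subsets(players):
    """Yield every subset of `players` as a frozenset, smallest first."""
    players = list(players)
    for size in range(len(players) + 1):
        for combo in combinations(players, size):
            yield frozenset(combo)


def synergy(v, group):
    """w(S): alternating-sign sum of v(R) over every subset R of S."""
    group = frozenset(group)
    return sum((-1) ** (len(group) - len(r)) * v(r) for r in subsets(group))


def factory_value(group, profit=500):
    """Toy value function (hypothetical): workers only earn alongside the owner."""
    if "owner" not in group:
        return 0
    return profit * (len(group) - 1)


pair = synergy(factory_value, {"owner", "worker_1"})
trio = synergy(factory_value, {"owner", "worker_1", "worker_2"})
print(pair, trio)  # 500 0
```

The pair has $500 of synergy, while the three-person group has none: everything it produces is already accounted for by the two owner-worker pairs inside it.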

Once we have the synergy, we can find the Shapley value of each participant by considering the synergy of each group and dividing the synergy equally among participants. (In a way, we’ve gone one level up; instead of dividing the value produced equally, we divide the synergy equally.) Here it is, written in equation form

$V_S(i) = \sum_{R \subseteq S} 1_R(i) \frac{w(R)}{|R|}$

but I think that as we get into the examples, it will become much clearer.
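
As a rough illustration of "divide the synergy equally", the sketch below takes a table of synergies and hands each group's synergy out in equal shares. The function name `shapley_from_synergy` and the example numbers (one owner, two workers, $500 of synergy per owner-worker pair) are assumptions, not something taken from the post's own program.

```python
from collections import defaultdict


def shapley_from_synergy(synergies):
    """Split each group's synergy equally among that group's members.

    `synergies` maps a non-empty frozenset of participants to w(R); groups
    that are left out are treated as having zero synergy.
    """
    values = defaultdict(float)
    for group, w in synergies.items():
        for member in group:
            values[member] += w / len(group)
    return dict(values)


# Hypothetical inputs: $500 of synergy for each owner-worker pair, nothing else.
synergies = {
    frozenset({"owner", "worker_1"}): 500,
    frozenset({"owner", "worker_2"}): 500,
}
shares = shapley_from_synergy(synergies)
print(shares)
```

Here the owner ends up with $500 (half of each pair's synergy) and each worker with $250.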

Examples

Simple Factory Business

The most basic example, lifted straight from the Wikipedia page, is the “business” example. It works like this: suppose a group of workers and a factory owner want to start a company. The factory owner can provide a factory, but sitting empty, it provides nothing. In equation form,

$v(\{\text{owner}\}) = 0$

Likewise, a group of workers on their own, but with no factory to work in, can make nothing.

$v(\{\text{worker}_1, \text{worker}_2, \ldots, \text{worker}_n\}) = 0$

However, when the workers and the owner work together, each worker can use the factory to make some amount of profit.

$v(\{\text{owner}, \text{worker}_1, \text{worker}_2, \ldots, \text{worker}_n\}) = np$

If we think about the synergy of this arrangement, the workers have no synergy with each other; a group of workers doesn’t provide anything more than exactly that many workers working alone, either with or without the owner involved. But there is synergy between each worker and the owner. That synergy is the profit each worker can make by working in the factory.

$w(\{\text{owner}, \text{worker}_i\}) = p$

(The synergy of all other subsets is 0.)

From this, we can determine the Shapley value for each worker is half of the profit they produce, because we equally split the synergy p above between the owner and the worker. The Shapley value for the owner is half the profit of each worker, times the number of workers.

Plugging in some numbers, let’s say concretely that there are 10 workers, and each one can produce $500 of profit a day by working in the factory.

In that case, each worker keeps $250 per day, and the owner keeps $2,500 each day.
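
These numbers can be double-checked without going through synergy at all, using the standard marginal-contribution form of the Shapley value: average each participant's added value over every coalition they could join, weighted by how often that coalition comes first in a random ordering. The sketch below is one way to do that; it is not the program linked later in the post, and the names `shapley_value` and `factory_value` are assumptions.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial


def shapley_value(v, players, target):
    """Weighted average of `target`'s marginal contribution over all coalitions."""
    others = [p for p in players if p != target]
    n = len(players)
    total = Fraction(0)
    for size in range(len(others) + 1):
        # Probability that, in a uniformly random ordering of all players,
        # the players ahead of `target` are exactly one given coalition of this size.
        weight = Fraction(factorial(size) * factorial(n - size - 1), factorial(n))
        for combo in combinations(others, size):
            coalition = frozenset(combo)
            total += weight * (v(coalition | {target}) - v(coalition))
    return total


def factory_value(group, profit=500):
    """Each worker makes `profit`, but only when the owner is in the group."""
    return profit * (len(group) - 1) if "owner" in group else 0


players = ["owner"] + [f"worker_{i}" for i in range(1, 11)]
owner_share = shapley_value(factory_value, players, "owner")
worker_share = shapley_value(factory_value, players, "worker_1")
print(owner_share, worker_share)  # 2500 250
```

Exact fractions are used so the result is not blurred by floating-point rounding. With 11 participants this loops over about a thousand coalitions per participant, which is fast, but the count doubles with every participant added.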

A lot of the examples of Shapley value I’ve seen online stop there, or go into more obscure applications. But I want to dig deeper, and un-sphereify this cow. The above example makes some strange assumptions, which can go unnoticed at first glance.

For example, it supposes that the owner would rather make no money than work at their own factory. It’s reasonable to suppose that once the owner makes $2,500 a day from the capital, they wouldn’t bother laboring as well, but that’s not a reasonable assumption if no one else is working at the factory.^[3]

Additionally, the above assumes that each worker has no other available options for producing any profit, but in most cases, the workers would have other options: they could go work a worse job. Once again, the details of the other jobs available affect the Shapley value, just like the owner’s option to work in the factory does.

Workers Have Other Jobs

If we suppose instead that each worker can produce x value each day by working alone, doing something that doesn’t require any capital, that massively changes the synergy equation. Now, instead of

$w(\{\text{worker}\}) = 0$

$w(\{\text{owner}, \text{worker}\}) = p$

we have

$w(\{\text{worker}\}) = x$

$w(\{\text{owner}, \text{worker}\}) = p - x$

(The value of w({owner, worker}) changes because there’s less synergy, even though the value produced is the same.)

Now, when we add up the synergy for each worker, they keep all of the synergy from working alone, plus half of the synergy of working at the factory, for a total of x + (p-x)/2 = p/2 + x/2. This is true even if x is less than half of p; even though few workers would choose to make less, just having it as an option changes the Shapley value of the participant.

It also drives the owner’s total take down, from np/2 to n(p-x)/2. This effectively creates a sliding scale; as the value of what each worker could do without the factory goes up, if the value of the worker working at the factory stays the same, the share allocated to the worker of that work increases (at least, according to Shapley value).

Plugging in numbers from a few possible scenarios, using the same baseline of $500 profit produced per worker at the factory and 10 total workers, as above:

| Solo Worker Value | Worker at Factory Shapley Value | Owner at Factory Shapley Value |
| --- | --- | --- |
| $10 | $255 | $2450 |
| $100 | $300 | $2000 |
| $200 | $350 | $1500 |
| $300 | $400 | $1000 |
| $400 | $450 | $500 |
| ~$409.09 | ~$454.55 | ~$454.55 |
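
The last row appears to be the break-even point, where a single worker and the owner are assigned the same amount. A short derivation, setting the worker's share (p + x)/2 equal to the owner's share n(p - x)/2 for n workers:

$$
\frac{p + x}{2} = \frac{n(p - x)}{2}
\;\Longrightarrow\; p + x = np - nx
\;\Longrightarrow\; x = \frac{(n - 1)\,p}{n + 1}
$$

With n = 10 and p = 500 this gives x = 4500/11 ≈ 409.09, and both shares come out to (500 + 409.09)/2 ≈ 454.55.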

There are some more examples I want to explore, but they start to get complicated enough that I don’t trust myself to get the algebra correct. Instead, I’m going to switch to using a computer program to compute the Shapley value for the participants. You can see and run the program I’ve written here, but it’s not required to understand what happens in the scenarios below.

Minimum Number of Workers

What happens if we say that the factory requires a minimum of two workers before any profit is generated, but besides that everything stays the same as the initial problem? In this case, we have

$v(\{\text{owner}\}) = 0$

$v(\{\text{owner}, \text{worker}\}) = 0$

$v(\{\text{owner}, \text{worker}_1, \ldots, \text{worker}_n\}) = np$

for the value function. Using the concrete values of 10 workers and $500 profit per worker, we get $2454.55 in value for the owner and $254.55 for each worker.

What happens as we scale up the minimum number of workers needed to work?

| Minimum Number of Workers Needed for Operation | Shapley Value of Worker | Shapley Value of Owner |
| --- | --- | --- |
| 1 | $250 | $2500 |
| 2 | $254.55 | $2454.55 |
| 3 | $263.64 | $2363.64 |
| 4 | $277.27 | $2227.27 |
| 5 | $295.45 | $2045.45 |
| 6 | $318.18 | $1818.18 |
| 7 | $345.45 | $1545.45 |
| 8 | $377.27 | $1227.27 |
| 9 | $413.64 | $863.64 |
| 10 | $454.55 | $454.55 |
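
The table can be reproduced with a shortcut that avoids enumerating coalitions. This is a sketch under the scenario's assumptions (10 interchangeable workers, $500 each, nothing produced without the owner), not the program linked above; the function name `min_worker_shares` is made up for illustration.

```python
def min_worker_shares(min_workers, n_workers=10, profit=500):
    """Per-worker and owner Shapley values when the factory needs a minimum crew."""

    def output(k):
        # Value produced by the owner together with k workers.
        return profit * k if k >= min_workers else 0

    # In a random ordering, the number of workers ahead of the owner is equally
    # likely to be 0, 1, ..., n_workers. The owner's marginal contribution is
    # output(k), because no group produces anything without the owner.
    owner = sum(output(k) for k in range(n_workers + 1)) / (n_workers + 1)
    # Workers are interchangeable, so they split the remainder equally.
    worker = (output(n_workers) - owner) / n_workers
    return worker, owner


for m in (1, 2, 5, 10):
    worker, owner = min_worker_shares(m)
    print(m, round(worker, 2), round(owner, 2))
```

The shortcut only works because the workers are symmetric and the owner is required for any value at all; a less regular scenario would need the full coalition-by-coalition calculation.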

Manager or CEO

What about management? Suppose we have a manager/CEO in addition to the owner and the workers. Without the CEO, the factory runs normally, producing $500 of profit per worker. With the CEO, the factory is more efficient and produces $600 of profit per worker instead. Plugging this into our program, we can see that the owner receives $2833.33 of the profit, the CEO receives $333.33 of the profit, and each worker receives $283.33 of the profit.

This is sort of like the “workers have other jobs available” scenario, in that we can adjust the amount added by the presence of the CEO to see how wages adjust:

| CEO Value Add (per worker) | Worker Shapley Value | Owner Shapley Value | CEO Shapley Value |
| --- | --- | --- | --- |
| $0 | $250 | $2500 | $0 |
| $10 | $253.33 | $2533.33 | $33.33 |
| $100 | $283.33 | $2833.33 | $333.33 |
| $500 | $416.67 | $4166.67 | $1666.67 |
| $1000 | $583.33 | $5833.33 | $3333.33 |

Interestingly, under this model, there’s no value for which the CEO generates more Shapley value than the Owner, even if you posit that the presence of the CEO increases the profit per worker by 100 times.
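
One way to see why, sketched in terms of synergy. Writing a for the extra profit per worker that the CEO adds (a symbol introduced here, not in the original setup), and assuming the CEO produces nothing without both the owner and a worker, the only nonzero synergies are

$$
w(\{\text{owner}, \text{worker}_i\}) = p, \qquad w(\{\text{owner}, \text{CEO}, \text{worker}_i\}) = a
$$

Splitting each of these equally among its members, with n workers, gives

$$
V(\text{worker}) = \frac{p}{2} + \frac{a}{3}, \qquad V(\text{CEO}) = \frac{na}{3}, \qquad V(\text{owner}) = \frac{np}{2} + \frac{na}{3}
$$

so the owner's share exceeds the CEO's by np/2 no matter how large a gets. With n = 10, p = 500 and a = 100, these evaluate to about $283.33, $333.33 and $2833.33.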

Maximum Number of Workers

What if we posit a maximum number of workers? Suppose that there are 10 assembly stations in the factory, but 11 available workers (and everything else is equivalent to the initial scenario). In this case, there’s no clear way to determine which workers get to work and which don’t have a job, but if we compute the Shapley value, we can see that the addition of an unemployable worker causes the owner’s Shapley value to rise to $2708.33 (from $2500), and each worker’s Shapley value to drop to $208.33 (from $250).
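
These figures can be checked by hand. In a random ordering of the 12 participants, the number of workers k ahead of the owner is equally likely to be any of 0 through 11, and the owner's arrival then unlocks 500 · min(k, 10) of profit. Assuming that reading of the scenario:

$$
V(\text{owner}) = \frac{1}{12}\sum_{k=0}^{11} 500\,\min(k, 10) = \frac{500 \cdot 65}{12} \approx 2708.33,
\qquad
V(\text{worker}) = \frac{5000 - 2708.33}{11} \approx 208.33
$$

The total stays at $5000, since only 10 stations can be staffed; the eleven interchangeable workers split whatever the owner does not receive.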

This is strange to me, in the way that something just on the edge of the familiar is strange. It rhymes with the notion that in a free market, an increase in supply lowers the price (demand being constant), but in this case, we have a worker who does not work, who contributes no actual value but who still is assigned Shapley value due to counterfactual value. It also rhymes with unemployment welfare, in the sense that even those who cannot work (due to a lack of jobs) are given some money, but in this case, they’re given the same pay as their employed peers, and the money comes primarily from their peers’ pay and causes an increase in the owner’s pay.

Takeaways

There are a couple of other examples I might run through my program at some other time, and feel free to comment and request the Shapley values for a particular arrangement. But for now, I think I’ve gotten a good enough feel for how Shapley value changes in some common circumstances.

What practical takeaways am I getting from this write-up? Well, if we interpret the Shapley value as the way people should be paid when working at companies, it implies:

When your company hires someone in your management chain, if they’re paying them more than the previous manager (or adding a new manager), you also deserve a pay raise. (And the reverse is true!)
If you’re trying to hire two people for the same position, and you can’t decide who to hire, you should pay them both the same amount, even if you only hire one person.
Even if you’re in the best-paying job available to you, new opportunities below your current salary should result in you being paid more at your current job.

Or, perhaps the takeaway is that Shapley value is not a great way of actually distributing value in a corporate environment. One of the weird things about the above experiments is how there is some sort of spooky action at a distance in Shapley value; changing circumstances that don’t change the total value generated can change how that value is distributed. It’s entirely possible that measuring the Shapley value of employees at a corporation is significantly different than measuring the Shapley value of the same person inside of the greater economy.

^[1] This is a bit of a simplification: It’s really just linear, which implies a few additional things, but I think this gets at the core of why we want linearity.
^[2] This $\sum_{R \subseteq S} f(R)$ means “for each subset of S, which we name R, compute f(R), and sum together the results”.
This $1_R(x)$ is a notation for the indicator function, which returns 1 if x ∈ R, and 0 otherwise.
This $|S|$ is the notation for the cardinality of a set (the number of elements in it).
^[3] The scenario where the owner always works is much simpler, and not super interesting, but is a good exercise to do to make sure you’ve understood the algebra, so I’ll refrain from doing that one.
Solution: The owner gets to keep all of the extra profit from their own work, and it doesn’t impact anyone else.

Worked Examples of Shapley Values