A huge thanks to @Agustín Covarrubias 🔸 for his feedback and support on the following article:
Shapley values are an extremely popular tool in both economics and explainable AI.
In this article, we use the concept of “synergy” to build intuition for why Shapley values are fair. Four properties together single out Shapley values uniquely, and all of them can be justified visually. Let’s dive in!
The Game
On a sunny summer day, you and your two best friends decide to run a lemonade stand. Everyone contributes something special: Emma shares her family’s secret recipe, Liam finds premium-quality sugar, and you draw colorful posters.
The stand is a big hit! The group ends up making $280. But how best to split the profits? Each person contributed in a different way, and the success was clearly due to teamwork…
Luckily, Emma has a time machine. She goes back in time — redoing the day with different combinations of team members and recording the profits. This is how each simulation went:
| Who was involved? | Total profits |
|---|---|
| Only Emma | $20 |
| Only you | $30 |
| Only Liam | $10 |
| Emma and you | $90 |
| You and Liam | $100 |
| Liam and Emma | $30 |
| Emma, Liam, and you | $280 |
Individually, Emma makes 20 dollars running the lemonade stand, and you make 30 dollars. But working together, the team makes 90 dollars.
The sum of individual profits is 20 + 30 = 50 dollars, which is clearly less than 90 dollars. That extra 90 − 50 = 40 dollars can be attributed to team dynamics. In game theory, this bonus is called the “synergy” of you and Emma. Let’s visualize our scenario as a Venn diagram.
The synergy bonuses in the Venn diagram are “unlocked” when the intersecting people are part of the team. To calculate total profit, we add up all areas relevant to that team.
For example, when the team consists of just you and Liam, three portions of the Venn diagram are unlocked: the area exclusive to you (30 dollars), the area exclusive to Liam (10 dollars), and the area exclusively shared by you and Liam (60 dollars). Adding these areas together, the total profit for team “You and Liam” comes out to 30 + 10 + 60 = 100 dollars.
Referring to our Venn diagram, the same formula holds true for every other team:
| Team members | Sum of synergies | Total profits |
|---|---|---|
| Emma | $20 | $20 |
| You | $30 | $30 |
| Liam | $10 | $10 |
| Emma, You | $20 + $30 + $40 | $90 |
| You, Liam | $30 + $10 + $60 | $100 |
| Liam, Emma | $10 + $20 + $0 | $30 |
| Emma, You, Liam | $20 + $30 + $10 + $40 + $60 + $0 + $120 | $280 |
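The synergy column can be recomputed from the profit table alone. The following is a minimal sketch, assuming every non-empty team appears in the table; the names `profits` and `synergies` are illustrative and not part of the original article.

```python
# Profits recorded in the time-machine table, keyed by team.
profits = {
    frozenset({"Emma"}): 20,
    frozenset({"You"}): 30,
    frozenset({"Liam"}): 10,
    frozenset({"Emma", "You"}): 90,
    frozenset({"You", "Liam"}): 100,
    frozenset({"Liam", "Emma"}): 30,
    frozenset({"Emma", "You", "Liam"}): 280,
}

def synergies(profits):
    """Return each team's synergy bonus: its profit minus the bonuses
    of all strictly smaller teams contained in it."""
    bonus = {}
    # Visit small teams first so that sub-team bonuses are already known.
    for team in sorted(profits, key=len):
        inner = sum(b for sub, b in bonus.items() if sub < team)
        bonus[team] = profits[team] - inner
    return bonus

bonus = synergies(profits)
for team, value in bonus.items():
    print(sorted(team), value)
```

Each value printed corresponds to one region of the Venn diagram, so the three-person team ends up with the central bonus of 120.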
Emma and Liam are impatient and want their fair share of money. They turn to you, the quick-witted leader, for help. As you stare at the Venn diagram, an idea strikes!
Take a moment to look over the visual. How would you slice up the Venn diagram fairly? Pause here, and continue when ready.
You decide to take each “synergy bonus” and cut it evenly among those involved.
Doing the math, each person’s share comes out to $80 for Emma, $120 for you, and $80 for Liam.
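The even split can be checked with a few lines of code. This is a small sketch that takes the synergy bonuses from the table above as given; the variable names `bonuses` and `payout` are assumptions made for illustration.

```python
# Synergy bonuses read off the Venn diagram, keyed by the people who share them.
bonuses = {
    ("Emma",): 20,
    ("You",): 30,
    ("Liam",): 10,
    ("Emma", "You"): 40,
    ("You", "Liam"): 60,
    ("Liam", "Emma"): 0,
    ("Emma", "You", "Liam"): 120,
}

payout = {"Emma": 0.0, "You": 0.0, "Liam": 0.0}
for members, bonus in bonuses.items():
    # Everyone involved in a bonus receives an equal slice of it.
    for person in members:
        payout[person] += bonus / len(members)

print(payout)
```

The three payouts add up to the full $280, so nothing is left over and nothing is counted twice.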
Emma and Liam agree the splits are fair. The money is handed out, and everyone skips happily home to dinner.
In this story, the final payouts are the Shapley values of each team member. This intuition is all you need to understand Shapley values. For the adventurous reader, we now tie things back to formal game theory.
The Formalities
Shapley values are a concept from cooperative game theory. You, Liam, and Emma are all considered “players” in a “coalition game”. Every possible “coalition” (or team) has a certain “payoff” (or profit). The mapping between coalition and payoff (which just corresponds to our first table of profits) is called the “characteristic function” (as it defines the nature, or *character*, of the game).
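We define a set of players $N$ (which, in this case, is You, Emma, and Liam), and a characteristic function $v$, where $v : 2^N \to \mathbb{R}$ assigns a payoff to every coalition and $v(\emptyset) = 0$:

$$v(\{\text{Emma}\}) = 20, \quad v(\{\text{You}\}) = 30, \quad v(\{\text{Liam}\}) = 10$$

$$v(\{\text{Emma}, \text{You}\}) = 90, \quad v(\{\text{You}, \text{Liam}\}) = 100, \quad v(\{\text{Liam}, \text{Emma}\}) = 30$$

$$v(\{\text{Emma}, \text{You}, \text{Liam}\}) = 280$$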
We can see how this is the same mapping we had in our table of profits by players:
| Who was involved? | Total profits |
|---|---|
| Only Emma | $20 |
| Only you | $30 |
| Only Liam | $10 |
| Emma and you | $90 |
| You and Liam | $100 |
| Liam and Emma | $30 |
| Emma, Liam, and you | $280 |
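We also define a synergy function labeled $w$, where, for every coalition $S \subseteq N$ (with $N$ the set of players and $v$ the characteristic function):

$$w(S) = \sum_{R \subseteq S} (-1)^{|S| - |R|} \, v(R)$$

so that, equivalently, $v(S) = \sum_{R \subseteq S} w(R)$.

Similarly, the synergy function just corresponds to areas of the Venn diagram:

$$w(\{\text{Emma}\}) = 20, \quad w(\{\text{You}\}) = 30, \quad w(\{\text{Liam}\}) = 10$$

$$w(\{\text{Emma}, \text{You}\}) = 40, \quad w(\{\text{You}, \text{Liam}\}) = 60, \quad w(\{\text{Liam}, \text{Emma}\}) = 0$$

$$w(\{\text{Emma}, \text{You}, \text{Liam}\}) = 120$$

Thus, for a given player $i$, the Shapley value $\varphi_i(v)$ is written as:

$$\varphi_i(v) = \sum_{S \subseteq N,\; i \in S} \frac{w(S)}{|S|}$$

Which, in more compact notation, becomes:

$$\varphi_i(v) = \sum_{i \in S \subseteq N} \frac{w(S)}{|S|}$$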
The last is exactly the formula described on Wikipedia.
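As a worked instance with the lemonade numbers, each player's Shapley value is the sum of every synergy region they belong to, divided by the number of people sharing that region. The symbol $\varphi$ for a player's Shapley value is notation assumed here.

```latex
\begin{align*}
\varphi_{\text{Emma}} &= \frac{20}{1} + \frac{40}{2} + \frac{0}{2} + \frac{120}{3} = 80 \\
\varphi_{\text{You}}  &= \frac{30}{1} + \frac{40}{2} + \frac{60}{2} + \frac{120}{3} = 120 \\
\varphi_{\text{Liam}} &= \frac{10}{1} + \frac{60}{2} + \frac{0}{2} + \frac{120}{3} = 80
\end{align*}
```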
Concluding Notes
Shapley values are the ONLY way to guarantee:
Efficiency — The sum of Shapley values adds up to the total payoff for the full group (in our case, $280).
Symmetry — If two players interact identically with the rest of the group, their Shapley values are equal.
Linearity — If the group runs a lemonade stand on two different days (with different team dynamics on each day), a player’s Shapley value is the sum of their payouts from each day.
Null player — If a player contributes nothing on their own and never affects group dynamics, their Shapley value is 0.
Take a moment to justify these properties visually.
No matter what game you play and who you play with, Shapley values always preserve these natural properties of “fairness”.
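The four properties can also be spot-checked numerically on the lemonade game. The sketch below is illustrative only: the second day's profits (`day2`) and the fourth friend "Noah" are hypothetical additions that do not appear in the story, and `shapley` is a helper name chosen here.

```python
from fractions import Fraction
from itertools import combinations

def shapley(players, v):
    """Split every synergy bonus evenly among the players who share it.
    `v` takes a frozenset of players and returns that team's payoff."""
    synergy = {}
    for size in range(1, len(players) + 1):
        for combo in combinations(players, size):
            team = frozenset(combo)
            # Payoff minus the bonuses already credited to smaller sub-teams.
            synergy[team] = v(team) - sum(
                s for sub, s in synergy.items() if sub < team)
    return {p: sum(Fraction(s, len(t)) for t, s in synergy.items() if p in t)
            for p in players}

FRIENDS = frozenset({"Emma", "You", "Liam"})
PROFITS = {
    frozenset({"Emma"}): 20, frozenset({"You"}): 30, frozenset({"Liam"}): 10,
    frozenset({"Emma", "You"}): 90, frozenset({"You", "Liam"}): 100,
    frozenset({"Liam", "Emma"}): 30, frozenset(FRIENDS): 280,
}

def day1(team):
    # Anyone outside the original trio (the hypothetical Noah) adds nothing.
    return PROFITS.get(team & FRIENDS, 0)

def day2(team):
    # Hypothetical second day in which all three friends are interchangeable.
    return 10 * len(team & FRIENDS) ** 2

friends = ["Emma", "You", "Liam"]
first = shapley(friends, day1)
second = shapley(friends, day2)
both = shapley(friends, lambda team: day1(team) + day2(team))
with_noah = shapley(friends + ["Noah"], day1)

print("Efficiency:", sum(first.values()) == 280)
print("Symmetry:", len(set(second.values())) == 1)
print("Linearity:", all(both[p] == first[p] + second[p] for p in friends))
print("Null player:", with_noah["Noah"] == 0)
```

Exact fractions are used instead of floats so that the equality checks are not disturbed by rounding.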
Hopefully, you have gained some intuition for why Shapley values are “fair” and why they account for interactions among players. Proofs and more rigorous definitions can be found on Wikipedia.
Thanks for reading! :)
This visual reduces mental load, shortens feedback loops, and effectively uses visual intuition.
Before, my understanding of Shapley values mostly came from the desirable properties (listed at the end of this article). But the actual formula itself didn’t have justification in my mind beyond “well, it’s uniquely determined by the desirable properties”. I had seen previous justifications in terms of building up the coalition in a certain sequence, taking the marginal value a player contributes when they join, and then averaging over all the ways the coalition could’ve been built up (like Alice, Bob, then Charlie, or Alice, Charlie, then Bob). However, this is both less intuitively desirable and also harder to compute. The beauty of this explanation is that:
It gives you intuition for the formula, because intuitively the synergy parts of the diagram should be split up among those that caused the synergy.
It allows you to quickly compute the Shapley values for any given scenario.
The computation is clear and simple, so it doesn’t take up as much working memory. You can see what contributes to the final answer. If you modify a scenario, you can see visually what changes, why, and how that changes the answer.
People (who aren’t blind) have a lot of visual reasoning ability, and so more math should be explained in ways that exploit visual understanding.
That second property is rather important, for the same reason that code that compiles in a second allows for faster debugging than code that takes 5 minutes. There are more detailed posts on the subject that I cannot remember right now, but the idea is that by making the feedback quicker, you can more easily experiment and fiddle to understand whatever situation you are dealing with.
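For comparison, the ordering-based justification mentioned earlier (build the team one person at a time, credit each joiner with their marginal profit, then average over every possible order) can be sketched as follows. It uses the lemonade stand numbers from the article; the function name `shapley_by_orderings` is an assumption for illustration.

```python
from fractions import Fraction
from itertools import permutations

PROFITS = {
    frozenset({"Emma"}): 20, frozenset({"You"}): 30, frozenset({"Liam"}): 10,
    frozenset({"Emma", "You"}): 90, frozenset({"You", "Liam"}): 100,
    frozenset({"Liam", "Emma"}): 30,
    frozenset({"Emma", "You", "Liam"}): 280,
}

def v(team):
    # The empty team earns nothing.
    return PROFITS.get(frozenset(team), 0)

def shapley_by_orderings(players):
    """Average each player's marginal contribution over all join orders."""
    totals = {p: Fraction(0) for p in players}
    orders = list(permutations(players))
    for order in orders:
        team = frozenset()
        for p in order:
            # Profit gained at the moment p joins the team built so far.
            totals[p] += v(team | {p}) - v(team)
            team = team | {p}
    return {p: total / len(orders) for p, total in totals.items()}

result = shapley_by_orderings(["Emma", "You", "Liam"])
print({p: int(x) for p, x in result.items()})
```

With three players this loops over 6 orderings, and the count grows factorially with the number of players, which is one way to see why this route is harder to do in your head than slicing the Venn diagram.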
Here are some examples of concepts that came up in a casual conversation about Shapley values that I would’ve had a much harder time understanding without the post’s explanation, to the point that I might not have bothered: (they get a bit into the weeds)
You can replace the assumptions of (null player, symmetry, additivity) in the uniqueness list with the following “equal impact” axiom[1]: The impact to agent i’s share of removing agent j should be the same as the impact to agent j’s share of removing agent i. Leaving aside how you prove uniqueness, let’s see why the Shapley value even has this property. Looking at the picture, this axiom means that if I remove (say) Liam’s circle, the green area (Emma’s share) has lost the same amount as the blue area (Liam’s share) would lose if I removed Emma’s circle.
A property that the Shapley value unintuitively (to me) doesn’t always satisfy is the “stand alone core”[2]:
Define a superadditive game as one where for any two (disjoint) coalitions S and T, the whole S&T together produces at least as much as the sum of what they produce alone. Having all synergies non-negative is enough to guarantee this, though a superadditive game can still contain some negative synergies.
The stand alone core property requires that for a superadditive game, the players in coalition C should together receive a total at least as high as if C worked by themselves (and likewise we require they make at most as much for subadditive games).
This property seems rather desirable, because it’s what’s needed to prevent basically-Nash-equilibria where players merge into coalitions to mutually bargain with the remaining players, but in fact it’s sometimes logically impossible to satisfy at all, no matter what you do. There’s an example in the reference involving software costs, where various software options work for some but not necessarily all of the 3 players and have different prices (so all the numbers are negative). If you work through the example[3] you can see that if two players merged into a single player, they’d get to pay less of the cost, basically because the central region in the Venn diagram gets divvied into halves (1/2 of the cost going to the coalition) instead of thirds (2/3 of the cost going to the two players between them), and so there’s an incentive for any pair to band up and leave the third at home. If the third player is new to the company, the other two are probably going to demand some concessions beyond the Shapley value due to having a BATNA of being together (e.g. in real life Alice and Bob might demand that I pay the entire cost of having to use software that works for all of us).
Notice how playing around with this example requires computing a couple of Shapley values for a couple of games, and that the effect of subcoalitions is visually explained as divvying up the center area into different portions. The example itself tells you interesting things about how people might bargain in the real world.
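Returning to the “equal impact” axiom, it can be checked numerically on the lemonade game from the article: removing one friend should cost the other exactly as much as the reverse removal. This is a sketch under the article's profit table; `shapley` and `impact` are helper names assumed here.

```python
from fractions import Fraction
from itertools import combinations

PROFITS = {
    frozenset({"Emma"}): 20, frozenset({"You"}): 30, frozenset({"Liam"}): 10,
    frozenset({"Emma", "You"}): 90, frozenset({"You", "Liam"}): 100,
    frozenset({"Liam", "Emma"}): 30,
    frozenset({"Emma", "You", "Liam"}): 280,
}
EVERYONE = ["Emma", "You", "Liam"]

def shapley(players):
    """Shapley values of the lemonade game restricted to `players`."""
    synergy = {}
    for size in range(1, len(players) + 1):
        for combo in combinations(players, size):
            team = frozenset(combo)
            synergy[team] = PROFITS[team] - sum(
                s for sub, s in synergy.items() if sub < team)
    return {p: sum(Fraction(s, len(t)) for t, s in synergy.items() if p in t)
            for p in players}

def impact(removed, on):
    """How much `on`'s share drops when `removed` leaves the game."""
    remaining = [p for p in EVERYONE if p != removed]
    return shapley(EVERYONE)[on] - shapley(remaining)[on]

for a, b in combinations(EVERYONE, 2):
    print(a, b, impact(a, b) == impact(b, a))
```

Visually, removing a circle deletes exactly the regions the two players share, and each of those regions was split evenly between them, which is why the two losses match.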
I’ve added the wonderful diagrams to the Wikipedia page, and I hope that more people will think of the Shapley value that way. The picture can be explained to a child, and having it makes complicated properties intuitive. The topic is important, and teaches us about actual bargaining and coordination—so thinking effectively about it matters.
[1] Hervé Moulin (2004). Fair Division and Collective Welfare. Cambridge, Massachusetts: MIT Press. ISBN 9780262134231. pp. 147–156.
[2] Same ref as before, or ibid. as the cool kids say.
[3] For the curious, here it is (my pictures are not included, but you should probably draw them yourself):
Alice, Bob, and Charlie are buying software.
Software W suffices for Alice and Charlie and costs $8
Software X suffices for Bob and Charlie and costs $9
Software Y suffices for Alice and Bob and costs $10
Software Z suffices for everyone and costs $17
So for example, Alice by herself has the best option of W for $8, Bob by himself has X for $9, and Charlie by himself has W for $8.
They together buy software Z, as that works for all of them, and then start bickering about who should pay what.
Let $x_A, x_B, x_C$ be the cost each player pays in the game. The stand alone core is the system of inequalities

$$x_A + x_B \le 10$$

$$x_A + x_C \le 8$$

$$x_B + x_C \le 9$$

Add the three together and divide by 2 to get

$$x_A + x_B + x_C \le 13.5$$
but to foot the whole bill they need to pay at least $17 and ideally don’t charge any more than the actual cost. Thus, contradiction.
Lastly, the Shapley value would charge Alice $5.50, Bob $6.50, and Charlie $5. If Alice and Bob instead played as a single combined player, they would together be charged $9.50, while Charlie would pay $7.50.
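These charges can be reproduced with a short script. The sketch below assumes each coalition simply buys the cheapest single software that works for all of its members, and it models a merged pair as one player made of two people; the names `SOFTWARE`, `cost`, and `shapley` are illustrative.

```python
from fractions import Fraction
from itertools import combinations

# (people the software works for, price), as listed in the example.
SOFTWARE = [
    ({"Alice", "Charlie"}, 8),          # W
    ({"Bob", "Charlie"}, 9),            # X
    ({"Alice", "Bob"}, 10),             # Y
    ({"Alice", "Bob", "Charlie"}, 17),  # Z
]

def cost(people):
    """Cheapest single software that works for everyone in `people`."""
    return min(price for users, price in SOFTWARE if people <= users)

def shapley(players):
    """Cost shares; each player is a frozenset of people acting as one unit."""
    synergy = {}
    for size in range(1, len(players) + 1):
        for combo in combinations(players, size):
            team = frozenset(combo)
            people = frozenset().union(*team)
            synergy[team] = cost(people) - sum(
                s for sub, s in synergy.items() if sub < team)
    return {p: sum(Fraction(s, len(t)) for t, s in synergy.items() if p in t)
            for p in players}

alice, bob, charlie = (frozenset({n}) for n in ("Alice", "Bob", "Charlie"))
separate = shapley([alice, bob, charlie])
merged = shapley([alice | bob, charlie])

print({"+".join(sorted(p)): float(x) for p, x in separate.items()})
print({"+".join(sorted(p)): float(x) for p, x in merged.items()})
# Acting separately, Alice and Bob pay more in total than software Y would cost them.
print(separate[alice] + separate[bob] > cost({"Alice", "Bob"}))
```

The last line shows the stand alone core violation directly: under the three-player Shapley split, Alice and Bob pay $12 between them, which is more than the $10 they would pay alone.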