# Go hungry with Almost Free Lunches

Consider the following game, called “Almost Free Lunches” (EDIT: this seems to be a variant of the traveller’s dilemma). You name any pound-and-pence amount between £0 and £1,000,000; your opponent does likewise. Then you will both get whichever amount named was lowest.

On top of that, the person who named the highest amount must give £0.02 to the other. If you tie, no extra money changes hands.

What’s the Nash equilibrium of this game? Well:

• The only Nash equilibrium of Almost Free Lunches is for both of you to name £0.00.

Proof: Suppose player A has a probability distribution p_A over possible amounts to name, and player B has a probability distribution p_B over possible amounts. Let a be the highest amount such that p_A(a) is non-zero; let b be the same, for p_B. Assume that (p_A, p_B) is a Nash equilibrium.

Assume further that a ≥ b (if that’s not the case, then just switch the labels A and B). Then either b > £0.00 or a = b = £0.00 (and hence both players select £0.00).

We’ll now rule out b > £0.00. If b > £0.00, then player A can improve their score by replacing a with a′ = b − £0.01. To see this, assume that player B has said b′ ≤ b, and player A has said a. If b′ < a′, then player A can say a′ just as well as a: either choice gives them the same amount (namely, b′ − £0.02).

There remain two other cases. If b′ = a′, then a′ is superior to a, tying and getting a′ (rather than a′ − £0.02). And if b′ = b, then A gets a′ + £0.02 = b + £0.01, rather than b (if a = b) or b − £0.02 (if a > b). Since player B names b with non-zero probability, the switch strictly improves player A’s expected payoff.

Finally, if b = £0.00, then player A gets −£0.02 unless they also say £0.00.

Hence if b > £0.00, then (p_A, p_B) cannot be a Nash Equilibrium. Thus b = £0.00, and hence the only Nash Equilibrium is at both players saying £0.00.
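The argument above can be checked numerically. Below is a minimal sketch on a scaled-down board (amounts capped at 50 pence instead of £1,000,000, payoffs in integer pence), with brute force over pure strategies only; the function names `payoff` and `pure_nash_equilibria` are mine, not from the post. The proof above also covers mixed strategies, which this search does not.

```python
def payoff(a, b):
    """Payoffs in pence when the players name a and b pence.

    Both receive the lower amount named; the player who named the
    higher amount then hands 2p to the other. Ties transfer nothing.
    """
    if a == b:
        return a, a
    low = min(a, b)
    return (low + 2, low - 2) if a < b else (low - 2, low + 2)

def pure_nash_equilibria(max_pence):
    """Brute-force all pure-strategy Nash equilibria up to max_pence."""
    amounts = range(max_pence + 1)
    return [
        (a, b)
        for a in amounts
        for b in amounts
        # neither player can gain by unilaterally deviating
        if not any(payoff(x, b)[0] > payoff(a, b)[0] for x in amounts)
        and not any(payoff(a, y)[1] > payoff(a, b)[1] for y in amounts)
    ]

print(pure_nash_equilibria(50))  # [(0, 0)]
```

Any tie above zero is undercut by a penny, and naming more than an opponent’s zero loses 2p, so only (0, 0) survives.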

## Pareto optimal

There are three Pareto-optimal outcomes: (£1,000,000.00, £1,000,000.00), (£1,000,000.01, £999,999.97), and (£999,999.97, £1,000,000.01). All of them are very much above the Nash Equilibrium.
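These three outcomes can be enumerated by brute force on a scaled-down board, capping the amounts at £1.00 (100 pence) instead of £1,000,000, where the analogues are (£1.00, £1.00), (£1.01, £0.97) and (£0.97, £1.01). A sketch (the helper names are mine):

```python
def payoff(a, b):
    """Payoffs in pence: both get the lower amount; the higher namer pays 2p."""
    if a == b:
        return a, a
    low = min(a, b)
    return (low + 2, low - 2) if a < b else (low - 2, low + 2)

def pareto_optimal_outcomes(max_pence):
    """Reachable outcomes not weakly dominated by any other reachable outcome."""
    amounts = range(max_pence + 1)
    outcomes = {payoff(a, b) for a in amounts for b in amounts}
    return sorted(
        o for o in outcomes
        if not any(p != o and p[0] >= o[0] and p[1] >= o[1] for p in outcomes)
    )

print(pareto_optimal_outcomes(100))  # [(97, 101), (100, 100), (101, 97)]
```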

## Minmax and maximin

The minmax and maximin values are also both terrible, and also equal to £0.00. This is not surprising, though, as minmax and maximin implicitly assume the other players are antagonistic to you, and are trying to keep your profits low.
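A sketch of the maximin half of this claim (the best payoff a player can guarantee against a worst-case opponent), again on a board capped at 50 pence; the function names are mine:

```python
def payoff(a, b):
    """Payoffs in pence: both get the lower amount; the higher namer pays 2p."""
    if a == b:
        return a, a
    low = min(a, b)
    return (low + 2, low - 2) if a < b else (low - 2, low + 2)

def maximin(max_pence):
    """Best payoff player 1 can guarantee whatever player 2 names."""
    amounts = range(max_pence + 1)
    return max(min(payoff(a, b)[0] for b in amounts) for a in amounts)

print(maximin(50))  # 0
```

Naming anything above £0.00 risks losing 2p to an opponent who names £0.00, so only naming £0.00 guarantees a non-negative payoff.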

# Arbitrary badness with two options

This shows that choosing the Nash Equilibrium can be worse than almost every other option. We can of course increase the maximal amount, and get the Nash Equilibrium to be arbitrarily worse than any reasonable solution (I would just say either £1,000,000.00 or £999,999.99, and leave it at that).

But we can also make the Nash Equilibrium arbitrarily close to the worst possible outcome, and that without even requiring more than two options for each player.

Assume that there are four ordered amounts of money/utility: n0 < n1 < n2 < n3. Each player can name n1 or n2. Then if they both name the same amount, they get that amount of utility. If they name different ones, then the player naming n1 gets n3, and the player naming n2 gets n0.

By the same argument as above, the only Nash equilibrium is for both to name n1. The maximum possible amount is n3; the maximum they can get if they both coordinate is n2, the Nash equilibrium is n1, and the worst option is n0. We can set n1 = n0 + ε and n2 = n3 − ε for arbitrarily tiny ε, while setting n2 to be larger than n1 by some arbitrarily high amount.
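A minimal sketch of this two-option game, writing the four ordered amounts as n0 < n1 < n2 < n3 (my labels) with illustrative values n0 = 0, n1 = 1, n2 = 10**6, n3 = 10**6 + 1, i.e. ε = 1 while n2 − n1 is huge:

```python
# Illustrative utilities with n0 < n1 < n2 < n3; here epsilon = 1
# and the coordination payoff n2 dwarfs the equilibrium payoff n1.
n0, n1, n2, n3 = 0, 1, 1_000_000, 1_000_001

def payoff(x, y):
    """Each player names n1 or n2: same name gives both that utility;
    otherwise the n1-namer gets n3 and the n2-namer gets n0."""
    if x == y:
        return x, x
    return (n3, n0) if x == n1 else (n0, n3)

options = (n1, n2)
# n1 strictly dominates n2 (the game is symmetric, so this holds for both):
assert all(payoff(n1, y)[0] > payoff(n2, y)[0] for y in options)
# Coordinating on n2 pays n2 each; the dominant-strategy equilibrium pays n1.
print(payoff(n2, n2), payoff(n1, n1))  # (1000000, 1000000) (1, 1)
```

Since n1 strictly dominates n2 for each player, (n1, n1) is the unique Nash equilibrium, a whisker above the worst outcome n0.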

So the situation is as bad as it could possibly be.

Note that this is a variant of the prisoner’s dilemma with different numbers. You could describe it as “Your companion goes to a hideous jail if and only if you defect (and vice versa). Those that don’t defect will also get a dust speck in their eye.”

• Yeah. Usually the centipede game is used to teach this lesson, but your game is also very nice :-)

• Nit: I think this game is more standardly referred to in the literature as the “traveler’s dilemma” (Google seems to return no relevant hits for “almost free lunches” apart from this post).

• That’s useful; I added a link to the other game in the main text (as far as I can tell, I came up with this independently).

• So the situation is as bad as it could possibly be.

You mean, it is as bad as it could possibly be for the Nash equilibrium to be a good strategy and a good predictor in this setup? Yep, absolutely. All models tend to have their domain of validity, and this game shows the limits of the Nash equilibrium model of decision making.

• That is true, but I meant it as “as close as you want to the worst possible outcome, and as far as you want from the best mutual outcome”.

• Unilateral precommitment lets people win at “Almost Free Lunches”.

One way to model precommitment is as a sequential game: first player 1 chooses a number, then player 1 has the option of either showing that number to player 2 or keeping it hidden, then player 2 chooses a number. Optimal play is for player 1 to pick £1,000,000 and show that number, and then for player 2 to choose £999,999.99.

An interesting feature of this is that player 1’s precommitment helped player 2 even more than it helped player 1. Player 1 is “taking one for the team”, but still winning big. This distinguishes it from games like chicken, where precommitment is a threat that allows the precommitter to win the larger share. Though this means that if either player can precommit (rather than one being pre-assigned to go first as player 1) then they’d both prefer to have the other one be the precommitter.

This benefit of precommitment does not extend to the two-option version (n2 vs. n1). In that version, player 2 is incentivized to say “n1” regardless of what player 1 commits to, so unilateral precommitment doesn’t help them avoid the Nash Equilibrium. As in the prisoner’s dilemma.
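The sequential (show-your-number) version can be sketched on a scaled-down board, capped at £10.00 (1,000 pence) rather than £1,000,000; the names `MAX` and `best_response` are mine. Player 2’s best reply to a shown maximum commitment is to undercut by one penny, and player 2 indeed ends up ahead of the committer:

```python
def payoff(a, b):
    """Payoffs in pence: both get the lower amount; the higher namer pays 2p."""
    if a == b:
        return a, a
    low = min(a, b)
    return (low + 2, low - 2) if a < b else (low - 2, low + 2)

MAX = 1_000  # £10.00 cap instead of £1,000,000, to keep the search tiny

def best_response(a):
    """Player 2's best reply once player 1 has committed to (and shown) a."""
    return max(range(MAX + 1), key=lambda b: payoff(a, b)[1])

b = best_response(MAX)
print(b, payoff(MAX, b))  # 999 (997, 1001): player 2 gains more than player 1
```

Tying at the maximum pays player 2 only MAX; undercutting by a penny pays MAX + 1, so the one-penny undercut is the unique best reply.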

• Is there a name for this type of equilibrium, where a player can pre-commit in a way where the best response leaves the first player very well-off, but not quite optimally well-off? What about if it is a mixed strategy (e.g. consider the version of this game where the player who gave the larger number gets paid nothing)?

• I think another key difference between PD and traveller/AFL is that in the PD variant, (n2, n1) is a Pareto outcome: you can’t improve the first player’s outcome without making the second one worse off. However, in the other problem, (0, 0) is very, very far from being Pareto.

• Doesn’t the existence of the rule that says that no money changes hands if there’s a tie alter the incentives? If we both state that we want 1,000,000 pounds, then we both get it and we both walk away happy. What incentive is there for either of the two agents to name a value that is lower than 1,000,000?

• If your strategy remains unchanged, I can change my strategy to “999,999.99 please” and come away with 1,000,000.01 in total, so that’s not a Nash equilibrium.

• I see. And from then on it follows the same pattern as a dollar auction, until the “winning” bid goes to zero.

• The maximum theoretical payout is 1,000,000.01 pounds. And since both players know this, and for any given tie amount a player’s net can be increased by a penny by reducing their bid by a penny, they will recursively calculate down to 0.

Unless they have a theory of mind and can model ways to take risks in order to increase results in the cases where the other player ALSO takes risks. We call this “trust”, and it can be greatly increased with communication, empathy, and side-agreements.

I think it’s long been known that Nash equilibria are not necessarily optimal, only guaranteed that the other player’s choices can’t hurt you. It’s perfectly defensive in adversarial games. This is great for zero-sum games, where the other player’s increase is exactly your decrease. It’s nearly irrelevant (except as a lower bound) for cooperative (variable-sum) games.

• Yup. Though you don’t necessarily need to imagine the money “changing hands”: if both people get paid 2 extra pennies if they tie, and the person who bids less gets paid 4 extra pennies, the result is the same.

The point is exactly what it says in the title. Relative to the maximum cooperative payoff, the actual equilibrium payoff can be arbitrarily small. And as you change the game, the transition from low payoff to high payoff can be sharp: jumping straight from pennies to millions just by changing the payoffs by a few pennies.

• Relative to the maximum cooperative payoff, the actual equilibrium payoff can be arbitrarily small.

Please be careful to specify “Nash equilibrium” here, rather than just “equilibrium”. Nash is not the only possible result that can be obtained, especially if players have some ability to cooperate or to model their opponent in a way where Nash’s conditions don’t hold.

In actual tests (admittedly not very complete, usually done in psychology or economics classes), almost nobody ends up at the Nash equilibrium in this style of game (positive-sum, where altruism or trust can lead one to take risks).

• It checks out empirically. War is a Nash semi-equilibrium: both sides would be better off if they could coordinate an end to it, but they usually can’t (in the near term). If Stanislav Petrov had been a bit less chill, the Cold War would have ended in 1983 in the “everybody launches all the nukes; cockroaches win” Nash equilibrium. If that’s not arbitrarily bad, it’s plenty bad enough.

• I think “everybody launches all the nukes” might not be a Nash Equilibrium.

We can argue that once one side has launched their nukes, the other side does not necessarily have an incentive to retaliate: they won’t really care whether the enemy got nuked or not after they themselves are nuked, and they will probably have an incentive not to launch, to prevent the “everybody dies” outcome, which can be argued to be negative for someone who is about to die.

• It seems to me that both parties to the Cold War favored the defect-defect outcome (launch all the nukes) over the cooperate-defect outcome (we die, they don’t). It’s hard to tell, though, because both sides had an incentive to signal that preference regardless of the truth.

But that’s an extreme case. Any war you choose will have each side choosing between continuing to fight and surrendering. The cooperate-cooperate outcome (making peace in a way that approximates the likely outcome of a war) is probably best for all, but it’s hard to achieve in practice. And it seems to me that at least part of the problem is that, if one side chooses to cooperate (sue for peace and refrain from maximally fighting), they run the risk that the other side will continue to defect (fight) and seize an advantage.

• It is interesting that experimental results of the traveller’s dilemma seem to deviate strongly from the Nash Equilibrium, and in fact land quite close to the Pareto-optimal solution.

This is pretty strange for a game that has only one round and no collusion (you’d expect it to end as Prisoner’s Dilemma, no?)

It is rather different from what we would see in the dollar auction, which has no Nash Equilibrium but always deviates far from the Pareto-optimal solution.

I suspect that this game being one-round-only actually improves the Pareto efficiency of its outcomes:

Maybe if both participants are allowed to change their bid after seeing each other’s bid, they WILL go into a downward spiral, one cent by one cent, until they reach zero or one player gives up at some point with a truce, just like how dollar auctions always stop at some point.

When there is only one round, however, there is no way for a player to make their bid exactly 1 or 2 cents less than the other player’s, and bidding any less than that is suboptimal compared to bidding more than the other player. So perhaps there is an incentive against lowering one’s bid indefinitely to 0 before the game even starts, just like no one would bid $1000 in the dollar auction’s first round.

• you’d expect it to end as Prisoner’s Dilemma, no?

I think a key difference is that in PD, (Defect, Cooperate) is a Pareto outcome (you can’t make it better for the cooperator without making it worse for the defector), while (0, 0) is far from the Pareto boundary. So people can clearly see that naming numbers around 0 is a massive loss, and they focus on avoiding that loss rather than optimising their game vs the other player.

• I haven’t found any information yet, but I suspect there is a mixed Nash somewhere in TD.

• There is no mixed Nash equilibrium in the TD example above (see the proof above).

• Thanks, I forgot the proof before replying to your comment.

You are correct that in PD (D, C) is Pareto, and so the Nash Equilibrium (D, D) is much closer to a Pareto outcome than the Nash Equilibrium (0, 0) of TD is to its Pareto outcomes (somewhere along each person getting a million pounds, give or take a cent).

It’s still strange to see a game with only one round and no collusion land pretty close to the optimal, while its repeated version (the dollar auction) seems to deviate badly from the Pareto outcome.

• It’s still strange to see a game with only one round and no collusion land pretty close to the optimal, while its repeated version (the dollar auction) seems to deviate badly from the Pareto outcome.

It is a bit strange. It seems this is because in the dollar auction, you can always make your position slightly better unilaterally, in a way that will make it worse once the other player reacts. Iterate enough, and all value is destroyed. But in a one-round game, you can’t slide down that path, so you pick by looking at the overall picture.