2014 iterated prisoner’s dilemma tournament results

Fol­lowup to: An­nounc­ing the 2014 pro­gram equil­ibrium iter­ated PD tour­na­men­t

In Au­gust, I an­nounced an iter­ated pris­oner’s dilemma tour­na­ment in which bots can simu­late each other be­fore mak­ing a move. Eleven bots were sub­mit­ted to the tour­na­ment. To­day, I am pleased to an­nounce the fi­nal stand­ings and re­lease the source code and full re­sults.

All of the source code sub­mit­ted by the com­peti­tors and the full re­sults for each match are available here. See here for the full set of rules and tour­na­ment code.

Be­fore we get to the fi­nal re­sults, here’s a quick run­down of the bots that com­peted:

AnderBot

An­derBot fol­lows a sim­ple tit-for-tat-like al­gorithm that es­chews simu­la­tion:

  • On the first turn, Co­op­er­ate.

  • For the next 10 turns, play tit-for-tat.

  • For the rest of the game, Defect with 10% prob­a­bil­ity or Defect if the op­pos­ing bot has defected more times than An­derBot.

CheeseBot

CheeseBot tries sev­eral simu­la­tion-based strate­gies, and uses the first one that ap­plies to the cur­rent situ­a­tion.

  • If the op­po­nent defected on the pre­vi­ous round, Defect.

  • If the op­po­nent does not defect against defec­tBot, Defect.

  • If defect­ing on this round would lead to CheeseBot be­ing pun­ished for it in fu­ture rounds, Co­op­er­ate. CheeseBot checks this by simu­lat­ing fu­ture rounds with a simu­lated his­tory in which it defects on the cur­rent round.

  • If the op­po­nent is a mir­ror-like bot, Co­op­er­ate. To test whether a bot is mir­ror-like, CheeseBot simu­lates the op­po­nent and checks if it defects against Defec­tBot and co­op­er­ates with a bot that plays tit-for-tat but defects against Co­op­er­ateBot and Defec­tBot.

  • If it is the last round, Co­op­er­ate.

  • Defect.

DefectBot

Defect!

This bot was sub­mit­ted pub­li­cly by James Miller.

DMRB

DavidMonRoBot (DMRB) takes a more cau­tious ap­proach to simu­lat­ing: It spends a few hun­dred mil­lisec­onds simu­lat­ing its op­po­nent to figure out what the rest of the round will look like given a Co­op­er­ate or Defect on the cur­rent round, and then picks the out­come that leads to the high­est to­tal num­ber of points for DMRB.

This al­lows DMRB to gauge whether its op­po­nent is “dumb,” i.e. does not pun­ish defec­tors. If the op­po­nent is dumb, DMRB rea­sons that the best move is to defect; oth­er­wise, if DMRB thinks that its op­po­nent will pun­ish defec­tion, it sim­ply plays tit-for-tat. DMRB spends only a small amount of time simu­lat­ing so that other simu­la­tion-based bots will be less likely to have their simu­la­tions time out while simu­lat­ing DMRB.

Pip

This one is a be­he­moth. At al­most 500 very dense lines of Haskell (in­clud­ing com­ments con­tain­ing quotes from “The Quan­tum Thief”), Pip uses a com­plex se­ries of simu­la­tions to clas­sify the op­po­nent into a large set of defined be­hav­iors, such as “Co­op­er­ateOnLastInTheFaceOfCo­op­er­a­tion”, “Un­com­pro­mis­ing” and “Ex­tor­tion­ist.” Then, it builds out a de­ci­sion tree and se­lects the out­come that leads to the high­est score at the end of the match. If I’m be­ing vague here, it’s be­cause the in­ner work­ings of Pip are still mostly a mys­tery to me.

SimHatingTitForTat

SimHat­ingTitForTat, as the name im­plies, plays tit-for-tat and at­tempts to pun­ish bots that use simu­la­tion to ex­ploit it, namely by defect­ing against bots that de­vi­ated from tit-for-tat on any pre­vi­ous round. Its strat­egy is as fol­lows:

  • On the first round, Co­op­er­ate.

  • On ev­ery sub­se­quent round, if the op­po­nent played tit-for-tat on all pre­vi­ous rounds, play tit-for-tat. Other­wise, Defect.

SimpleTFTseerBot

Sim­pleTFTseer uses a mod­ified ver­sion of tit-for-tat that ruth­lessly pun­ishes defec­tors and uses one simu­la­tion per round to look at what its op­po­nent will do on the last round.

  • If ei­ther player defected in a pre­vi­ous round, Defect.

  • If it is the last round, Co­op­er­ate.

  • Other­wise, simu­late my op­po­nent play­ing against me on the last round of this match, as­sum­ing that we both Co­op­er­ate from the cur­rent round un­til the sec­ond-to-last round, and do what­ever my op­po­nent does in that sce­nario. If the simu­la­tion does not ter­mi­nate, Defect.

SwitchBot

SwitchBot also uses a mod­ified tit-for-tat al­gorithm:

  • On the first turn, Co­op­er­ate.

  • On the sec­ond turn, if my op­po­nent Defected on the pre­vi­ous turn, simu­late my op­po­nent play­ing against mir­rorBot, and do what­ever my op­po­nent would do in that sce­nario.

  • Other­wise, play tit-for-tat.

TatForTit

TatForTit fol­lows a com­plex simu­la­tion-based strat­egy:

  • On the first round, do what­ever the op­po­nent would do against TatForTit on the sec­ond round, as­sum­ing both bots co­op­er­ated on the first round.

  • On the sec­ond round, if my op­po­nent defected on the pre­vi­ous round, Defect. Other­wise, do what­ever my op­po­nent would do against me, as­sum­ing they co­op­er­ated on the first round.

  • On all sub­se­quent turns, if my pre­vi­ous move was not the same as my op­po­nents move two turns ago, Defect. Other­wise, do what­ever my op­po­nent would do against me one turn in the fu­ture, as­sum­ing TatForTit re­peats its pre­vi­ous move on the next turn and the op­pos­ing bot co­op­er­ated on the next turn.

TwoFacedTit

TwoFacedTit simu­lates its op­po­nent play­ing against mir­rorBot; if it takes more than 10 mil­lisec­onds to re­spond, TwoFacedTit plays Co­op­er­ate. Other­wise, TwoFacedTit plays tit-for-tat and then Defects on the last round.

VOFB

Like Sim­pleTFTseerBot, VeryOp­por­tunis­ticFarseerBot (VOFB) uses a very ag­gres­sive defec­tion-pun­ish­ment strat­egy: If ei­ther player defected in a pre­vi­ous round, Defect. Other­wise, VOFB simu­lates the next round to de­ter­mine what the op­po­nent will do given a Co­op­er­ate or Defect on the cur­rent round. If the op­po­nent does not pun­ish defec­tion or the simu­la­tion does not ter­mi­nate, VOFB Defects. On the fi­nal round, VOFB uses ad­di­tional simu­la­tions to de­tect whether its op­po­nent defects against back­stab­bers, and if not, plays Defect.

Tour­na­ment results

After 1000 round-robin elimi­na­tion matches, the fi­nal stand­ings are:

1st place (tied): CheeseBot and DavidMonRoBot
3rd place: VeryOp­por­tunis­ticFarseerBot
4th place: TatForTit

The win fre­quen­cies for each bot:

If it were a sport­ing event, this tour­na­ment would not be par­tic­u­larly ex­cit­ing to watch, as most games were nearly iden­ti­cal from start to finish. The graph be­low shows each bot’s fre­quency of sur­viv­ing the first round-robin round:


In other words, the same half of the field con­sis­tently swept the first round; An­derBot, Defec­tBot, Pip, and Sim­pleTFTseer never sur­vived to see a sec­ond round. In gen­eral, this is be­cause high-scor­ing bots al­most always co­op­er­ated with each other (with the oc­ca­sional back­stab at the end of the round), and defected against An­derBot, Defec­tBot, Pop, and Sim­pleTFTseerBot, as these bots ei­ther did not con­sis­tently re­tal­i­ate when defected against or pre-emp­tively defected, trig­ger­ing a chain of mu­tual defec­tions. In­ter­est­ingly, the bots that con­tinued on to the next round did not do so by a large mar­gin:

How­ever, the var­i­ance in these scores was very low, pri­mar­ily due to the re­peated matchups of mostly-de­ter­minis­tic strate­gies con­sis­tently re­sult­ing in the same out­comes.

In ad­di­tion, all of the matches pro­gressed in one of the fol­low­ing 5 ways (hat tip lack­ofcheese):

  • ALL->[Cheese,DMRB,SimHat­ingTFT,Switch,TwoFacedTit,VOFB]->[Cheese,DMRB,VOFB] (963 matches)

    • (Only CheeseBot, DMRB, and VOFB made it into the fi­nal round; all three co­op­er­ated with each other for the en­tire round, re­sult­ing in a three-way tie)

  • ALL->[Cheese,DMRB,SimHat­ingTFT,Switch,TatForTit,TwoFacedTit]->[Cheese,DMRB,TatForTit]->[DMRB,TatForTit]->[TatForTit] (32 matches)

    • (Only TatForTit and DMRB made it into the fi­nal round; both bots co­op­er­ated un­til the sec­ond-to-last turns of each matchup, where TatForTit played Defect while the DMRB played Co­op­er­ate, re­sult­ing in a TatForTit vic­tory)

  • ALL->[Cheese,DMRB,SimHat­ingTFT,Switch,TatForTit,TwoFacedTit,VOFB]->[Cheese,DMRB,SimHat­ingTFT,TwoFacedTit]->[Cheese,DMRB] (3 matches)

    • (Only CheeseBot and DMRB made it into the fi­nal round; both bots co­op­er­ated with each other for the en­tire round, re­sult­ing in a two-way tie)

  • ALL->[Cheese,DMRB,SimHat­ingTFT,Switch,TatForTit,VOFB]->[Cheese,DMRB,SimHat­ingTFT]->[Cheese,DMRB] (1 match)

    • (Only CheeseBot and DMRB made it into the fi­nal round; both bots co­op­er­ated with each other for the en­tire round, re­sult­ing in a two-way tie)

  • ALL->[Cheese,DMRB,SimHat­ingTFT,Switch,TwoFacedTit,VOFB]->[Cheese,DMRB,SimHat­ingTFT]->[Cheese,DMRB] (1 match)

    • (Only CheeseBot and DMRB made it into the fi­nal round; both bots co­op­er­ated with each other for the en­tire round, re­sult­ing in a two-way tie)

This sug­gests that this game does have some kind of equil­ibrium, be­cause these top three bots use very similar strate­gies: Si­mu­late my op­po­nent and figure out if defec­tions will be pun­ished; if so, Co­op­er­ate, and oth­er­wise, defect. This al­lows bots fol­low­ing this strat­egy to always co­op­er­ate with each other, con­sis­tently pro­vid­ing them with a large num­ber of points in ev­ery round, en­sur­ing that they out­com­pete back­stab­bing or other ag­gres­sive strate­gies. In this tour­na­ment, this al­lowed the top three bots to add a guaran­teed 600 points per round, more than enough to con­sis­tently keep them from be­ing elimi­nated.

The tour­na­ment was slightly more in­ter­est­ing (and far more varied) on a matchup-by-matchup ba­sis. Last-round and sec­ond-to-last round defec­tions af­ter mu­tual co­op­er­a­tion were com­mon. TatForTit fre­quently used this tech­nique against VOFB, CheeseBot, DMRB, and vice versa; this tac­tic al­lowed it to steal 32 wins in the fi­nal round. Other bots, par­tic­u­larly An­derBot and Pip, be­haved very differ­ently be­tween matches. Pip, in par­tic­u­lar, some­times co­op­er­ated and some­times defected for long stretches, and An­derBot’s ran­dom­ness also led to er­ratic be­hav­ior. Ul­ti­mately, though, this did not net these bots a large num­ber of points, as their op­po­nents gen­er­ally defected as soon as they stopped co­op­er­at­ing.

For those in­ter­ested in the gritty de­tails, I’ve for­mat­ted the out­put of each match to be hu­man-read­able, so you can eas­ily read through the play-by-play of each match (and hope­fully get some en­joy­ment out of it, as well). See the github repo for the full logs.

Postscript

In the course of run­ning the tour­na­ment, I re­ceived a num­ber of sug­ges­tions and idea for how things could be im­proved; some of these ideas in­clude:

  • Ran­dom-length matches in­stead of fixed-length matches.

  • Straight elimi­na­tion rather than round-robin elimi­na­tion.

  • More “can­non-fod­der” bots in­cluded in the tour­na­ment by de­fault, such as copies of co­op­er­ateBot, defec­tBot, and tit-for-tat.

  • A Quick­Check-based test suite that al­lows bot writ­ers to more eas­ily test prop­er­ties of their bot while de­vel­op­ing, such as “co­op­er­ates with co­op­er­ateBot” or “co­op­er­ates if no one has defected so far.”


If any­one would like me to run this tour­na­ment again at some un­speci­fied time in the fu­ture, with or with­out mod­ifi­ca­tions, feel free to let me know in the com­ments. If you would like to fork the pro­ject and run it on your own, you are more than wel­come to do so.

Many thanks to ev­ery­one who par­ti­ci­pated!