# The Pavlov Strategy

Epistemic Status: Common knowledge, just not to me

The Evolution of Trust is a deceptively friendly little interactive game. Near the end, there’s a “sandbox” evolutionary game theory simulator. It’s pretty flexible. You can do quick experiments in it without writing code. I highly recommend playing around.

One of the things that surprised me was a strategy the game calls Simpleton, also known in the literature as Pavlov. In certain conditions, it works pretty well — even better than tit-for-tat or tit-for-tat with forgiveness.

Let’s set the framework first. You have a Prisoner’s Dilemma type game.

• If both parties cooperate, they each get +2 points.

• If one cooperates and the other defects, the defector gets +3 points and the cooperator gets −1 point.

• If both defect, both get 0 points.

This game is iterated — you’re randomly assigned to a partner and you play many rounds. Games with more rounds reward more cooperative strategies; games with fewer rounds reward more defection.

It’s also evolutionary — you have a population of bots, each playing its own strategy, and after each round the bots with the most points replicate while the bots with the fewest points die out. In other words, this is the Darwin Game.

Finally, it’s stochastic — there’s a small probability that any bot will make a mistake and cooperate or defect at random.
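As a concrete reference, the setup above can be sketched in a few lines (the 5% error rate is an illustrative choice of mine; the game only says the probability is small):

```python
import random

# Payoff table from the post: points for (my move, their move),
# where 'C' = cooperate and 'D' = defect.
PAYOFF = {('C', 'C'): 2, ('C', 'D'): -1, ('D', 'C'): 3, ('D', 'D'): 0}

def noisy(move, error_rate=0.05):
    """Stochasticity: with small probability, a bot slips and plays
    the opposite of what its strategy chose."""
    if random.random() < error_rate:
        return 'D' if move == 'C' else 'C'
    return move
```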

Now, how does Pavlov work?

Pavlov starts off cooperating. If the other player cooperates with Pavlov, Pavlov keeps doing whatever it’s doing, even if it was a mistake; if the other player defects, Pavlov switches its behavior, even if it was a mistake.

In other words, Pavlov:

• cooperates when you cooperate with it, except by mistake

• “pushes boundaries” and keeps defecting when you cooperate, until you retaliate

• “concedes when punished” and cooperates after a defect/defect result

• “retaliates against unprovoked aggression”, defecting if you defect on it while it cooperates.
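The four bullets above collapse into a single rule, often summarized as “win-stay, lose-shift.” A minimal sketch (the function name and move encoding are my own):

```python
def pavlov(my_last, their_last):
    """Win-stay, lose-shift: if the other player cooperated, repeat your
    own last move (keep cooperating, or keep pushing boundaries); if the
    other player defected, switch your move (retaliate, or concede)."""
    if their_last == 'C':
        return my_last
    return 'D' if my_last == 'C' else 'C'
```

Note that `pavlov('D', 'D')` returns `'C'` (conceding when punished), while `pavlov('C', 'D')` returns `'D'` (retaliating against unprovoked aggression).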

If there’s any randomness, Pavlov is better at cooperating with itself than Tit-For-Tat. One accidental defection and two Tit-For-Tats are stuck in an eternal defect cycle, while Pavlovs forgive each other and wind up back in a cooperate/cooperate pattern.

Moreover, Pavlov can exploit CooperateBot (if it defects by accident, it will keep greedily defecting against the hapless CooperateBot, while Tit-For-Tat will not) but still exerts some pressure against DefectBot (defecting against it half the time, compared to Tit-For-Tat’s consistent defection).

The interesting thing is that Pavlov can beat Tit-For-Tat or Tit-For-Tat-With-Forgiveness in a wide variety of scenarios.

If there are only Pavlov and Tit-For-Tat bots, Tit-For-Tat has to start out outnumbering Pavlov quite significantly in order to win. The same is true for a population of Pavlov and Tit-For-Tat-With-Forgiveness. It doesn’t change if we add in some Cooperators or Defectors either.

Why?

Compared to Tit-For-Tat, Pavlov cooperates better with itself. If two Tit-For-Tat bots are paired, and one of them accidentally defects, they’ll be stuck in a mutual defection equilibrium. However, if one Pavlov bot accidentally defects against its clone, we’ll see

C/D → D/D → C/C

which recovers a mutual-cooperation equilibrium and picks up more points.
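This recovery can be verified by replaying the win-stay-lose-shift rule from the mistaken round (a sketch; the encoding is mine):

```python
def pavlov(my_last, their_last):
    # Win-stay, lose-shift: keep your move if they cooperated,
    # switch it if they defected.
    if their_last == 'C':
        return my_last
    return 'D' if my_last == 'C' else 'C'

# One Pavlov (a) has just accidentally defected against its clone (b).
a, b = 'D', 'C'
history = [(a, b)]
for _ in range(3):
    # Both players move simultaneously, each seeing the other's last move.
    a, b = pavlov(a, b), pavlov(b, a)
    history.append((a, b))

print(history)  # [('D', 'C'), ('D', 'D'), ('C', 'C'), ('C', 'C')]
```

The D/D round acts as the mutual punishment; after it, both bots switch back to cooperation and stay there.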

Compared to Tit-For-Tat-With-Forgiveness, Pavlov cooperates *worse* with itself (it takes longer to recover from mistakes) but it “exploits” TFTWF’s patience better. If Pavlov accidentally defects against TFTWF, the result is

D/C → D/C → D/D → C/D → D/D → C/C,

which leaves Pavlov with a net gain of 1 point per turn (over the first five turns, before a cooperative equilibrium is restored), compared to TFTWF’s 1/5 point per turn.

If TFTWF accidentally defects against Pavlov, the result is

C/D → D/C → D/C → D/D → C/D

which cycles eternally (until the next mistake), getting Pavlov an average of 5/4 points per turn, compared to TFTWF’s 1/4 point per turn.
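Both sets of per-turn averages can be checked directly against the payoff table; here is the arithmetic spelled out (sequences written from Pavlov’s side; the helper name is mine):

```python
# Points for the first-listed player for each (me, them) move pair.
PAYOFF = {('C', 'C'): 2, ('C', 'D'): -1, ('D', 'C'): 3, ('D', 'D'): 0}

def avg_points(pairs):
    """Average points per turn for the first-listed player."""
    return sum(PAYOFF[p] for p in pairs) / len(pairs)

# Pavlov accidentally defects against TFTWF: the five turns before C/C.
pavlov_view = [('D', 'C'), ('D', 'C'), ('D', 'D'), ('C', 'D'), ('D', 'D')]
tftwf_view = [(b, a) for a, b in pavlov_view]
assert avg_points(pavlov_view) == 1.0   # Pavlov: 1 point per turn
assert avg_points(tftwf_view) == 0.2    # TFTWF: 1/5 point per turn

# TFTWF accidentally defects against Pavlov: the four-state cycle that
# repeats until the next mistake, again from Pavlov's side.
cycle = [('C', 'D'), ('D', 'C'), ('D', 'C'), ('D', 'D')]
assert avg_points(cycle) == 1.25                       # Pavlov: 5/4
assert avg_points([(b, a) for a, b in cycle]) == 0.25  # TFTWF: 1/4
```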

Either way, Pavlov eventually overtakes TFTWF.

If you add enough DefectBots to a mix of Pavlovs and TFTs (and it has to be a large majority of the total population), TFT can win, because it’s more resistant to DefectBots than Pavlov is. Pavlov cooperates with DefectBots half the time; TFT never does, except by mistake.
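These matchup claims are easy to check with a small head-to-head simulator (the strategy names, round count, and noise level are my own illustrative choices):

```python
import random

PAYOFF = {('C', 'C'): 2, ('C', 'D'): -1, ('D', 'C'): 3, ('D', 'D'): 0}

def pavlov(me, them):     # win-stay, lose-shift
    return me if them == 'C' else ('D' if me == 'C' else 'C')

def tft(me, them):        # tit-for-tat: copy the opponent's last move
    return them

def defectbot(me, them):  # always defect
    return 'D'

def play(strat_a, strat_b, rounds=200, noise=0.0, seed=0):
    """Average points per round for player A over an iterated match."""
    rng = random.Random(seed)
    a, b, score = 'C', 'C', 0  # both treated as having cooperated "last round"
    for _ in range(rounds):
        na, nb = strat_a(a, b), strat_b(b, a)
        if rng.random() < noise:           # occasional mistakes
            na = 'D' if na == 'C' else 'C'
        if rng.random() < noise:
            nb = 'D' if nb == 'C' else 'C'
        a, b = na, nb
        score += PAYOFF[(a, b)]
    return score / rounds
```

With `noise=0`, `play(pavlov, defectbot)` comes out to −0.5 per round, because Pavlov cooperates with DefectBot every other turn, while `play(tft, defectbot)` loses only the single opening round.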

Pavlov isn’t perfect, but it performs well enough to hold its own in a variety of circumstances. An adapted version of Pavlov won the 2005 iterated game theory tournament.

Why, then, don’t we actually talk about it the way we talk about Tit-For-Tat? If it’s true that moral maxims like the Golden Rule emerge out of the fact that Tit-For-Tat is an effective strategy, why aren’t there moral maxims that exemplify the Pavlov strategy? Why hadn’t I even heard of Pavlov until now, despite having taken a game theory course once, when everybody has heard of Tit-For-Tat and has an intuitive feeling for how it works?

In Wedekind and Milinski’s 1996 experiment with human subjects playing an iterated Prisoner’s Dilemma, a full 70% of them engaged in Pavlov-like strategies. The human Pavlovians were smarter than a pure Pavlov strategy — they eventually recognized the DefectBots and stopped cooperating with them, while a pure-Pavlov strategy never would — but, just like Pavlov, the humans kept “pushing boundaries” when unopposed.

Moreover, humans basically divided themselves into Pavlovians and Tit-For-Tat-ers; they didn’t switch strategies between game conditions where one strategy or another was superior, but just played the same way each time.

In other words, it seems fairly likely not only that Pavlov performs well in computer simulations, but also that humans have some intuitive model of Pavlov.

Human players are more likely to use generous Tit-For-Tat strategies rather than Pavlov when they have to play a working-memory game at the same time as they’re playing iterated Prisoner’s Dilemma. In other words, Pavlov is probably more costly in working memory than generous Tit-For-Tat.

If you look at all 16 theoretically possible deterministic strategies that only have memory of the previous round, and let them evolve, evolutionary dynamics can wind up quite complex and oscillatory.

A population of TFT players will be invaded by more “forgiving” strategies like Pavlov, which in turn can be invaded by DefectBot and other uncooperative strategies, which again can be invaded by TFT, which thrives in high-defection environments. If you track the overall rate of cooperation over time, you get very regular oscillations, though these are quite sensitive to variation in the error and mutation rates, and nonperiodic (chaotic) behavior can occur in some regimes.

This is strangely reminiscent of Peter Turchin’s theory of secular cycles in history. Periods of peace and prosperity alternate with periods of conflict and poverty; empires rise and fall. Periods of low cooperation happen at the fall of an empire/state/civilization; this enables new empires to rise when a subgroup has a better ability to cooperate with itself and fight off its enemies than the surrounding warring peoples; but in peacetime, at the height of an empire, more forgiving and exploitative strategies like Pavlov can emerge, which are themselves vulnerable to the barbaric defectors. This is a vastly simplified story compared to the actual mathematical dynamics or the actual history, of course, but it’s an illustrative gist.

The big takeaway from learning about evolutionary game theory is that it’s genuinely complicated from a player’s perspective.

“It’s complicated” sometimes functions as a curiosity-stopper: you conclude “more research is needed” instead of looking at the data you have and drawing preliminary conclusions, as a way to protect your intellectual “territory” without putting yourself out of a job.

That isn’t the kind of “complexity” I’m talking about here. Chaos in dynamical systems has a specific meaning: the system is so sensitive to initial conditions that even a small measurement error in determining where it starts means you cannot even approximately predict where it will end up.

“Chaos: When the present determines the future, but the approximate present does not approximately determine the future.”

Optimal strategy depends sensitively on who else is in the population, how many errors you make, and how likely strategies are to change (or enter or leave the population). There are a lot of moving parts here.

• I generally appreciate posts that help me understand game theory. I appreciate this post, as well as its follow-up, for clearly explaining a bunch of building blocks that, even if a bit “spherical cow” simplistic, help give my real-world coordination some gears.

But I think this particular post was useful because of its surprisingness – many game-theory posts roughly reinforce what I already knew: you can win at coordination through punishment, or cooperating with allies, etc. The Pavlov algorithm was something I couldn’t pattern-match as easily to a known social strategy. It felt a bit alien, and I think this gave me a bit of “original seeing” that I’ve found subjectively helpful.

I’m not sure this has actually borne fruit yet – I don’t think I’ve yet really made decisions differently because of this. But it feels like an important piece of the overall puzzle of coordination.

• This post bridges two domains, game theory and reinforcement learning, which I previously thought of as mostly separate; and it caused a pretty big shift in my model of how intelligence-in-general works, since this is much simpler than my previous simplest model of how reinforcement learning would do game theory.

• It was insightful for me and helped me understand my failures in business.

• helped to understand my failures in business.

I’m interested in hearing details about this if you’re up for sharing! My own take on Pavlov was “huh, interesting, let’s keep an eye out for situations this seems relevant to”, but most of the updates were on the meta level. Curious if it was more directly applicable to you.

• I continued to work with a partner who cheated on me without punishing him, and the partner cheated even more.

• Huh, interesting paper. That’s 1993 – is there a more modern version with more stochastic parameters explored? Seems like an easy paper if not.

I’m also reminded of how computer scientists often end up doing simulations rather than basic math. This seems like a complicated system of equations, but maybe you could work out its properties with a couple of hours and basic nonlinear dynamics knowledge.

• Promoted to curated: I think this post both communicates a set of important concepts and builds on past concepts that have been discussed on LessWrong. I think this kind of game theory is pretty valuable as an intuition pump for a large variety of environments and problems.

Presentation-wise, I think the sequences of agent behaviors were a bit hard for me to follow, and while I think I was still able to follow the core points without them, finding some way to make them more intuitive or annotate them more would be valuable.

• If Pavlov accidentally defects against TFTWF, the result is
D/C → D/C → D/D → C/D → D/D → C/C,

Can you explain this sequence? I’m puzzled by it, as it doesn’t follow the definitions that I know. My understanding of TFTWF is that it is “Tit for Tat with a small randomized possibility of forgiving a defector by cooperating anyway.” What seems to be happening in the above sequence is Pavlov on the left and, on the right, TFT with a delay of 1.

• I think what’s being called “TFTWF” here is what some other places call “Tit for Two Tats”, that is, it defects in response to two defections in a row.

• But wouldn’t the sequence then look like this?

D/C → D/C → D/D → C/D → D/C

and continue like this forever.

Why does TFTWF defect against C? What’s the forgiveness there?

• Yeah, I think you’re right.* So it actually looks the same as the “TFTWF accidentally defects” case.

*assuming we specify TFTWF as “defect against DD, cooperate otherwise”. I don’t see a reasonable alternate definition. I think you’re right that defecting against DC is bad, and if we go to 3-memory, defecting against DDC while cooperating with DCD seems bad too.** Sarah can’t be assuming the latter, anyway, because the “TFTWF accidentally defects” case would look different.

**there might be some fairly reasonably-behaved variant that’s like “defect if >=2 of 3 past moves were D”, but that seems like a) probably bad since I just made it up and b) not what’s being discussed here.

The link for “Chaos: When the present determines the future, but the approximate present does not approximately determine the future.” appears to not be working.

• Is the “Chaos” part meant to be a link? It doesn’t seem to go anywhere.