Based on the quote from Jessica Taylor, it seems like the FDT agents are trying to maximize their long-term share of the population rather than their absolute payoffs in a single generation? If I understand the model correctly, that means the FDT agents should try to maximize the ratio of FDT payoff to 9-bot payoff (which maximizes the FDT:9-bot ratio in the next generation). The algebra then shows that they should refuse to submit to 9-bots once the population share of FDT agents gets high enough (Wolfram|Alpha link), without needing to drop the random-encounters assumption.
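One way to sanity-check this kind of algebra is to write out the one-generation replicator update and sweep the submission probability. The payoff numbers below are my assumptions for the usual $10-splitting setup (FDT agents split evenly with each other for 5 each; submitting to a 9-bot yields 1 vs. 9; refusal, and 9-bot vs. 9-bot matches, yield 0), not taken from the original model, so treat this as a sketch rather than a reproduction of the linked algebra:

```python
# Sketch of a one-generation replicator update for the FDT vs. 9-bot
# population game, under assumed payoffs (standard $10 split):
#   FDT vs FDT: 5 each; FDT submits to 9-bot: 1 vs 9;
#   refusal, or 9-bot vs 9-bot: 0 each. Encounters are random (well-mixed).

def expected_payoffs(p, q):
    """p: current FDT population share; q: probability an FDT agent
    submits when matched against a 9-bot. Returns (FDT, 9-bot) payoffs."""
    u_fdt = p * 5 + (1 - p) * (q * 1)    # vs. other FDT, vs. 9-bot
    u_9bot = p * (q * 9) + (1 - p) * 0   # vs. FDT, vs. other 9-bot
    return u_fdt, u_9bot

def next_share(p, q):
    """FDT share of the next generation, with offspring proportional
    to expected payoff (standard replicator dynamics)."""
    u_fdt, u_9bot = expected_payoffs(p, q)
    total = p * u_fdt + (1 - p) * u_9bot
    return p * u_fdt / total if total > 0 else p

if __name__ == "__main__":
    for p in (0.1, 0.5, 0.9):
        for q in (0.0, 0.5, 1.0):
            print(f"p={p:.1f} q={q:.1f} -> next FDT share {next_share(p, q):.3f}")
```

Sweeping `q` at a fixed `p` makes it easy to see which submission policy maximizes the next-generation share under these assumed payoffs, and to check where (if anywhere) the optimum flips as `p` grows.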
It still seems like CDT agents would behave the same way given the same goals, though?
Apparently an LW user did a series of interviews with AI researchers in 2011, some of which included a similar question. I know most LW users have probably seen this, but I only found it today and thought it was worth flagging here.