A Paradox in Timeless Decision Theory

I’m putting this in the discussion section because I’m not sure whether something like this has already been thought of, and I don’t want to repeat things in a top-level post.

Anyway, consider a Prisoner’s-Dilemma-like situation with the following payoff matrix (payoffs are yours; the game is symmetric):
You defect, opponent defects: 0 utils
You defect, opponent cooperates: 3 utils
You cooperate, opponent defects: 1 util
You cooperate, opponent cooperates: 2 utils
Assume all players either have full information about their opponents, or are allowed to communicate and will be able to deduce each other’s strategy correctly.

Suppose you are a timeless decision theory agent playing this modified Prisoner’s Dilemma with an actor that will always pick “defect” no matter what your strategy is. Clearly, your best move is to cooperate, gaining you 1 util instead of no utility, and giving your opponent his maximum 3 utils instead of the no utility he would get if you defected. Now suppose you are playing against another timeless decision theory agent. Clearly, the best strategy is to be that actor which defects no matter what: your opponent, reasoning as above, will then cooperate and hand you 3 utils. But if both agents adopt this strategy, the worst possible result for both of them occurs.
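To make the arithmetic explicit, here is a minimal Python sketch of those two matchups. It is just bookkeeping over the payoff matrix above, not a model of TDT reasoning, and the names (PAYOFF, best_reply) are only illustrative:

```python
# Payoffs from your perspective: PAYOFF[(your_move, opponent_move)].
# The game is symmetric, so the opponent scores PAYOFF[(opponent_move, your_move)].
PAYOFF = {
    ("D", "D"): 0,
    ("D", "C"): 3,
    ("C", "D"): 1,
    ("C", "C"): 2,
}

def best_reply(opponent_move):
    """Move that maximizes your payoff against a fixed opponent move."""
    return max("CD", key=lambda my_move: PAYOFF[(my_move, opponent_move)])

# Matchup 1: the opponent defects unconditionally.
move = best_reply("D")
print(move, PAYOFF[(move, "D")], PAYOFF[("D", move)])
# -> C 1 3: you cooperate for 1 util, and the committed defector collects 3.

# Matchup 2: both agents commit to unconditional defection.
print(PAYOFF[("D", "D")], PAYOFF[("D", "D")])
# -> 0 0: the worst joint outcome in the matrix.
```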

This situation can actually happen in the real world. Suppose there are two rival countries, and one demands some tribute or concession from the other and threatens war if the other country does not agree, even though such a war would be very costly for both countries. The rulers of the threatened country can either pay the less expensive tribute or refuse and accept a more expensive war, in the hope that the first country will back off, but the rulers of the first country have thought of that and have committed not to back down. If the tribute is worth 1 util to each side and a war costs 2 utils to each side (measured against a peaceful, no-tribute status quo worth 2 utils), this is identical to the payoff matrix I described. I’d be pretty surprised if nothing like this has ever happened.
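As a sanity check on that equivalence, here is a short Python sketch that rebuilds the matrix from the story’s numbers, under my added assumption that the peaceful, no-tribute status quo is worth 2 utils to each side; the constant names are just illustrative:

```python
STATUS_QUO = 2   # assumed baseline value of peace with no tribute, per side
TRIBUTE = 1      # utils transferred from the side that backs down to the side that stands firm
WAR_COST = 2     # utils lost by each side if both stand firm and war breaks out

def country_payoff(my_move, their_move):
    """'D' = stand firm on the demand/refusal, 'C' = back down (pay, or drop the demand)."""
    if my_move == "D" and their_move == "D":
        return STATUS_QUO - WAR_COST   # war: 2 - 2 = 0
    if my_move == "D" and their_move == "C":
        return STATUS_QUO + TRIBUTE    # extract tribute: 2 + 1 = 3
    if my_move == "C" and their_move == "D":
        return STATUS_QUO - TRIBUTE    # pay tribute: 2 - 1 = 1
    return STATUS_QUO                  # both back down: 2

expected = {("D", "D"): 0, ("D", "C"): 3, ("C", "D"): 1, ("C", "C"): 2}
rebuilt = {moves: country_payoff(*moves) for moves in expected}
assert rebuilt == expected  # the war/tribute story reproduces the original matrix
```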