...”If agents know a good amount about each other due to access to source code or other such means, then a single-shot game has the character of iterated game theory. How can we capture that in our rationality criteria?”

I would study this problem by considering a population of learning agents that are sequentially paired up for one-shot games where the source code of each is revealed to the other. This way they can learn how to reason about source codes. In contrast, the model where the agents try to formally prove theorems seems poorly motivated to me.

Actually, here’s another thought about this. Consider the following game: each player submits a program, then the programs are given each other as inputs, and executed in anytime mode (i.e. at given moment each program has to have some answer ready). The execution has some probability ϵ to terminate on each time step, so that the total execution time follows the geometric distribution. Once execution is finished, the output of the programs is interpreted as strategies in a normal-form game and the payoffs are accordingly.

This seems quite similar to an iterated game with geometric time discount! In particular, in this version of the Prisoner’s Dilemma, there is the following analogue of the tit-for-tat strategy: start by setting the answer to C, simulate the other agent, once the other agents starts producing answers, set your answer to equal to the other agent’s answer. For sufficiently shallow time discount, this is a Nash equilibrium.

In line with the idea I explained in a previous comment, it seems tempting to look for proper/thermal equilibria in this game with some constraint on the programs. One constraint that seems appealing is: force the program to have O(1) space complexity modulo oracle access to the other program. It is easy to see that a pair of such programs can be executed using O(1) space complexity as well.

Actually, here’s another thought about this. Consider the following game: each player submits a program, then the programs are given each other as inputs, and executed in anytime mode (i.e. at given moment each program has to have some answer ready). The execution has some probability ϵ to terminate on each time step, so that the total execution time follows the geometric distribution. Once execution is finished, the output of the programs is interpreted as strategies in a normal-form game and the payoffs are accordingly.

This seems quite similar to an iterated game with geometric time discount! In particular, in this version of the Prisoner’s Dilemma, there is the following analogue of the tit-for-tat strategy: start by setting the answer to C, simulate the other agent, once the other agents starts producing answers, set your answer to equal to the other agent’s answer. For sufficiently shallow time discount, this is a Nash equilibrium.

In line with the idea I explained in a previous comment, it seems tempting to look for proper/thermal equilibria in this game with some constraint on the programs. One constraint that seems appealing is: force the program to have O(1) space complexity modulo oracle access to the other program. It is easy to see that a pair of such programs can be executed using O(1) space complexity as well.