Or if your knowledge of the environment does helpful randomization for you (if you’re not >99% sure your two copies will take the same action), CDT’ll at least press the button. But yeah, interesting problem.
Is the correct policy an equilibrium? Suppose the payoff was $5, not $1000. If you all press with probability P, you get 0 with probability (1-P)^3, −1 with probability 3P(1-P)^2, 3 with probability 3P^2(1-P), and 2 with probability P^3. The optimal P is 0.8873, for an expected payoff of 2.162.
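A quick numerical check of that optimum, as a minimal sketch: the function name `payoff_all` is made up for illustration, and the payoffs (0, −1, 3, 2 for zero to three presses) are taken straight from the comment above. Expanding the expectation gives -3P + 15P^2 - 10P^3, so the derivative vanishes where P^2 - P + 0.1 = 0.

```python
import math

def payoff_all(p):
    """Expected payoff when all three copies press independently with probability p."""
    one_press = 3 * p * (1 - p) ** 2       # exactly one press: payoff -1
    two_presses = 3 * p ** 2 * (1 - p)     # exactly two presses: payoff 3
    three_presses = p ** 3                 # all three press: payoff 2
    return -1 * one_press + 3 * two_presses + 2 * three_presses

# Larger root of p^2 - p + 0.1 = 0 (the smaller root is a local minimum).
p_opt = (1 + math.sqrt(0.6)) / 2

print(round(p_opt, 4), round(payoff_all(p_opt), 3))  # 0.8873 2.162
```

The other root, about 0.1127, gives a negative expected payoff, so it is the one to discard.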
Now suppose you know your two copies are pressing the button with P=0.8873, and you press with probability Q. You get 0 with probability (1-P)^2(1-Q), −1 with probability 2P(1-P)(1-Q) + (1-P)^2Q, 3 with probability 2P(1-P)Q + P^2(1-Q), and 2 with probability P^2Q. Optimal Q is 0. If you never press the button, you get −1 with probability 2*0.8873*(1-0.8873) and 3 with probability 0.8873^2, which is 2.262.
So if you know your copies are playing the optimal policy for three, you shouldn’t press the button :D
I think if the others play with the optimal probability P, every value of Q is equally good.
Not sure if this is a typo, but I get 2*0.8873*(1-0.8873)*(-1) + 0.8873^2*3 = 2.162
That is the same payoff as if you play Q=P, which supports the claim that every value of Q is equally good.
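Here is a rough sketch that checks this numerically, assuming the same payoffs as above (0, −1, 3, 2 for zero to three presses); `payoff_mixed` is just an illustrative name. Collecting the Q terms in the expectation leaves a coefficient of 10P - 10P^2 - 1, which is zero exactly at the optimal P, so Q should drop out there.

```python
import math

def payoff_mixed(p, q):
    """Expected payoff when two copies press with probability p and you press with q."""
    one_press = 2 * p * (1 - p) * (1 - q) + (1 - p) ** 2 * q
    two_presses = 2 * p * (1 - p) * q + p ** 2 * (1 - q)
    three_presses = p ** 2 * q
    return -1 * one_press + 3 * two_presses + 2 * three_presses

# Optimal symmetric probability: larger root of p^2 - p + 0.1 = 0.
p = (1 + math.sqrt(0.6)) / 2

for q in (0.0, 0.5, p, 1.0):
    print(round(q, 4), round(payoff_mixed(p, q), 3))  # 2.162 for every q
```

For any other P the choice of Q does matter. With P=0.5, for example, going from Q=0 to Q=1 raises the expected payoff by 1.5.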
I can’t check today, but whoops, sorry if I typoed the equation at some step.