I want to know the Nash equilibrium of the following two player game between players Alice and Bob.
Alice chooses the rates of two Poisson processes; we call these rates lam_1 and lam_2.
Bob needs to run each of these processes for 0.5 seconds. Bob can run each of the processes in “shielded” mode or “unshielded” mode, but can only afford to use t seconds of shielded mode in total between the two processes. (So Bob chooses t1 and t2 such that t1+t2=t.)
The game goes as follows:
We run process 1 for time t1 and process 2 for time t2, in shielded mode. If either event occurs during these time periods, Bob wins.
Then we run process 1 for time 0.5−t1 and process 2 for time 0.25−t2, in unshielded mode. If either event occurs during these time periods, Alice wins.
If the event never occurred, Bob wins.
Here’s a Sympy implementation of this game (where I parameterize with t1=pt and t2=(1−p)t.)
The expression for P(Alice wins) is e−λ1pt+λ2pt−λ2t−e−λ12+λ2pt2−λ2t2−λ24.
Again, I want the Nash equilibrium.
(This is a model for a scenario where you’re trying to decide how much of your oversight budget to spend overseeing a model when it’s doing actions that are good opportunities for attempting to cause a catastrophe, vs actions which are bad opportunities for attempting a catastrophe.)
I am going to assume that in the code, when calculating p_alice_win_given_not_caught, we do not divide the λ2 term by two (since this is not that consistent with the description. I am also assuming that 0.25−t2 is a typo and 0.5−t2 is meant, which would also be more consistent with other stuff).
So I am going to assume assume a symmetrical version.
Here, P(Alice wins) is e−λ1t1−λ2t2−e−λ1/2−λ2/2=:f(λ1,λ2,t1,t2).
Wlog we can assume 0<t<1 (otherwise Bob will run everything or nothing in shielded mode).
We claim that ((λ∗1,λ∗2),(t∗1,t∗2)):=((μ∗,μ∗),(t/2,t/2) is a (pure) Nash equilibrium,
where μ∗=logt/(t−1).
To verify, lets first show that Alice cannot make a better choice if Bob plays (t∗1,t∗2)=(t/2,t/2).
We have f(λ1,λ2,t∗1,t∗2)=e−(λ1+λ2)t/2−e−(λ1+λ2)/2.
Since this only depends on the sum, we can make the substitution μ=(λ1+λ2)/2.
Thus, we want to maximize ~f(μ,t)=e−μt−e−μ.
We have ~f′(μ,t)=−te−μt+e−μ=0.
Rearranging, we get te−μt=e−μ.
Taking logs, we get logt−μt=−μ.
Rearranging, we get μ=logt/(t−1).
Thus, μ=μ∗ is the optimal choice.
This means, that if Bob sticks to his strategy, Alice cannot do better than f(μ∗,μ∗,t/2,t/2).
Now, lets show that Bob cannot do better.
We have f(λ∗1,λ∗2,t1,t2)=e−μ∗t1−μ∗t2−e−μ∗/2−μ∗/2=e−μ∗t−e−μ∗.
This does not depend on t1 and t2 anymore, so any choice of t1 and t2
is optimal if Alice plays (μ∗,μ∗).
(If I picked the wrong version of the question, and you actually want some symmetry: I suspect that the solution will have similarities, or that in some cases the solution can be obtained by rescaling the problem back into a more symmetric form.)
Thanks heaps! I wanted the asymmetric version but concurrently with your comment I figured out how to convert from the asymmetric version to the symmetric version.
I’ll credit you when I post the writeup that this is a part of, and I might run a draft by you in case you want to point out any errors. :)
My investigations with Code Interpreter suggest that the Nash equilbria have p = 0.5, and the lambdas depend on t in a concave way (for 0 < t < 0.25). They also maybe aren’t unique for a given t?
I didn’t really check to make sure all that code interpreter was doing made sense tho
Here’s a math problem I’d like solved:
I want to know the Nash equilibrium of the following two player game between players Alice and Bob.
Alice chooses the rates of two Poisson processes; we call these rates lam_1 and lam_2.
Bob needs to run each of these processes for 0.5 seconds. Bob can run each of the processes in “shielded” mode or “unshielded” mode, but can only afford to use t seconds of shielded mode in total between the two processes. (So Bob chooses t1 and t2 such that t1+t2=t.)
The game goes as follows:
We run process 1 for time t1 and process 2 for time t2, in shielded mode. If either event occurs during these time periods, Bob wins.
Then we run process 1 for time 0.5−t1 and process 2 for time 0.25−t2, in unshielded mode. If either event occurs during these time periods, Alice wins.
If the event never occurred, Bob wins.
Here’s a Sympy implementation of this game (where I parameterize with t1=pt and t2=(1−p)t.)
The expression for P(Alice wins) is e−λ1pt+λ2pt−λ2t−e−λ12+λ2pt2−λ2t2−λ24.
Again, I want the Nash equilibrium.
(This is a model for a scenario where you’re trying to decide how much of your oversight budget to spend overseeing a model when it’s doing actions that are good opportunities for attempting to cause a catastrophe, vs actions which are bad opportunities for attempting a catastrophe.)
I am going to assume that in the code, when calculating
p_alice_win_given_not_caught
, we do not divide the λ2 term by two (since this is not that consistent with the description. I am also assuming that 0.25−t2 is a typo and 0.5−t2 is meant, which would also be more consistent with other stuff). So I am going to assume assume a symmetrical version.Here, P(Alice wins) is e−λ1t1−λ2t2−e−λ1/2−λ2/2=:f(λ1,λ2,t1,t2). Wlog we can assume 0<t<1 (otherwise Bob will run everything or nothing in shielded mode).
We claim that ((λ∗1,λ∗2),(t∗1,t∗2)):=((μ∗,μ∗),(t/2,t/2) is a (pure) Nash equilibrium, where μ∗=logt/(t−1).
To verify, lets first show that Alice cannot make a better choice if Bob plays (t∗1,t∗2)=(t/2,t/2). We have f(λ1,λ2,t∗1,t∗2)=e−(λ1+λ2)t/2−e−(λ1+λ2)/2. Since this only depends on the sum, we can make the substitution μ=(λ1+λ2)/2. Thus, we want to maximize ~f(μ,t)=e−μt−e−μ. We have ~f′(μ,t)=−te−μt+e−μ=0. Rearranging, we get te−μt=e−μ. Taking logs, we get logt−μt=−μ. Rearranging, we get μ=logt/(t−1). Thus, μ=μ∗ is the optimal choice. This means, that if Bob sticks to his strategy, Alice cannot do better than f(μ∗,μ∗,t/2,t/2).
Now, lets show that Bob cannot do better. We have f(λ∗1,λ∗2,t1,t2)=e−μ∗t1−μ∗t2−e−μ∗/2−μ∗/2=e−μ∗t−e−μ∗. This does not depend on t1 and t2 anymore, so any choice of t1 and t2 is optimal if Alice plays (μ∗,μ∗).
(If I picked the wrong version of the question, and you actually want some symmetry: I suspect that the solution will have similarities, or that in some cases the solution can be obtained by rescaling the problem back into a more symmetric form.)
Thanks heaps! I wanted the asymmetric version but concurrently with your comment I figured out how to convert from the asymmetric version to the symmetric version.
I’ll credit you when I post the writeup that this is a part of, and I might run a draft by you in case you want to point out any errors. :)
Sure, I’d be happy to read a draft
My investigations with Code Interpreter suggest that the Nash equilbria have p = 0.5, and the lambdas depend on t in a concave way (for 0 < t < 0.25). They also maybe aren’t unique for a given t?
I didn’t really check to make sure all that code interpreter was doing made sense tho