Buck comments on Buck’s Shortform

Buck 14 Oct 2023 22:24 UTC
LW: 14 AF: 7
Here’s a math problem I’d like solved:
I want to know the Nash equilibrium of the following two player game between players Alice and Bob.
Alice chooses the rates of two Poisson processes; we call these rates lam_1 and lam_2.
Bob needs to run each of these processes for 0.5 seconds. Bob can run each of the processes in “shielded” mode or “unshielded” mode, but can only afford to use $t$ seconds of shielded mode in total between the two processes. (So Bob chooses $t_{1}$ and $t_{2}$ such that $t_{1} + t_{2} = t$ .)
The game goes as follows:
- We run process 1 for time $t_{1}$ and process 2 for time $t_{2}$ , in shielded mode. If either event occurs during these time periods, Bob wins.
- Then we run process 1 for time $0.5 - t_{1}$ and process 2 for time $0.25 - t_{2}$ , in unshielded mode. If either event occurs during these time periods, Alice wins.
- If the event never occurred, Bob wins.
Here’s a Sympy implementation of this game (where I parameterize with $t_{1} = p t$ and $t_{2} = (1 - p) t$ .)
```
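import sympy as sp

# Alice's rates (lambda_1, lambda_2), Bob's split p, and the total shielded budget t.
lam_1, lam_2, p, t = sp.symbols('lambda_1 lambda_2 p t', real=True, positive=True)

# Shielded ("overseen") and unshielded time per process, with t_1 = p * t and t_2 = (1 - p) * t.
time_2_overseen = t * (1 - p)
time_2_not_overseen = sp.Rational(1, 2) - time_2_overseen
time_1_overseen = t * p
time_1_not_overseen = sp.Rational(1, 2) - time_1_overseen

# Probability that some event occurs during shielded time (Bob wins).
p_caught = 1 - sp.exp(-lam_2 * time_2_overseen - lam_1 * time_1_overseen)

# Probability that some event occurs during unshielded time; note the lambda_2 term is halved.
p_alice_win_given_not_caught = 1 - sp.exp(-lam_1 * time_1_not_overseen - lam_2 * time_2_not_overseen / 2)

p_alice_win = ((1 - p_caught) * p_alice_win_given_not_caught).expand().simplify()
p_alice_win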
```
The expression for P(Alice wins) is $e^{- λ_{1} p t + λ_{2} p t - λ_{2} t} - e^{- \frac{λ_{1}}{2} + \frac{λ_{2} p t}{2} - \frac{λ_{2} t}{2} - \frac{λ_{2}}{4}}$ .
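As a sanity check, the closed form above can be compared against a direct evaluation of the two factors in the Sympy snippet. This is a minimal sketch in plain Python: the function names and the sample parameter values are illustrative assumptions, and it follows the model in the code (including the halved $λ_{2}$ term), not the prose description.

```python
import math


def p_alice_win_direct(lam_1, lam_2, p, t):
    """P(Alice wins) as the product of the two factors in the Sympy snippet."""
    t_1 = p * t          # shielded time on process 1
    t_2 = (1 - p) * t    # shielded time on process 2
    p_not_caught = math.exp(-lam_1 * t_1 - lam_2 * t_2)
    # Same convention as the snippet: the lambda_2 exponent is divided by two.
    p_win_given_not_caught = 1 - math.exp(
        -lam_1 * (0.5 - t_1) - lam_2 * (0.5 - t_2) / 2
    )
    return p_not_caught * p_win_given_not_caught


def p_alice_win_closed_form(lam_1, lam_2, p, t):
    """The simplified expression quoted above."""
    first = math.exp(-lam_1 * p * t + lam_2 * p * t - lam_2 * t)
    second = math.exp(-lam_1 / 2 + lam_2 * p * t / 2 - lam_2 * t / 2 - lam_2 / 4)
    return first - second


# Arbitrary sample points (assumed values, chosen only for the comparison).
samples = [(1.0, 2.0, 0.3, 0.4), (0.5, 0.5, 0.5, 0.2), (3.0, 0.1, 0.9, 0.5)]
for args in samples:
    print(args, p_alice_win_direct(*args), p_alice_win_closed_form(*args))
```

The two functions should agree up to floating-point error at every sample point.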
Again, I want the Nash equilibrium.
(This is a model for a scenario where you’re trying to decide how much of your oversight budget to spend overseeing a model when it’s doing actions that are good opportunities for attempting to cause a catastrophe, vs actions which are bad opportunities for attempting a catastrophe.)
What links here?
- Toy models of AI control for concentrated catastrophe prevention by Fabien Roger (6 Feb 2024 1:38 UTC; 51 points)
- harfe 15 Oct 2023 22:31 UTC
  LW: 3 AF: 2
  I am going to assume that in the code, when calculating p_alice_win_given_not_caught, we do not divide the $λ_{2}$ term by two (since this is not consistent with the description). I am also assuming that $0.25 - t_{2}$ is a typo and $0.5 - t_{2}$ is meant, which would also be more consistent with the rest of the setup. So I am going to assume a symmetrical version.
  
  Here, P(Alice wins) is $e^{- λ_{1} t_{1} - λ_{2} t_{2}} - e^{- λ_{1} / 2 - λ_{2} / 2} =: f (λ_{1}, λ_{2}, t_{1}, t_{2})$ . Wlog we can assume $0 < t < 1$ (otherwise Bob will run everything or nothing in shielded mode).
  
  We claim that $((λ_{1}^{*}, λ_{2}^{*}), (t_{1}^{*}, t_{2}^{*})) := ((μ^{*}, μ^{*}), (t / 2, t / 2))$ is a (pure) Nash equilibrium, where $μ^{*} = \log t / (t - 1)$ .
  
  To verify, let's first show that Alice cannot make a better choice if Bob plays $(t_{1}^{*}, t_{2}^{*}) = (t / 2, t / 2)$ . We have $f (λ_{1}, λ_{2}, t_{1}^{*}, t_{2}^{*}) = e^{- (λ_{1} + λ_{2}) t / 2} - e^{- (λ_{1} + λ_{2}) / 2}$ . Since this only depends on the sum, we can make the substitution $μ = (λ_{1} + λ_{2}) / 2$ . Thus, we want to maximize $\tilde{f} (μ, t) = e^{- μ t} - e^{- μ}$ over $μ$ . Setting the derivative with respect to $μ$ to zero, we have $\tilde{f}' (μ, t) = - t e^{- μ t} + e^{- μ} = 0$ . Rearranging, we get $t e^{- μ t} = e^{- μ}$ . Taking logs, we get $\log t - μ t = - μ$ . Rearranging, we get $μ = \log t / (t - 1)$ . This is the only critical point, and $\tilde{f}$ is positive for $μ > 0$ while tending to $0$ both as $μ → 0$ and as $μ → ∞$ , so it is a maximum. Thus, $μ = μ^{*}$ is the optimal choice. This means that if Bob sticks to his strategy, Alice cannot do better than $f (μ^{*}, μ^{*}, t / 2, t / 2)$ .
  
  Now, let's show that Bob cannot do better. Using $t_{1} + t_{2} = t$ , we have $f (λ_{1}^{*}, λ_{2}^{*}, t_{1}, t_{2}) = e^{- μ^{*} t_{1} - μ^{*} t_{2}} - e^{- μ^{*} / 2 - μ^{*} / 2} = e^{- μ^{*} t} - e^{- μ^{*}}$ . This does not depend on $t_{1}$ and $t_{2}$ anymore, so any choice of $t_{1}$ and $t_{2}$ is optimal if Alice plays $(μ^{*}, μ^{*})$ .
  
  (If I picked the wrong version of the question, and you actually want some asymmetry: I suspect that the solution will have similarities, or that in some cases the solution can be obtained by rescaling the problem back into a more symmetric form.)
  - Buck 16 Oct 2023 15:35 UTC
    LW: 3 AF: 2
    Thanks heaps! I wanted the asymmetric version but concurrently with your comment I figured out how to convert from the asymmetric version to the symmetric version.
    I’ll credit you when I post the writeup that this is a part of, and I might run a draft by you in case you want to point out any errors. :)
    - harfe 16 Oct 2023 15:46 UTC
      1 point
      Sure, I’d be happy to read a draft
- kave 15 Oct 2023 22:11 UTC
  1 point
  My investigations with Code Interpreter suggest that the Nash equilibria have p = 0.5, and the lambdas depend on t in a concave way (for 0 < t < 0.25). They also maybe aren’t unique for a given t?
  I didn’t really check to make sure all that code interpreter was doing made sense tho
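
The symmetric-case equilibrium claimed in harfe's comment can be checked numerically. The sketch below uses only the formulas from that comment, $f = e^{- λ_{1} t_{1} - λ_{2} t_{2}} - e^{- λ_{1} / 2 - λ_{2} / 2}$ and $μ^{*} = \log t / (t - 1)$ ; the helper names, the search grid, and the choice $t = 0.5$ are illustrative assumptions, and a grid search is only a spot check, not a proof.

```python
import math


def f(lam_1, lam_2, t_1, t_2):
    """P(Alice wins) in the symmetric version of the game."""
    return math.exp(-lam_1 * t_1 - lam_2 * t_2) - math.exp(-lam_1 / 2 - lam_2 / 2)


def mu_star(t):
    """Claimed equilibrium rate for Alice, valid for 0 < t < 1."""
    return math.log(t) / (t - 1)


t = 0.5                      # assumed shielded budget for this check
mu = mu_star(t)              # equals 2 * ln(2) for t = 0.5
value = f(mu, mu, t / 2, t / 2)

# Alice's side: against Bob's (t/2, t/2), no pair of rates on a coarse grid
# should beat the claimed equilibrium value.
grid = [0.05 * k for k in range(1, 201)]
best_alice = max(f(a, b, t / 2, t / 2) for a in grid for b in grid)

# Bob's side: against Alice's (mu*, mu*), every split of the budget t
# should give the same value, so Bob has no profitable deviation.
bob_values = [f(mu, mu, s * t, (1 - s) * t) for s in (0.0, 0.25, 0.5, 0.75, 1.0)]

print("mu* =", mu)
print("equilibrium P(Alice wins) =", value)
print("best grid value for Alice =", best_alice)
print("Bob's values across splits =", bob_values)
```

For $t = 0.5$ this gives $μ^{*} = 2 \ln 2$ and an equilibrium win probability for Alice of $e^{- \ln 2} - e^{- 2 \ln 2} = 0.25$ .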