Thanks for this concise post :)
If we set λ=1, I actually worry that agent b will not merely do nothing, but will instead prevent us from doing anything that reduces F(a). Imo it is not easy to formalize F(a) such that we ourselves no longer want to reduce it. For example, we may want to glue a vase to a fixed spot inside our house, preventing it from accidentally falling and breaking. This, however, also prevents us from constantly moving the vase around the house, or from breaking it and scattering the pieces for maximum entropy. (A toy sketch of this effect is below.)
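To make the vase example concrete, here is a minimal toy sketch (the state names, actions, and three-step horizon are all my own illustrative assumptions, not anything from the post): gluing the vase down strictly shrinks the set of reachable future states, i.e. it reduces F(a), even though gluing it is exactly what we wanted to do.

```python
# Toy world: a vase at one of three spots; it can be glued down or broken.
# All of this is illustrative, not a formalization from the original post.
LOCATIONS = ("shelf", "table", "floor")

def successors(state):
    loc, glued, broken = state
    nxt = {state}                        # doing nothing is always an option
    if broken or glued:
        return nxt                       # a glued or broken vase stays put
    for l in LOCATIONS:
        nxt.add((l, False, False))       # move the vase around
    nxt.add((loc, False, True))          # break it
    nxt.add((loc, True, False))          # glue it down
    return nxt

def reachable(start, horizon):
    """Crude stand-in for F(a): states reachable within `horizon` steps."""
    seen, frontier = {start}, {start}
    for _ in range(horizon):
        frontier = {s2 for s in frontier for s2 in successors(s)}
        seen |= frontier
    return seen

free  = reachable(("shelf", False, False), horizon=3)
glued = reachable(("shelf", True, False), horizon=3)
print(len(free), len(glued))             # 9 vs. 1 in this toy world
assert glued < free                      # gluing strictly shrinks F(a)
```

So an agent b that penalizes any reduction of F(a) at λ=1 would fight us over the glue, even though pinning the vase is the outcome we actually want.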
Building an aligned superintelligence may also reduce F(a), as the SI steers the universe into a narrow set of states.
F(a) is the set of futures reachable by agent a from some initial time t=0. F_b(a) is the set of futures reachable by agent a at time t=0 if agent b exists. F_b(a) can never be larger than F(a) (in fact F_b(a) ⊆ F(a)), since creating agent b is, under our assumptions, one of the things agent a can do.
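Spelling the subset argument out (my own minimal formalization, assuming a deterministic world where a "future" is the trajectory τ(s_0, π) produced by a policy π from the initial state; none of this notation beyond F(a) and F_b(a) is from the post):

```latex
% F(a): futures producible by some policy of agent a from s_0.
\[
  F(a) \;=\; \{\, \tau(s_0, \pi) \;:\; \pi \in \Pi_a \,\}
\]
% Any future in F_b(a) arises from a policy that first creates b and
% then continues acting. Since "create b" is itself one of a's actions,
% that composite policy already lies in \Pi_a, hence
\[
  F_b(a) \;\subseteq\; F(a)
  \quad\text{and so}\quad
  |F_b(a)| \;\le\; |F(a)|.
\]
```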