TurnTrout comments on Three mental images from thinking about AGI debate & corrigibility

TurnTrout 4 Aug 2020 19:33 UTC
LW: 6 AF: 4
Maybe. What I was arguing was: just because all of the partial derivatives are 0 at a point doesn’t mean it isn’t a saddle point. You have to check the behavior along every direction, not just along the coordinate axes; in two dimensions, there are uncountably many directions.
Thus, I can prove to you that we are extremely unlikely to ever encounter a valley in real life:
1. A valley must have a lowest point $A$.
2. For $A$ to be a local minimum, all of its directional derivatives must be 0:
  1. Direction N (north), AND
  2. Direction NE (north-east), AND
  3. Direction NNE, AND
  4. Direction NNNE, AND
  5. ...
This argument fails because the directional derivatives aren’t probabilistically independent in real life; you have to condition on the underlying geological processes, instead of supposing you’re drawing a topographic function from $\mathbb{R}^{2}$ to $\mathbb{R}$ at random.
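The gap between "each direction is a coin flip" and "all directions at once" can be sketched numerically. This is a toy model under assumptions that are not in the original comment: the local shape around a critical point is taken to be a random quadratic form $d^{T} H d$ with independent standard-normal entries in the symmetric matrix $H$, and the function names are hypothetical. It compares the naive product of per-direction probabilities with the observed frequency of all sampled directions curving upward together.

```python
import math
import random


def all_directions_curve_up(a, b, c, k):
    """Return True if d^T H d > 0 for k evenly spaced directions d,
    where H = [[a, b], [b, c]] is a symmetric 2x2 matrix."""
    for i in range(k):
        # d and -d give the same value, so half a turn covers every direction.
        theta = math.pi * i / k
        dx, dy = math.cos(theta), math.sin(theta)
        if a * dx * dx + 2 * b * dx * dy + c * dy * dy <= 0:
            return False
    return True


def estimate(k=16, trials=20000, seed=0):
    """Monte Carlo comparison of the naive independent estimate with the
    joint frequency (toy model: a, b, c drawn i.i.d. from N(0, 1))."""
    rng = random.Random(seed)
    single_hits = 0
    joint_hits = 0
    for _ in range(trials):
        a = rng.gauss(0, 1)
        b = rng.gauss(0, 1)
        c = rng.gauss(0, 1)
        # Along direction theta = 0 the quadratic form equals a.
        if a > 0:
            single_hits += 1
        if all_directions_curve_up(a, b, c, k):
            joint_hits += 1
    p_single = single_hits / trials
    naive = p_single ** k          # pretends the k checks are independent
    joint = joint_hits / trials    # what actually happens in the toy model
    return p_single, naive, joint


if __name__ == "__main__":
    p_single, naive, joint = estimate()
    print(f"single direction: {p_single:.3f}")
    print(f"naive product over 16 directions: {naive:.2e}")
    print(f"observed joint frequency: {joint:.3f}")
```

In this toy model the checks in different directions are all determined by the same three numbers, so they are strongly correlated, and the joint frequency comes out far larger than the naive product. Real terrain is of course not a random quadratic form; the sketch only illustrates why multiplying per-direction probabilities is the wrong calculation.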
For the corrigibility argument to go through, I claim we need to consider more information about corrigibility in particular.
- Steven Byrnes 5 Aug 2020 2:05 UTC
  LW: 3 AF: 1
  I guess my issue is that corrigibility is an exogenous specification; you’re not just saying “the algorithm goes to a fixed point” but rather “the algorithm goes to this particular pre-specified point, and it is a fixed point”. If I pick a longitude and latitude with a random number generator, it’s unlikely to be the bottom of a valley. Or maybe this analogy is not helpful and we should just be talking about corrigibility directly :-P