I think their argument might be misleading, in that local stability isn’t that rare in practice.
Surely this depends on the number of dimensions, with local stability being rarer the more dimensions you have. [Hence the argument that, in the infinite-dimensional limit, everything that would have been a “local minimum” is instead a saddle point.]
Maybe. What I was arguing was: just because all of the partial derivatives are 0 at a point, doesn’t mean it isn’t a saddle point. You have to check all of the directional derivatives; in two dimensions, there are uncountably infinitely many.
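To make the saddle-point worry concrete, here is a minimal sketch using the textbook saddle f(x, y) = x² − y², which is my own choice of example and not something from the thread. The helper names (`directional_derivatives`, `scan_directions`) are hypothetical. The first derivative at the origin comes out as zero along every sampled direction, yet the second derivative changes sign with direction, so the origin is not a minimum.

```python
import math


def f(x, y):
    # Textbook saddle (an assumed example): curves up along x, down along y.
    return x * x - y * y


def directional_derivatives(func, theta, h=1e-4):
    """Central-difference first and second derivatives of func at the origin,
    taken along the unit vector at angle theta."""
    dx, dy = math.cos(theta), math.sin(theta)
    forward = func(h * dx, h * dy)
    backward = func(-h * dx, -h * dy)
    center = func(0.0, 0.0)
    first = (forward - backward) / (2 * h)
    second = (forward - 2 * center + backward) / (h * h)
    return first, second


def scan_directions(func, n_directions=8):
    """Sample n_directions angles in [0, pi) and collect both derivatives."""
    results = []
    for k in range(n_directions):
        theta = math.pi * k / n_directions
        first, second = directional_derivatives(func, theta)
        results.append((theta, first, second))
    return results


if __name__ == "__main__":
    for theta, first, second in scan_directions(f):
        print(f"{math.degrees(theta):6.1f} deg  first={first:+.2e}  second={second:+.3f}")
```

Of course, a finite scan like this only samples a handful of directions, which is exactly the gap the argument below pokes at.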
Thus, I can prove to you that we are extremely unlikely to ever encounter a valley in real life:
A valley must have a lowest point A.
For A to be a local minimum, all of its directional derivatives must be 0:
Direction N (north), AND
Direction NE (north-east), AND
Direction NNE, AND
Direction NNNE, AND
...
This doesn’t work because the directional derivatives aren’t probabilistically independent in real life; you have to condition on the underlying geological processes, instead of supposing you’re randomly drawing a topographic function from ℝ² to ℝ.
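For what the "random draw" picture would predict, here is a rough sketch under a toy assumption of my own: the Hessian at a critical point is a symmetric matrix with independent standard Gaussian entries. Nothing in the thread specifies this model, and the function names are hypothetical. It estimates how often such a matrix is positive definite (i.e. how often the critical point is a local minimum) as the dimension grows.

```python
import random


def random_symmetric(n, rng):
    """Symmetric n x n matrix with independent Gaussian entries (assumed toy model)."""
    m = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            value = rng.gauss(0.0, 1.0)
            m[i][j] = value
            m[j][i] = value
    return m


def is_positive_definite(m):
    """Attempt a Cholesky factorization; it succeeds only if m is positive definite."""
    n = len(m)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return False  # non-positive pivot: not positive definite
                lower[i][i] = s ** 0.5
            else:
                lower[i][j] = s / lower[j][j]
    return True


def fraction_of_minima(n, trials=20000, seed=0):
    """Monte Carlo estimate of how often a random critical point is a minimum."""
    rng = random.Random(seed)
    hits = sum(is_positive_definite(random_symmetric(n, rng)) for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        print(n, fraction_of_minima(n))
```

Under this independence assumption the estimated fraction shrinks quickly as n grows. That is the sense in which the fake proof "works", and it only tells us about terrain or trained systems if their curvatures really were drawn that way.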
For the corrigibility argument to go through, I claim we need to consider more information about corrigibility in particular.
I guess my issue is that corrigibility is an exogenous specification; you’re not just saying “the algorithm goes to a fixed point” but rather “the algorithm goes to this particular pre-specified point, and it is a fixed point”. If I pick a longitude and latitude with a random number generator, it’s unlikely to be the bottom of a valley. Or maybe this analogy is not helpful and we should just be talking about corrigibility directly :-P