Sorry if this wasn’t clear in the post, but when I say “we’re” trying to protect some submodule, I don’t mean us as the engineers who want to make sure the model doesn’t change a submodule (we could do that trivially, just add a stop gradient); I mean it from the perspective of a gradient hacker that needs to protect its own logic from being disabled by SGD.
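A minimal sketch of that parenthetical trick, assuming a toy two-part model with made-up names: in JAX, wrapping the submodule’s output in `jax.lax.stop_gradient` means the submodule’s weights receive exactly zero gradient, so SGD never updates them.

```python
import jax
import jax.numpy as jnp

def forward(params, x):
    # Toy two-part model: a "protected" submodule and the rest of the network.
    protected_out = jnp.tanh(x @ params["protected_w"])
    # Engineer-side protection: stop_gradient blocks any gradient from flowing
    # back into the protected submodule's weights.
    protected_out = jax.lax.stop_gradient(protected_out)
    return protected_out @ params["rest_w"]

def loss(params, x, y):
    return jnp.mean((forward(params, x) - y) ** 2)

params = {
    "protected_w": jnp.ones((4, 8)) * 0.1,  # hypothetical shapes for illustration
    "rest_w": jnp.ones((8, 1)) * 0.1,
}
x, y = jnp.ones((2, 4)), jnp.zeros((2, 1))

grads = jax.grad(loss)(params, x, y)
print(jnp.all(grads["protected_w"] == 0.0))  # True: SGD never touches it
print(jnp.any(grads["rest_w"] != 0.0))       # True: the rest still trains
```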
Huh. I was interpreting it differently then—if I was building a module-checker to keep an eye out for AI tampering, I would not feed the result of the checker back into the gradient signal.
The big difference, parroting Steve (https://www.lesswrong.com/posts/ey7jACdF4j6GrQLrG/thoughts-on-safety-in-predictive-learning, I think), is that gradient descent doesn’t try things out and then keep what works; it models changes and does what is good in the model.
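To make the checker point concrete, here’s a toy JAX contrast (the checker, shapes, and names are all made up for illustration): once the checker’s output is added to the loss, `jax.grad` differentiates straight through it, so the update already accounts for the checker; kept out of the loss, the checker can only observe and flag.

```python
import jax
import jax.numpy as jnp

def checker(weights, reference):
    # Hypothetical tamper-checker: how far have the watched weights drifted
    # from a reference copy?
    return jnp.sum((weights - reference) ** 2)

def loss_checker_in_signal(weights, reference, x, y):
    task = jnp.mean((x @ weights - y) ** 2)
    # Checker output fed back into the gradient signal: SGD differentiates
    # through it and trades off task loss against keeping the checker quiet.
    return task + checker(weights, reference)

def loss_checker_as_monitor(weights, reference, x, y):
    # Checker evaluated for logging/alarms only, deliberately kept out of
    # the loss, so it contributes nothing to the update.
    _ = checker(weights, reference)
    return jnp.mean((x @ weights - y) ** 2)

weights, reference = jnp.ones((4, 1)), jnp.zeros((4, 1))
x, y = jnp.ones((3, 4)), jnp.zeros((3, 1))

print(jax.grad(loss_checker_in_signal)(weights, reference, x, y).ravel())
print(jax.grad(loss_checker_as_monitor)(weights, reference, x, y).ravel())
```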
Ah, that makes sense.