jsteinhardt comments on [link] New essay summarizing some of my latest thoughts on AI safety

jsteinhardt 10 Nov 2015 8:29 UTC
2 points
0
Yeah, I should be a bit more careful on number 4. The point is that many papers which argue that a given NN is learning “natural” representations do so by looking at what an individual hidden unit responds to (as opposed to looking at the space spanned by the hidden layer as a whole). Any such argument seems dubious to me without further support, since it relies on a sort of delicate symmetry-breaking which can only come from either the training procedure or noise in the data, rather than the model itself. But I agree that if such an argument were accompanied by a justification of why the training procedure, data noise, or some other factor led to the symmetry being broken in a natural way, then I would potentially be happy.
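The symmetry in question is easiest to see in a simplified setting. The sketch below assumes a purely linear hidden layer (an assumption made for illustration, not something the comment specifies) and uses made-up toy weights `W1` and `W2`. Rotating the hidden basis changes what each individual hidden unit computes, yet the network's output is unchanged, so in this linear case nothing in the model itself singles out individual units.

```python
import math

def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def matmul(a, b):
    """Multiply matrix a by matrix b (both lists of rows)."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def rotation(theta):
    """2x2 rotation matrix for angle theta (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

# Hypothetical toy weights, chosen only for illustration.
W1 = [[1.0, 2.0], [-0.5, 0.3]]   # input -> hidden (2 hidden units)
W2 = [[0.7, -1.2]]               # hidden -> output

# Change of basis in the hidden layer: W1 -> R W1, W2 -> W2 R^T.
R = rotation(0.9)
W1_rot = matmul(R, W1)
W2_rot = matmul(W2, transpose(R))

x = [1.0, 0.5]
h = matvec(W1, x)          # hidden activations, original basis
h_rot = matvec(W1_rot, x)  # hidden activations, rotated basis
y = matvec(W2, h)
y_rot = matvec(W2_rot, h_rot)

print("hidden units (original):", h)
print("hidden units (rotated): ", h_rot)
print("output difference:", abs(y[0] - y_rot[0]))
```

The two parametrizations are the same function of `x` (up to floating-point error), but a claim about "what hidden unit 0 responds to" would differ between them.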
- paulfchristiano 15 Nov 2015 1:15 UTC
  0 points
  0
  
  delicate symmetry-breaking which can only come from either the training procedure or noise in the data, rather than the model itself
  
  I’m still not convinced. The pointwise nonlinearities introduce a preferred basis, and cause the individual hidden units to be much more meaningful than linear combinations thereof.
  - jsteinhardt 15 Nov 2015 7:48 UTC
    0 points
    0
    Yeah, I discussed this with some others and came to the same conclusion. I do still think that one should explain why the preferred basis ends up being as meaningful as it does, but I agree that this is a much more minor objection.
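The claim that pointwise nonlinearities introduce a preferred basis can be checked on a toy network. The sketch below is a minimal illustration under stated assumptions: a two-unit hidden layer with ReLU as the pointwise nonlinearity (the thread does not name a specific one), and made-up weights `W1` and `W2`. Permuting the hidden units leaves the function unchanged, but a generic rotation of the hidden basis does not, which is one way to see that the model itself singles out the individual units.

```python
import math

def relu(v):
    """Pointwise nonlinearity applied to each hidden unit separately."""
    return [max(0.0, a) for a in v]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def forward(W1, W2, x):
    """One-hidden-layer network with a scalar output: W2 relu(W1 x)."""
    return matvec(W2, relu(matvec(W1, x)))[0]

def reparametrize(W1, W2, M):
    """Change the hidden basis by an orthogonal M: W1 -> M W1, W2 -> W2 M^T."""
    return matmul(M, W1), matmul(W2, transpose(M))

# Hypothetical toy weights, chosen only for illustration.
W1 = [[1.0, 2.0], [-0.5, 0.3]]
W2 = [[0.7, -1.2]]
x = [1.0, 0.5]

theta = 0.9
R = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]   # rotation mixes the two units
P = [[0.0, 1.0], [1.0, 0.0]]                # permutation only relabels them

y = forward(W1, W2, x)
y_perm = forward(*reparametrize(W1, W2, P), x)
y_rot = forward(*reparametrize(W1, W2, R), x)

print("original:", y)
print("permuted hidden units:", y_perm)
print("rotated hidden basis: ", y_rot)
```

This only shows that the rotational symmetry is broken by the architecture. It says nothing about why the resulting preferred basis should be meaningful, which is the remaining question raised in the last comment.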