The composition paper seems to exemplify what I talk about as my intuition for how NNs work. The models are both very small and trained on little data, but image classification seems to be much easier than NLP (which is why the DL revolution came to image classification many years before NLP), so it’s enough to train the CNN to have fairly meaningful disentangled representations of the kind we expect; their RNN model, however, continues to grope through relatively superficial associations and tricks, as the text database is relatively tiny. I’d predict that if they analyze much larger networks, like BiT or GPT-3, they’d find much more composition, and much less reliance on polysemanticity, and less vulnerability to easy ‘copy-paste’ adversarial examples.
The composition paper seems to exemplify what I talk about as my intuition for how NNs work. The models are both very small and trained on little data, but image classification seems to be much easier than NLP (which is why the DL revolution came to image classification many years before NLP), so it’s enough to train the CNN to have fairly meaningful disentangled representations of the kind we expect; their RNN model, however, continues to grope through relatively superficial associations and tricks, as the text database is relatively tiny. I’d predict that if they analyze much larger networks, like BiT or GPT-3, they’d find much more composition, and much less reliance on polysemanticity, and less vulnerability to easy ‘copy-paste’ adversarial examples.
Yup, I generally agree (both with the three predictions, and the general story of how NNs work).