I found a recent paper that ran a fun little contest on whether three seq2seq models (LSTMS2S, ConvS2S and Transformer) are “compositional”, for various definitions of “compositional”: The compositionality of neural networks: integrating symbolism and connectionism. (Answer: basically yes, especially the Transformer.) That was somewhat helpful, but I still feel like I don’t really understand what exactly these models are learning and how (notwithstanding your excellent Transformer blog post), or how their “knowledge” compares with the models built by the more directly brain-inspired wing of ML (example), or, for that matter, with actual brain algorithms. I need to think about it more. Anyway, thanks for writing this, it’s a helpful perspective on these issues.