Here are my opinions on what deep learning can do, FWIW:
1. (abstraction) Yes, but deep learning systems aren't sample efficient!
2. (generalization) Eh, not if you define generalization as going out of distribution (note: that's not how generalization is usually defined in the ML literature). Deep learning systems can barely generalize outside their training distribution at all. The one exception I know of is how GPT-3 learned addition, but even then it broke down on large numbers (see the toy sketch after this list). Some GPT-3 generalization failures can be seen here.
3. (causality) Maybe?
4. (long-term planning) I think DL can do this, though not necessarily with the same kind of hierarchical planning framework that humans seem to use.
5. (need for intervention) Is this getting at embodiment? I'm a bit unclear on what's meant. In any case, it doesn't seem really critical to me.
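As a toy illustration of the out-of-distribution point in item 2 (my own sketch, not anything from the discussion above; the setup loosely follows the extrapolation experiments in the NALU paper, Trask et al. 2018): train a small MLP to add pairs of numbers drawn from [0, 100), then test it on pairs drawn from [500, 1000). In runs of setups like this, the in-distribution error is tiny while the extrapolation error blows up.

```python
# Toy sketch: an MLP fits addition inside its training range but
# typically fails to extrapolate outside it. All names here are mine.
import torch
import torch.nn as nn

torch.manual_seed(0)

def make_batch(lo, hi, n=1024):
    # Pairs (a, b) drawn uniformly from [lo, hi); target is a + b.
    x = torch.rand(n, 2) * (hi - lo) + lo
    return x, x.sum(dim=1, keepdim=True)

model = nn.Sequential(
    nn.Linear(2, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 1),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Train only on small numbers.
for step in range(5000):
    x, y = make_batch(0, 100)
    loss = nn.functional.mse_loss(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()

# Evaluate in- and out-of-distribution.
with torch.no_grad():
    for lo, hi, name in [(0, 100, "in-distribution"),
                         (500, 1000, "out-of-distribution")]:
        x, y = make_batch(lo, hi)
        mae = (model(x) - y).abs().mean().item()
        print(f"{name:>20s} MAE: {mae:.2f}")
```

This doesn't prove anything about GPT-3 specifically, but it's the same basic failure mode: the network fits the training range and has no pressure to learn the underlying arithmetic.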
So for me, the main issues with deep learning are its low sample efficiency and its weak generalization ability. That's why I'm skeptical that just scaling up the deep learning methods we have now can lead to true AGI, although it might get us to something that is, for many practical purposes, pretty close.