I was thinking of the NN approximating the “extrapolate” function itself: the function that takes in partial data and generates an extrapolation. By assumption, that function is capable of extrapolating from incomplete data. So I expect a sufficiently precise approximation of that function to be able to extrapolate as well.
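Here is a toy sketch of what I mean, under heavy assumptions. The target `extrapolate` (next term of an arithmetic sequence from its last two terms), the single linear unit, and the hand-picked weights are all hypothetical stand-ins chosen for illustration. Nothing is trained, so this only shows that an approximator of the extrapolation function extrapolates up to its approximation error, not that training would find it.

```python
# Hypothetical target: "extrapolate" maps partial data (the last two terms
# of an arithmetic sequence) to the next term.
def extrapolate(a, b):
    return b + (b - a)


# A single linear unit standing in for "the NN". Weights are set by hand.
def linear_unit(weights, a, b):
    w_a, w_b = weights
    return w_a * a + w_b * b


exact_weights = (-1.0, 2.0)        # computes 2*b - a, i.e. extrapolate itself
approx_weights = (-0.999, 2.001)   # a slightly imprecise approximation


def max_error(weights, inputs):
    # Largest gap between the unit's output and the true extrapolation.
    return max(
        abs(linear_unit(weights, a, b) - extrapolate(a, b)) for a, b in inputs
    )


# Illustrative partial observations (made up for this sketch).
partial_data = [(0.0, 1.0), (3.0, 5.0), (10.0, 7.0)]

print(max_error(exact_weights, partial_data))
print(max_error(approx_weights, partial_data))
```

The exact weights reproduce `extrapolate` on every input, and the perturbed weights stay within 0.001 * (|a| + |b|) of it, so the tighter the approximation, the closer the extrapolations.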
It may be helpful to argue from Turing completeness instead. Transformers are Turing complete, at least in the idealized setting with unbounded precision and computation steps. So if “extrapolation” is Turing computable, then there’s a transformer that implements it.
Also, the post is talking about what NNs are ever able to do, not just what they can do now. That’s why I thought it was appropriate to bring up theoretical computer science.