Note: I think what you’re doing there is asking what incremental change in the training data uniquely strengthens the influence of one feature in the network without touching the others.
The “pointiest directions” of the loss landscape in parameter space (the top Hessian eigenvectors) correspond to the biggest features in the orthogonalised feature set of the network.
So I’d agree with the prediction that if you calculate what dtheta the dx corresponds to in the second network, you’d indeed often find that it’s close to being an eigenvector/most prominent orthogonalised feature of the second network too, since we know that neural networks tend to learn similar features when trained on similar tasks.
I think it might be interesting to see whether actually modifying the training data in the dx direction would tend to give you a network where the corresponding feature is more prominent, and how large dx can get before that ceases to hold.
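As a first sanity check on the dx-to-dtheta correspondence, here is a minimal sketch on a toy linear-regression model (all names and the setup are my own, hypothetical choices, not anything from the discussion above). It computes the parameter change dtheta induced by perturbing one training input in a direction dx, using the implicit-function relation dtheta = -H^{-1} (d/dx grad_theta L) dx at the optimum, checks it against simply re-fitting on the perturbed data, and reports how aligned that dtheta is with the top Hessian eigenvector:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 3
X = rng.normal(size=(n, d))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=n)

def fit(X):
    # least-squares optimum theta*(X) for loss L = ||X theta - y||^2 / (2n)
    return np.linalg.lstsq(X, y, rcond=None)[0]

theta = fit(X)

# Hessian of the loss in theta (constant for this quadratic toy model)
H = X.T @ X / n

# Perturb one training input x_i by a small dx and re-fit to get dtheta directly
i = 0
dx = 1e-5 * rng.normal(size=d)
X2 = X.copy()
X2[i] = X2[i] + dx
dtheta_numeric = fit(X2) - theta

# Influence-function prediction: grad_theta L(theta*(X), X) = 0 stays zero under
# the perturbation, so H dtheta + M dx = 0, i.e. dtheta = -H^{-1} M dx, where
# M = d(grad_theta L)/dx_i. Here grad_theta L = (1/n) sum_j x_j (x_j.theta - y_j),
# so M = (x_i theta^T + r_i I) / n with residual r_i = x_i.theta - y_i.
r_i = X[i] @ theta - y[i]
M = (np.outer(X[i], theta) + r_i * np.eye(d)) / n
dtheta_pred = -np.linalg.solve(H, M @ dx)

# How aligned is the induced dtheta with the "pointiest direction" (top eigenvector)?
evals, evecs = np.linalg.eigh(H)
top = evecs[:, -1]
cos = abs(dtheta_pred @ top) / np.linalg.norm(dtheta_pred)
print("cosine with top Hessian eigenvector:", cos)
```

For a generic dx the alignment will be partial; the interesting experiment suggested above would be scanning over directions dx (and magnitudes) to see when the induced dtheta concentrates on a single eigenvector, and at what scale the first-order relation breaks down.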