Not sure where to land on that. It seems like both are good analogies? Brains might not be using gradients at all[1], whereas evolution basically is.
I mean, does it matter? What if it turns out that gradient descent itself doesn't shape the inductive biases as much as the parameter->function mapping does? If implicit regularization (e.g., from SGD) isn't an important part of the generalization story in deep learning, will you down-update on the appropriateness of the evolution/AI analogy?
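To make that concrete: the parameter->function-mapping story has a direct empirical form. Sample parameters at random, throw away the draws that don't fit the training data, and look at what the surviving functions do off the training set. Here's a minimal numpy sketch of that experiment (toy task, tiny MLP, and every constant is an illustrative assumption, not something from this thread):

```python
# Sketch: probe the inductive bias of the parameter->function mapping by
# sampling parameters at random and keeping draws that fit the training
# data -- no gradient descent anywhere. All specifics here are assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Toy task: the label is just the first coordinate.
X_train = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y_train = np.array([0., 0., 1., 1.])
X_test = np.array([[0.5, 0.25], [0.25, 0.5]])  # ambiguous off-training points

def mlp(params, X):
    """2-4-1 MLP with a tanh hidden layer; params is a flat 17-vector."""
    W1, b1 = params[:8].reshape(2, 4), params[8:12]
    W2, b2 = params[12:16].reshape(4, 1), params[16]
    return (np.tanh(X @ W1 + b1) @ W2).ravel() + b2

def fits_train(params):
    """True iff thresholding the raw output reproduces the training labels."""
    return np.array_equal(mlp(params, X_train) > 0.0, y_train > 0.5)

# Random sampling over parameter space, filtered to training-set consistency.
draws = rng.normal(0.0, 1.0, size=(50_000, 17))
consistent = [p for p in draws if fits_train(p)]
print(f"{len(consistent)} of {len(draws)} random draws fit the training set")

# Majority vote of the consistent functions on the test points: this is the
# bias of the parameter->function mapping, with the optimizer out of the picture.
votes = np.mean([mlp(p, X_test) > 0.0 for p in consistent], axis=0)
print("fraction predicting class 1 on each test point:", votes)
```

If the majority vote of these randomly-sampled-but-consistent nets already looks like what SGD-trained nets predict, that's evidence the mapping, not the optimizer, is carrying the inductive bias, which is the scenario the question above is pointing at.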