simulus comments on Parameters Are Like Pixels

simulus 14 Jan 2026 20:15 UTC
1 point
2
This was the classical intuition, but turned out to be untrue in the regime of large NNs.
The modern view is double descent (https://en.wikipedia.org/wiki/Double_descent), where small models generalize better until the number of parameters exceeds the number of training examples, then larger models generalize better with the same amount of data.