This post was useful to me! I’ve heard people talk about this paper a lot, but I never quite understood why people were so interested in it. By the time it came out, I had already long considered statistical learning theory basically useless in practice, and I already knew (from Jaynes) that overparameterized systems can generalize just fine if you do the full Bayesian math. But I hadn’t realized that this paper specifically hit people over the head with facts in that general cluster.