Turns out our methods are not actually very path-dependent in practice!
Yeah, I get that that’s what Mingard et al. are trying to show, but the meaning of their empirical results isn’t clear to me. I’ll try to properly read the actual paper, rather than just the blog post, before saying any more in that direction.
“Flat minimum surrounded by areas of relatively good performance” is synonymous with compression. If we can vary the parameters in lots of ways without losing much performance, that implies that all the info needed for optimal performance has been compressed into whatever-we-can’t-vary-without-losing-performance.
I get that a truly flat area is synonymous with compression—but I think being surrounded by areas of good performance is anti-correlated with compression because it indicates redundancy and less-than-maximal sensitivity.
I agree that viewing it as flat eigendimensions in parameter space is the right way to think about it. I still worry that the same concern applies: maximal compression in this space is traded off against ease of finding the minimum, which would be a flat plain in many dimensions but a maximally steep ravine in all of the other directions. I can imagine this could be investigated with some small experiments, or they may well already exist, but I can’t promise I’ll follow up. If anyone is interested, let me know.
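To make the "flat eigendimensions" picture concrete, here is a minimal sketch of the kind of small experiment meant above. Everything in it is an assumption for illustration: `toy_loss` is a hypothetical quadratic that depends only on the first `k` of 6 parameters, and `numerical_hessian` is a plain central-difference helper, not anything from Mingard et al. It just counts how many Hessian eigendirections at the minimum are flat versus steep.

```python
import numpy as np

def toy_loss(theta, k=2):
    # Hypothetical loss: only the first k coordinates affect performance,
    # so the remaining coordinates are exactly flat directions.
    return 0.5 * np.sum(100.0 * theta[:k] ** 2)

def numerical_hessian(f, theta, h=1e-3):
    # Central finite-difference estimate of the Hessian of f at theta.
    n = theta.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            e_i = np.zeros(n)
            e_i[i] = h
            e_j = np.zeros(n)
            e_j[j] = h
            H[i, j] = (
                f(theta + e_i + e_j) - f(theta + e_i - e_j)
                - f(theta - e_i + e_j) + f(theta - e_i - e_j)
            ) / (4.0 * h * h)
    return H

theta_star = np.zeros(6)                 # the minimum of the toy loss
H = numerical_hessian(toy_loss, theta_star)
eigvals = np.linalg.eigvalsh(H)          # eigenvalues in ascending order

tol = 1e-6                               # assumed threshold for "flat"
flat = int(np.sum(np.abs(eigvals) < tol))
steep = eigvals.size - flat

print(f"flat directions: {flat}, steep directions: {steep}")
```

In this toy case the minimum is a flat plain in 4 dimensions and a steep ravine in the other 2. A real version of the experiment would swap in a trained network's loss and compare how easily the optimizer finds minima with different flat/steep splits, which this sketch does not attempt.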