Jesse Hoogland comments on You’re Measuring Model Complexity Wrong

Jesse Hoogland 19 Feb 2025 23:45 UTC
LW: 4 AF: 2
To be precise, it is a property of singular models (which include neural networks) in the Bayesian setting. There are good empirical reasons to expect the same to hold for neural networks trained with SGD: across a wide range of models, we observe the local learning coefficient (LLC) progressively increase from ~0 over the course of training.
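As a rough illustration of what the learning coefficient measures, here is a minimal toy sketch. It is not the estimator used in the experiments mentioned above. It assumes a one-dimensional potential K(w) = w^(2k) with its minimum at w* = 0, and computes nβ · E[K(w)] under a tempered posterior proportional to exp(-nβ · K(w)) by numerical integration on a grid. The function name `toy_llc`, the grid settings, and the choice of potential are all hypothetical, and the localizing prior that practical LLC estimators typically include is omitted. For this potential the quantity equals 1/(2k): 1/2 in the regular (quadratic) case, and smaller as the minimum becomes more degenerate.

```python
import numpy as np


def toy_llc(k, n_beta=1000.0, half_width=2.0, num_points=200_001):
    """Toy learning-coefficient estimate for K(w) = w**(2k) in one dimension.

    Computes n_beta * E[K(w)] under the density p(w) proportional to
    exp(-n_beta * K(w)), approximating both integrals with a Riemann sum
    on a uniform grid. K(0) = 0, so no baseline loss is subtracted.
    """
    w = np.linspace(-half_width, half_width, num_points)
    K = w ** (2 * k)
    weights = np.exp(-n_beta * K)  # unnormalized tempered posterior
    # The grid spacing cancels in the ratio of the two Riemann sums.
    return n_beta * np.sum(weights * K) / np.sum(weights)


for k in (1, 2, 3):
    print(f"K(w) = w^{2 * k}: estimate = {toy_llc(k):.4f}, 1/(2k) = {1 / (2 * k):.4f}")
```

The quadratic case recovers d/2 with d = 1, while the flatter quartic and sextic minima give lower values, which is the sense in which a lower learning coefficient corresponds to a more degenerate (less complex) solution.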