Assuming that the bits-to-parameters encoding can be relaxed, there is some literature on redundant computation in neural networks. If the feature vectors in a weight matrix aren’t linearly independent, for example, the same computation can be “spread” across many linearly dependent features, with the result that no parameters are free yet the total amount of computational work is unchanged.
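As a minimal sketch of this kind of spreading (all the specific vectors and coefficients below are made up for illustration), a single feature direction can be duplicated across several linearly dependent rows of a weight matrix; the per-row coefficients can vary freely, so long as a compensating readout recombines them, without changing the computed output:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=4)  # an arbitrary input
f = rng.normal(size=4)  # a single feature direction

# Reference computation: one feature row, read out with weight 1.
y_ref = 1.0 * (f @ x)

# "Spread" version: three rows of the weight matrix, all scalar
# multiples of f (so linearly dependent). The coefficients are
# free parameters, provided the readout compensates for them.
coeffs = np.array([0.5, 2.0, -1.5])
W = np.outer(coeffs, f)  # 3x4 matrix, every row parallel to f

# Readout chosen so that readout @ coeffs == 1, cancelling the
# arbitrary spreading: 2.0*0.5 + 0.5*2.0 + (2/3)*(-1.5) = 1.
readout = np.array([2.0, 0.5, 2.0 / 3.0])

y_spread = readout @ (W @ x)

# Same output as the single-feature computation, despite the
# extra (redundant) rows doing the same work three times over.
assert np.allclose(y_ref, y_spread)
```

The point of the sketch is that `coeffs` is a whole family of settings realizing one computation (no parameters are pinned down), while `W @ x` still performs three dot products instead of one, so the computational work is not reduced.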
There are a few other cases like this where we know how specific forms of simplicity in the computation map onto freedom in the parameters. But those are not enough in this case; we need more freedom than that.