Dalcy comments on What’s the Right Way to think about Information Theoretic quantities in Neural Networks?

Dalcy 22 Jan 2025 5:19 UTC
2 points
Ah you’re right. I was thinking about the deterministic case.
Your explanation of the Jacobian term as accounting for features “squeezing together” makes me update towards thinking that the quantization used to turn neural networks from continuous and deterministic into discrete and stochastic, while ad hoc, may not be as unreasonable as I originally thought. This paper is where I got the idea that discretization is bad because it “conflates ‘information theoretic stuff’ with ‘geometric stuff’, like clustering”, but perhaps it is in fact capturing something real.
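To make the “squeezing together” point concrete, here is a toy sketch (my own illustration, not anything from the post or the paper). It assumes a one-dimensional standard Gaussian input and a hypothetical linear map f(x) = scale * x, and estimates entropy by binning at a fixed width. For an invertible map the differential entropy shifts by the expected log-Jacobian, and the binned estimate should pick up roughly the same shift, here about log(0.25).

```python
import math
import random
from collections import Counter


def binned_entropy(samples, width):
    """Plug-in Shannon entropy (in nats) of samples quantized into bins of `width`."""
    counts = Counter(math.floor(s / width) for s in samples)
    n = len(samples)
    return -sum(c / n * math.log(c / n) for c in counts.values())


random.seed(0)

# Assumption: the "activations" are 1-D standard Gaussian samples.
xs = [random.gauss(0.0, 1.0) for _ in range(200_000)]

# Hypothetical deterministic, invertible map that squeezes points together.
scale = 0.25
ys = [scale * x for x in xs]

width = 0.05  # same quantization bin width before and after the map
h_x = binned_entropy(xs, width)
h_y = binned_entropy(ys, width)

gap = h_y - h_x                        # change in binned entropy
jacobian_term = math.log(abs(scale))   # E[log |f'(X)|] for a linear map

print(f"binned H(X) = {h_x:.3f} nats")
print(f"binned H(Y) = {h_y:.3f} nats")
print(f"gap = {gap:.3f}, log|scale| = {jacobian_term:.3f}")
```

The map is deterministic and invertible, so nothing is lost in the continuous picture, yet the binned entropy drops because more points share a bin. That drop is the geometric effect the discretization is measuring.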
What links here?
- What’s the Right Way to think about Information Theoretic quantities in Neural Networks? by Dalcy (19 Jan 2025 8:04 UTC; 45 points)