Superposition Without Compression: Why Entangled Representations Are the Default

Link post

I have recently heard numerous claims that the underparameterisation of a neural network can be inferred from the polysemanticity of its neurons, a phenomenon that is prevalent in LLMs.

Whilst I have no doubt that an underparameterised model has no choice but to resort to polysemanticity, I urge caution when using polysemanticity as proof of underparameterisation.

In this note I claim that even when sufficient capacity is available, superposition may be the default because of its overwhelming prevalence in the solution space: disentangled, monosemantic solutions occupy only a tiny fraction of all low-loss solutions.

This suggests that superposition arises not just as a necessity in underparameterised models, but also as a near-inevitable consequence of the search space of neural networks.
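
As a gesture at why monosemantic solutions should be rare, here is a minimal numpy sketch of the counting intuition in the linear case (my own illustration with assumed dimensions, not the toy example of this note): in a linear autoencoder with enough hidden units, every rotation of an axis-aligned zero-loss solution is another zero-loss solution whose neurons mix features, so the axis-aligned solutions form a measure-zero slice of the low-loss set.

```python
# Minimal sketch (my own illustration, not the post's derivation):
# rotating an axis-aligned zero-loss linear autoencoder gives another
# zero-loss solution whose neurons are polysemantic.
import numpy as np

rng = np.random.default_rng(0)
n_features = 8                      # number of ground-truth features
n_hidden = 8                        # capacity is sufficient (no compression)

X = rng.normal(size=(1000, n_features))   # toy data, one feature per column

# Monosemantic solution: encoder/decoder are identity, each neuron = one feature.
E_mono, D_mono = np.eye(n_features), np.eye(n_features)

# A random rotation of the hidden space gives an equally good solution.
Q, _ = np.linalg.qr(rng.normal(size=(n_hidden, n_hidden)))
E_rot, D_rot = Q @ E_mono, D_mono @ Q.T

def mse(E, D):
    X_hat = X @ E.T @ D.T
    return np.mean((X - X_hat) ** 2)

print("loss (monosemantic):", mse(E_mono, D_mono))   # ~0
print("loss (rotated):     ", mse(E_rot, D_rot))     # ~0 as well

# But each rotated neuron now reads a mixture of features:
print("features per neuron (|weight| > 0.1):",
      (np.abs(E_rot) > 0.1).sum(axis=1).mean())
```

The ReLU nonlinearity of real networks breaks this exact rotational symmetry, so treat this only as an illustration of the counting intuition, not as a proof.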

In this note I work through a small, easy-to-follow toy example where this is the case, and I hypothesise that the same holds in larger networks.
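
Purely as a guess at what such a toy experiment could look like (the sparse-feature autoencoder setup below is my assumption, not necessarily the post's toy example), one could train an overcomplete ReLU autoencoder on sparse features and check how many features each learned neuron ends up responding to:

```python
# Assumed setup: a ReLU autoencoder with *more* hidden units than sparse
# features, trained to reconstruct them; afterwards we check whether the
# learned neurons are monosemantic despite the surplus capacity.
import torch

torch.manual_seed(0)
n_features, n_hidden, sparsity = 6, 12, 0.1   # ample capacity: 12 > 6

def sample_batch(batch_size=1024):
    # Sparse synthetic features: each active (uniform in [0,1]) with prob. 0.1.
    x = torch.rand(batch_size, n_features)
    mask = torch.rand(batch_size, n_features) < sparsity
    return x * mask

W_enc = torch.nn.Parameter(torch.randn(n_hidden, n_features) * 0.1)
W_dec = torch.nn.Parameter(torch.randn(n_features, n_hidden) * 0.1)
opt = torch.optim.Adam([W_enc, W_dec], lr=1e-2)

for step in range(5000):
    x = sample_batch()
    h = torch.relu(x @ W_enc.T)          # hidden activations
    x_hat = h @ W_dec.T                  # reconstruction
    loss = ((x - x_hat) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# Rough polysemanticity check: how many features does each neuron read strongly?
features_per_neuron = (W_enc.abs() > 0.3 * W_enc.abs().max()).sum(dim=1)
print("final loss:", loss.item())
print("features per neuron:", features_per_neuron.tolist())
```

If most runs of an experiment like this land on neurons that read several features despite the surplus of hidden units, that would be consistent with the claim that superposition is the default rather than a symptom of underparameterisation.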

These are very rough Sunday musings, so I am very interested to hear what other people think about this claim :).