Additionally, in an intuitive sense, pruning a network seems as though it could be defined in terms of clusterability notions, which limits my enthusiasm for that result.
I see what you mean, but expander graphs are a counterexample: they are very sparse (i.e. heavily pruned) yet minimally clusterable. Their topology isn't compatible with being a neural network, but they serve as a proof of concept that you can prune without becoming clusterable. As further evidence, note that our pruned networks are more clusterable than networks obtained by randomly permuting their weights, that is, than random pruned networks.
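To make the expander point concrete, here is a minimal sketch in Python with NumPy. It is not the clusterability measure used for the networks discussed above. As a stand-in, it assumes the second-smallest eigenvalue of the normalized graph Laplacian (the spectral gap) as a proxy, where a smaller gap means the graph is easier to cut into clusters. The helper names, graph sizes, and seed are all hypothetical choices for illustration. Random regular graphs are expanders with high probability, so one is compared against an equally sparse graph with two planted clusters.

```python
import numpy as np


def permutation_graph(n, k, rng):
    """Adjacency matrix of a random 2k-regular multigraph on n nodes,
    built as the union of k random permutations and their inverses."""
    A = np.zeros((n, n))
    idx = np.arange(n)
    for _ in range(k):
        p = rng.permutation(n)
        np.add.at(A, (idx, p), 1.0)  # edge i -> p(i)
        np.add.at(A, (p, idx), 1.0)  # and its reverse, so A stays symmetric
    return A


def planted_two_cluster_graph(n, k, n_cross, rng):
    """Two 2k-regular halves joined by only n_cross edges."""
    half = n // 2
    A = np.zeros((n, n))
    A[:half, :half] = permutation_graph(half, k, rng)
    A[half:, half:] = permutation_graph(n - half, k, rng)
    for _ in range(n_cross):
        i = rng.integers(0, half)
        j = rng.integers(half, n)
        A[i, j] += 1.0
        A[j, i] += 1.0
    return A


def spectral_gap(A):
    """Second-smallest eigenvalue of the normalized Laplacian
    I - D^{-1/2} A D^{-1/2}; used here as an assumed clusterability proxy."""
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    L = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    return np.linalg.eigvalsh(L)[1]  # eigvalsh returns ascending eigenvalues


rng = np.random.default_rng(0)
n, k = 200, 3  # 200 nodes, degree 6: about 3% of possible edges are present

gap_expander = spectral_gap(permutation_graph(n, k, rng))
gap_clustered = spectral_gap(planted_two_cluster_graph(n, k, 2, rng))

print(f"random regular graph gap: {gap_expander:.4f}")
print(f"planted-cluster graph gap: {gap_clustered:.4f}")
```

Both graphs have nearly the same number of edges, so sparsity alone does not determine the gap: where the surviving edges sit is what matters.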