Thank you for writing this up! I experimented briefly with group sparsity as well, but with the goal of learning the “hierarchy” of features rather than learning circular features like you’re doing here. I also struggled to get it to work in toy settings, but didn’t try extensively and ended up moving on to other things. I still think there must be something in group sparsity, since it’s so well studied in sparse coding and clearly does work in theory.
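(For readers who haven’t seen it: the standard group-lasso penalty just swaps the L1 norm for a sum of L2 norms over pre-defined groups of latents, and reduces to plain L1 when every group is a singleton, which is part of why the sparse-coding theory carries over. A minimal sketch in PyTorch; the group assignments and shapes are only illustrative:)

```python
import torch

def group_lasso_penalty(z: torch.Tensor, groups: list[list[int]]) -> torch.Tensor:
    """Sum of L2 norms over pre-defined groups of latent indices.

    z: (batch, n_latents) activations; groups: a partition of latent indices.
    With singleton groups this is exactly the usual L1 penalty.
    """
    return sum(z[:, g].norm(dim=-1) for g in groups).mean()

# e.g. 8 latents in 3 hand-chosen groups (the part that's hard to choose up front)
z = torch.randn(32, 8)
loss = group_lasso_penalty(z, groups=[[0, 1], [2, 3, 4], [5, 6, 7]])
```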
I also struggled with the problem of how to choose groups, since for traditional group sparsity you need to set the groups beforehand. I like your idea of trying to learn the group space. For using group sparsity to recover hierarchy, I wonder if there’s a way to learn a direction for the group as a whole and project that direction out of each member of the group. The idea is that if latents share common components, those common components should probably be their own “group” representation, and this should be repeated until the leaf nodes are mostly orthogonal to each other. There are definitely overlapping hierarchies too, which is a challenge.
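Concretely, the projection step I have in mind would look something like the sketch below. I haven’t tested this; the function name is mine, and using the normalized mean of the member decoder directions as the group direction is just one simple choice (it could instead be learned):

```python
import torch

def project_out_group_direction(
    W_dec: torch.Tensor, group: list[int]
) -> tuple[torch.Tensor, torch.Tensor]:
    """Extract a shared "group direction" from a set of decoder rows and
    remove its component from each member.

    W_dec: (n_latents, d_model) decoder directions; group: member indices.
    Here the group direction is the normalized mean of the members
    (one simple choice; it could instead be a learned parameter).
    """
    members = W_dec[group]                              # (k, d_model)
    g = members.mean(dim=0)
    g = g / g.norm()                                    # unit group direction
    # each leaf keeps only the component orthogonal to the shared direction
    residual = members - (members @ g).unsqueeze(-1) * g
    W_new = W_dec.clone()
    W_new[group] = residual
    return W_new, g
```

Recursing on the residuals until the leaf directions are mostly orthogonal would, in principle, recover a tree; the overlapping-hierarchies case is exactly where this clean recursion breaks down.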
Regardless, thank you for sharing this! There are a lot of great ideas in this post.