Matt Levinson comments on Attribution-based parameter decomposition

Matt Levinson 29 Mar 2025 23:24 UTC
Really exciting stuff here! I’ve been working on an alternate formulation of circuit discovery in the now-traditional fixed-problem setting, and I’ve been brainstorming unsupervised circuit discovery in the same spirit as this work, though my ideas are much less developed. You’ve laid the groundwork for a really exciting research direction here!
I have a few questions about the component definition and optimization. What does it mean when you say you define $C$ components $P_{c}$? Do you randomly partition the parameter vector into $C$ partitions and assign each partition to a $P_{c}$, with zeros elsewhere? Or do you divide each weight by $C$, setting $w_{c, l, i, j} = w_{l, i, j} / C$ (+ $\epsilon$?)?
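To make the question concrete, here's a minimal NumPy sketch of the two initializations I'm imagining. These are only my guesses, not the method from the post; the function names `partition_init` and `uniform_init`, the `eps` noise scale, and the choice to make the noise cancel across components (so the components still sum to the original weights) are all my own assumptions.

```python
import numpy as np

def partition_init(theta, C, rng):
    """Guess 1: assign each parameter to one random component, zeros elsewhere."""
    owner = rng.integers(0, C, size=theta.shape)  # component index per parameter
    return np.stack([np.where(owner == c, theta, 0.0) for c in range(C)])

def uniform_init(theta, C, rng, eps=0.0):
    """Guess 2: each component gets theta / C, plus optional small noise."""
    noise = eps * rng.standard_normal((C,) + theta.shape)
    # Assumption: center the noise across components so it sums to zero
    # and the components still add up to the original parameters.
    noise -= noise.mean(axis=0, keepdims=True)
    return theta / C + noise

rng = np.random.default_rng(0)
theta = rng.standard_normal((4, 3))  # stand-in for one layer's weights w_{l,i,j}

P_part = partition_init(theta, C=5, rng=rng)
P_unif = uniform_init(theta, C=5, rng=rng, eps=0.01)

# In both cases the components sum back to the original parameters.
print(np.allclose(P_part.sum(axis=0), theta))
print(np.allclose(P_unif.sum(axis=0), theta))
```

In both variants the stacked array has shape `(C, *theta.shape)`, so index `c` along the first axis plays the role of $P_{c}$ restricted to this one layer.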
Assuming something like that is going on, I definitely believe this has been tricky to optimize on larger, more complex networks! I wonder if more informed priors might help? As in, using other methods to suggest at least some proportion of candidate components? Have you considered or tried anything like that?