PSA: People use different definitions of “explained variance” / “fraction of variance unexplained” (FVU)
$$\mathrm{FVU}_A = \frac{\frac{1}{N}\sum_{n=1}^{N}\lVert x_n - x_{n,\mathrm{pred}}\rVert^2}{\frac{1}{N}\sum_{n=1}^{N}\lVert x_n - \mu\rVert^2}, \qquad \text{where } \mu = \frac{1}{N}\sum_{n=1}^{N} x_n$$

$\mathrm{FVU}_A$ is the formula I think is sensible; the bottom is simply the variance of the data, and the top is the variance of the residuals. $\lVert\cdot\rVert$ denotes the L2 norm over the dimensions of the vector $x$. I believe it matches Wikipedia's definition of FVU and R squared.
$$\mathrm{FVU}_B = \frac{1}{N}\sum_{n=1}^{N} \frac{\lVert x_n - x_{n,\mathrm{pred}}\rVert^2}{\lVert x_n - \mu\rVert^2}$$

$\mathrm{FVU}_B$ is the formula used by SAELens and SAEBench. It seems less principled; @Lucius Bushnaq and I couldn't think of a nice quantity it corresponds to. I think of it as giving more weight to samples that are close to the mean, averaging the relative error per sample rather than comparing total error to total variance.
There is a third version (h/t @JoshEngels) that computes the FVU for each dimension independently and then averages, but that version is not used in the context we're discussing here.
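To make the differences concrete, here is a minimal NumPy sketch of all three variants; the function names and the (N, d) array layout are my own, for illustration:

```python
import numpy as np

def fvu_a(x, x_pred):
    """FVU_A: mean squared residual norm divided by mean squared distance to the mean."""
    mu = x.mean(axis=0)                                # (d,) mean of the data
    num = ((x - x_pred) ** 2).sum(axis=-1).mean()      # average squared residual norm
    den = ((x - mu) ** 2).sum(axis=-1).mean()          # average squared distance to the mean
    return num / den

def fvu_b(x, x_pred):
    """FVU_B: per-sample ratio of squared norms, then averaged (SAELens/SAEBench style)."""
    mu = x.mean(axis=0)
    num = ((x - x_pred) ** 2).sum(axis=-1)             # (N,) per-sample squared residual norm
    den = ((x - mu) ** 2).sum(axis=-1)                 # (N,) per-sample squared distance to mean
    return (num / den).mean()                          # average the ratios, not the sums

def fvu_per_dim(x, x_pred):
    """Third version: FVU computed for each dimension independently, then averaged."""
    mu = x.mean(axis=0)
    num = ((x - x_pred) ** 2).mean(axis=0)             # (d,) residual variance per dimension
    den = ((x - mu) ** 2).mean(axis=0)                 # (d,) data variance per dimension
    return (num / den).mean()
```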
In my recent comment I computed my own $\mathrm{FVU}_A$ and compared it to FVUs from SAEBench (which uses $\mathrm{FVU}_B$), obtaining nonsense results.
Curiously, the two definitions seem to be approximately proportional (below I show the performance of a bunch of SAEs), though the ratio differs between distributions (here: activations in layers 3 and 4).[1] Still, this means that using $\mathrm{FVU}_B$ instead of $\mathrm{FVU}_A$ to compare e.g. different SAEs doesn't make a big difference, as long as one is consistent.
Thanks to @JoshEngels for pointing out the difference, and to @Lucius Bushnaq for helpful discussions.
If a predictor doesn't perform systematically better or worse at points closer to the mean, then this makes sense. The denominator changes the relative weight of different samples, but this doesn't have any effect beyond noise and a global scale, as long as there is no systematic performance difference.
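A quick simulation illustrates the point; this is my own sketch (it assumes the `fvu_a`/`fvu_b` definitions from the snippet above, plus Gaussian data), with a reconstruction error that is independent of the distance to the mean:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(100_000, 64))               # (N, d) synthetic activations
x_pred = x + 0.3 * rng.normal(size=x.shape)      # error independent of ||x - mu||

# With no systematic performance difference, FVU_B tracks FVU_A up to a
# distribution-dependent constant factor plus sampling noise.
print(fvu_a(x, x_pred))   # ~0.09
print(fvu_b(x, x_pred))   # ~0.09 * 64/62, since E[1/chi2_d] = 1/(d-2)
```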
I would be very surprised if this FVU_B is actually another definition and not a bug. It's not a fraction of the variance, and those denominators can easily be zero or very near zero.
https://github.com/jbloomAus/SAELens/blob/main/sae_lens/evals.py#L511 sums the numerator and denominator separately; if they aren't doing that in some other place, probably just file a bug report?
I think this is the sum over the vector dimension, but not over the samples. The sum (mean) over samples is taken later, in this line, which happens after the division.
Edit: And to clarify, my impression is that people think of these as alternative definitions of FVU where you get to pick one, rather than one being right and one being a bug.
Edit2: And I'm in touch with the SAEBench authors about making a PR to change this / add both options (and by extension probably doing the same in SAELens), though I won't mind if anyone else does it!
Ah, oops. I think I got confused by the absence of L_2 syntax in your formula for FVU_B. (I agree that FVU_A is more principled ^^.)
Oops, fixed!
This was really helpful, thanks! Just wanting to clear up my understanding:
This is the Wikipedia entry for FVU:

$$\mathrm{FVU} = \frac{SS_{\mathrm{err}}}{SS_{\mathrm{tot}}} = 1 - R^2$$

where:

$$SS_{\mathrm{err}} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2, \qquad SS_{\mathrm{tot}} = \sum_{i=1}^{n} (y_i - \bar{y})^2$$
There's no mention of norms because (as I understand it) $y$ and $\hat{y}$ are assumed to be scalar values, so $SS_{\mathrm{err}}$ and $SS_{\mathrm{tot}}$ are scalar. Do I understand correctly that you're treating $\lVert x_n - x_{n,\mathrm{pred}}\rVert^2$ as the multi-dimensional equivalent of $SS_{\mathrm{err}}$ and $\lVert x_n - \mu\rVert^2$ as the multi-dimensional equivalent of $SS_{\mathrm{tot}}$? This would make sense, as using the squared norms of the differences makes it basis / rotation invariant.
Yep, that’s the generalisation that would make most sense
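For what it's worth, the rotation invariance can be checked numerically; a small sketch of my own, reusing the hypothetical `fvu_a` from the earlier snippet:

```python
import numpy as np
from scipy.stats import ortho_group

rng = np.random.default_rng(0)
x = rng.normal(size=(1_000, 16))
x_pred = x + 0.1 * rng.normal(size=x.shape)

R = ortho_group.rvs(16, random_state=0)   # random orthogonal matrix (rotation/reflection)
# Squared L2 norms are preserved under orthogonal maps, so FVU_A is unchanged:
print(fvu_a(x, x_pred), fvu_a(x @ R, x_pred @ R))   # equal up to floating-point error
```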
Thanks. Also, in the case of crosscoders, where you have multiple output spaces, do you have any thoughts on the best way to aggregate across these? Currently I'm just computing them separately and taking the mean, but I could imagine it perhaps being better to just concat the spaces and compute FVU on that, using the L2 norm of the concatenated vectors.
Yeah, you probably shouldn't concat the spaces, due to things like "they might have very different norms & baseline variances". Maybe calculate each layer separately; then, if they're all similar, average them together, otherwise keep them separate and quote them as separate numbers in your results.
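A sketch of that per-layer approach; the dict layout, the similarity threshold, and the function name are my own invention, and it assumes the `fvu_a` helper from the first snippet:

```python
import numpy as np

def crosscoder_fvu(x_by_layer, pred_by_layer, similarity_tol=0.05):
    """Compute FVU_A per output space; average only if the values are comparable."""
    fvus = {name: fvu_a(x, pred_by_layer[name]) for name, x in x_by_layer.items()}
    values = np.array(list(fvus.values()))
    if values.max() - values.min() < similarity_tol:   # all layers look similar
        return values.mean()                           # safe to quote one number
    return fvus                                        # otherwise report per-layer numbers
```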
FVU_B doesn't make sense, but I don't see where you're getting FVU_B from.
Here’s the code I’m seeing:
Explained variance = 1 - FVU = 1 - (residual sum of squares) / (total sum of squares)
I think this is the sum over the vector dimension, but not over the samples. The sum (mean) over samples is taken later, in this line, which happens after the division.
Let's suppose that's the case. I'm still not clear on how you're getting to FVU_B?
The previous lines calculate the ratio (or 1 − ratio) stored in the "explained variance" key for every sample/batch. Then, in that later quoted line, the list is averaged, i.e. we're taking the sample average over the ratio. That's the FVU_B formula.
Let me know if this clears it up or if we’re misunderstanding each other!
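Schematically, the aggregation order being described looks like the following; this is my own illustration of the pattern, not the actual SAELens code:

```python
import numpy as np

def explained_variance_per_sample(x, x_pred):
    """One (1 - ratio) value per sample: the sums run over dimensions only."""
    mu = x.mean(axis=0)
    ss_err = ((x - x_pred) ** 2).sum(axis=-1)   # (N,) summed over vector dims
    ss_tot = ((x - mu) ** 2).sum(axis=-1)       # (N,) summed over vector dims
    return 1 - ss_err / ss_tot                  # division happens per sample

# Averaging these per-sample ratios afterwards yields 1 - FVU_B, not 1 - FVU_A:
# np.mean(explained_variance_per_sample(x, x_pred))
```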