PSA: People use different definitions of “explained variance” / “fraction of variance unexplained” (FVU)
$$\mathrm{FVU}_A = \frac{\frac{1}{N}\sum_{n=1}^{N}\lVert x_n - x_{n,\mathrm{pred}}\rVert^2}{\frac{1}{N}\sum_{n=1}^{N}\lVert x_n - \mu\rVert^2}, \qquad \text{where } \mu = \frac{1}{N}\sum_{n=1}^{N} x_n$$

$\mathrm{FVU}_A$ is the formula I think is sensible; the bottom is simply the variance of the data, and the top is the variance of the residuals. $\lVert\cdot\rVert$ denotes the L2 norm over the dimensions of the vector $x$. I believe it matches Wikipedia's definition of FVU and R squared.
$$\mathrm{FVU}_B = \frac{1}{N}\sum_{n=1}^{N} \frac{\lVert x_n - x_{n,\mathrm{pred}}\rVert^2}{\lVert x_n - \mu\rVert^2}$$

$\mathrm{FVU}_B$ is the formula used by SAELens and SAEBench. It seems less principled; @Lucius Bushnaq and I couldn't think of a nice quantity it corresponds to. I think of it as giving more weight to samples that are close to the mean, averaging the relative error per sample rather than comparing total error to total variance.
There is a third version (h/t @JoshEngels) that computes the FVU for each dimension independently and then averages, but that version is not used in the context we're discussing here.
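To make the differences concrete, here is a minimal NumPy sketch of all three variants; the function names and the (N, d) array layout are my own, for illustration:

```python
import numpy as np

def fvu_a(x, x_pred):
    """FVU_A: mean squared residual norm divided by mean squared distance to the mean."""
    mu = x.mean(axis=0)                                # (d,) mean of the data
    num = ((x - x_pred) ** 2).sum(axis=-1).mean()      # average squared residual norm
    den = ((x - mu) ** 2).sum(axis=-1).mean()          # average squared distance to the mean
    return num / den

def fvu_b(x, x_pred):
    """FVU_B: per-sample ratio of squared norms, then averaged (SAELens/SAEBench style)."""
    mu = x.mean(axis=0)
    num = ((x - x_pred) ** 2).sum(axis=-1)             # (N,) per-sample squared residual norm
    den = ((x - mu) ** 2).sum(axis=-1)                 # (N,) per-sample squared distance to mean
    return (num / den).mean()                          # average the ratios, not the sums

def fvu_per_dim(x, x_pred):
    """Third version: FVU computed for each dimension independently, then averaged."""
    mu = x.mean(axis=0)
    num = ((x - x_pred) ** 2).mean(axis=0)             # (d,) residual variance per dimension
    den = ((x - mu) ** 2).mean(axis=0)                 # (d,) data variance per dimension
    return (num / den).mean()
```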
In my recent comment I computed my own $\mathrm{FVU}_A$ and compared it to FVUs from SAEBench (which uses $\mathrm{FVU}_B$), obtaining nonsense results.
Curiously, the two definitions seem to be approximately proportional (below I show the performance of a bunch of SAEs), though the ratio differs between distributions (here: activations in layers 3 and 4).[1] Still, this means that using $\mathrm{FVU}_B$ instead of $\mathrm{FVU}_A$ to compare e.g. different SAEs doesn't make a big difference, as long as one is consistent.
Thanks to @JoshEngels for pointing out the difference, and to @Lucius Bushnaq for helpful discussions.
If a predictor doesn't perform systematically better or worse at points closer to the mean, then this makes sense. The denominator changes the relative weight of different samples, but this doesn't have any effect beyond noise and a global scale, as long as there is no systematic performance difference.
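A quick simulation illustrates the point; this is my own sketch (it assumes the `fvu_a`/`fvu_b` definitions from the snippet above, plus Gaussian data), with a reconstruction error that is independent of the distance to the mean:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(100_000, 64))               # (N, d) synthetic activations
x_pred = x + 0.3 * rng.normal(size=x.shape)      # error independent of ||x - mu||

# With no systematic performance difference, FVU_B tracks FVU_A up to a
# distribution-dependent constant factor plus sampling noise.
print(fvu_a(x, x_pred))   # ~0.09
print(fvu_b(x, x_pred))   # ~0.09 * 64/62, since E[1/chi2_d] = 1/(d-2)
```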
I would be very surprised if this FVU_B is actually another definition and not a bug. It's not a fraction of the variance, and those denominators can easily be zero or very near zero.
https://github.com/jbloomAus/SAELens/blob/main/sae_lens/evals.py#L511 sums the numerator and denominator separately; if they aren't doing that in some other place, probably just file a bug report?
I think this is the sum over the vector dimension, but not over the samples. The sum (mean) over samples is taken later, in this line, which happens after the division.
Edit: And to clarify, my impression is that people think of these as alternative definitions of FVU where you get to pick one, rather than one being right and one being a bug.
Edit2: And I'm in touch with the SAEBench authors about making a PR to change this / add both options (and by extension probably doing the same in SAELens), though I won't mind if anyone else does it!
Ah, oops. I think I got confused by the absence of L_2 syntax in your formula for FVU_B. (I agree that FVU_A is more principled ^^.)
Oops, fixed!
This was really helpful, thanks! Just wanting to clear up my understanding:
This is the Wikipedia entry for FVU:

$$\mathrm{FVU} = \frac{SS_{\mathrm{err}}}{SS_{\mathrm{tot}}} = 1 - R^2$$

where:

$$SS_{\mathrm{err}} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2, \qquad SS_{\mathrm{tot}} = \sum_{i=1}^{n} (y_i - \bar{y})^2$$
There's no mention of norms because (as I understand it) $y$ and $\hat{y}$ are assumed to be scalar values, so $SS_{\mathrm{err}}$ and $SS_{\mathrm{tot}}$ are scalar. Do I understand correctly that you're treating $\lVert x_n - x_{n,\mathrm{pred}}\rVert^2$ as the multi-dimensional equivalent of $SS_{\mathrm{err}}$ and $\lVert x_n - \mu\rVert^2$ as the multi-dimensional equivalent of $SS_{\mathrm{tot}}$? This would make sense, as using the squared norms of the differences makes it basis / rotation invariant.
Yep, that’s the generalisation that would make most sense
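For what it's worth, the rotation invariance can be checked numerically; a small sketch of my own, reusing the hypothetical `fvu_a` from the earlier snippet:

```python
import numpy as np
from scipy.stats import ortho_group

rng = np.random.default_rng(0)
x = rng.normal(size=(1_000, 16))
x_pred = x + 0.1 * rng.normal(size=x.shape)

R = ortho_group.rvs(16, random_state=0)   # random orthogonal matrix (rotation/reflection)
# Squared L2 norms are preserved under orthogonal maps, so FVU_A is unchanged:
print(fvu_a(x, x_pred), fvu_a(x @ R, x_pred @ R))   # equal up to floating-point error
```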
Thanks. Also, in the case of crosscoders, where you have multiple output spaces, do you have any thoughts on the best way to aggregate across these? Currently I'm just computing them separately and taking the mean, but I could imagine it perhaps being better to just concat the spaces and compute FVU on that, using the L2 norm of the concatenated vectors.
Yeah, you probably shouldn't concat the spaces, due to things like "they might have very different norms & baseline variances". Maybe calculate each layer separately; then, if they're all similar, average them together, otherwise keep them separate and quote them as separate numbers in your results.
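A sketch of that per-layer approach; the dict layout, the similarity threshold, and the function name are my own invention, and it assumes the `fvu_a` helper from the first snippet:

```python
import numpy as np

def crosscoder_fvu(x_by_layer, pred_by_layer, similarity_tol=0.05):
    """Compute FVU_A per output space; average only if the values are comparable."""
    fvus = {name: fvu_a(x, pred_by_layer[name]) for name, x in x_by_layer.items()}
    values = np.array(list(fvus.values()))
    if values.max() - values.min() < similarity_tol:   # all layers look similar
        return values.mean()                           # safe to quote one number
    return fvus                                        # otherwise report per-layer numbers
```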
FVU_B doesn't make sense, but I don't see where you're getting FVU_B from.
Here’s the code I’m seeing:
Explained variance = 1 - FVU = 1 - (residual sum of squares) / (total sum of squares)
I think this is the sum over the vector dimension, but not over the samples. The sum (mean) over samples is taken later, in this line, which happens after the division.
Let's suppose that's the case. I'm still not clear on how you're getting to FVU_B?
The previous lines calculate the ratio (or 1 − ratio) stored in the "explained variance" key for every sample/batch. Then, in that later quoted line, the list is averaged, i.e. we're taking the sample average over the ratio. That's the FVU_B formula.
Let me know if this clears it up or if we’re misunderstanding each other!
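Schematically, the aggregation order being described looks like the following; this is my own illustration of the pattern, not the actual SAELens code:

```python
import numpy as np

def explained_variance_per_sample(x, x_pred):
    """One (1 - ratio) value per sample: the sums run over dimensions only."""
    mu = x.mean(axis=0)
    ss_err = ((x - x_pred) ** 2).sum(axis=-1)   # (N,) summed over vector dims
    ss_tot = ((x - mu) ** 2).sum(axis=-1)       # (N,) summed over vector dims
    return 1 - ss_err / ss_tot                  # division happens per sample

# Averaging these per-sample ratios afterwards yields 1 - FVU_B, not 1 - FVU_A:
# np.mean(explained_variance_per_sample(x, x_pred))
```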