Thank you for looking into this.
This investigation updated me more toward thinking that Computation in Superposition is unlikely to train in this kind of setup, because CiS is mainly concerned with minimising worst-case noise: it does lots of things, but it does them all with low precision. A task where the model is scored, via MSE loss, on how close it gets to many continuously-valued labels is not a good fit for this.
I think we need a task where the labels are somehow more discrete, or where the loss function punishes outlier errors more heavily, or where the computation has multiple steps, so that later steps depend on many intermediate results each computed to low precision.
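To make the first two suggestions concrete, here is a minimal sketch of what such loss functions might look like, assuming a PyTorch training setup. The function names, the choice of exponent p=4, and the binning scheme are illustrative assumptions on my part, not something from the original investigation.

```python
import torch
import torch.nn.functional as F

def outlier_weighted_loss(pred: torch.Tensor, target: torch.Tensor, p: int = 4) -> torch.Tensor:
    # Hypothetical alternative to MSE: raising the per-label error to a
    # higher power penalises large errors much more than small ones,
    # which rewards keeping the worst-case error bounded rather than
    # minimising the average error.
    return (pred - target).abs().pow(p).mean()

def discretized_loss(logits: torch.Tensor, target_bins: torch.Tensor) -> torch.Tensor:
    # Hypothetical "more discrete" variant: each continuous label is
    # bucketed into one of C bins and scored with cross-entropy, so a
    # roughly-correct answer (right bin) scores as well as an exact one.
    # logits: (batch, C), target_bins: (batch,) integer bin indices.
    return F.cross_entropy(logits, target_bins)
```

The intuition behind both variants is the same: under plain MSE, a model can trade a little extra error on every label for the ability to compute many labels at once, whereas a loss that is flat near the correct answer but steep for outliers makes low-precision-but-broad computation relatively more attractive.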