An important note here is that our final 80% loss recovered result corresponds to a loss which is worse than that of a constant model! (aka, a rock)
Specifically, note that the dataset consists of 26% balanced sequences, as discussed here. So the loss of a constant model is −(0.26·log(0.26) + 0.74·log(0.74)) ≈ 0.573 (i.e., the entropy of the labels). This is less than the 1.22 loss we get for our final 72% loss recovered experiment.
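For concreteness, here is a quick check of that arithmetic (natural log, using the 26% base rate from above):

```python
import math

p = 0.26  # fraction of balanced sequences in the dataset

# The best constant predictor outputs p for every sequence, so its
# cross-entropy loss equals the entropy of the labels (in nats).
constant_model_loss = -(p * math.log(p) + (1 - p) * math.log(1 - p))
print(constant_model_loss)  # ~0.573
```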
We think this explanation is still quite meaningful: adversarially constructed causal scrubbing hypotheses preserve variance, and the model is very confident, so reaching this loss at all is nontrivial. For instance, if you train an affine model mapping the log probability the scrubbed model outputs to a new logit, optimized for loss on the dataset, I expect this does much better than a constant model (a rock).
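As a concrete version of that recalibration check, here is a minimal sketch; the scrubbed_logprobs and labels arrays are hypothetical placeholders standing in for the scrubbed model's actual output log probabilities and the true labels:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

# Hypothetical placeholders: real inputs would be the scrubbed model's
# log probability of "balanced" per sequence, plus the true labels.
rng = np.random.default_rng(0)
scrubbed_logprobs = rng.normal(size=(1000, 1))
labels = rng.random(1000) < 0.26

# Fit new_logit = a * scrubbed_logprob + b by minimizing cross-entropy;
# a large C effectively disables sklearn's default regularization.
affine = LogisticRegression(C=1e6).fit(scrubbed_logprobs, labels)

recalibrated_loss = log_loss(labels, affine.predict_proba(scrubbed_logprobs))
# With the scrubbed model's real outputs, the expectation above is that
# this beats the 0.573 constant-model loss.
print(recalibrated_loss)
```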
We probably should have included the constant-model comparison in the table and related discussion (like in this comment) in the original post. Sorry. We may edit this into the post later.
I’m not against evaluating models in ways where they’re worse than rocks; I just think you shouldn’t expect anyone else to care about your worse-than-rock numbers without very extensive justification (including adversarial stuff).