For some applications, you may want to express an intervention in terms of the model's own abstractions.
It seems like this applies to some kinds of activation steering (e.g., steering on SAE features) but not really to others (e.g., contrastive prompts); curious whether you would agree.
Perhaps. I see where you are coming from. Though I think it's possible that contrastive-prompt-based vectors (e.g., CAA) also approximate "natural" features better than training on those same prompts would (fewer degrees of freedom, with the correct inductive bias). I should check whether there has been new research on this…
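For readers unfamiliar with the contrastive approach being discussed: a CAA-style steering vector is typically computed as the mean difference in hidden-state activations between paired "positive" and "negative" prompts at a chosen layer. A minimal sketch, using random arrays as stand-ins for real model activations (the dimensions and scaling factor here are illustrative, not from any particular model):

```python
import numpy as np

rng = np.random.default_rng(0)
d_model = 16   # hidden-state width (illustrative)
n_pairs = 8    # number of contrastive prompt pairs

# Stand-ins for residual-stream activations collected at one layer
# on positive vs. negative prompts.
pos_acts = rng.normal(size=(n_pairs, d_model))
neg_acts = rng.normal(size=(n_pairs, d_model))

# The steering vector is the mean activation difference across pairs.
steering_vec = (pos_acts - neg_acts).mean(axis=0)

# At inference time, a scaled copy is added to the hidden state.
alpha = 4.0  # steering strength (illustrative)
hidden = rng.normal(size=d_model)
steered = hidden + alpha * steering_vec
print(steering_vec.shape)
```

The point at issue above is whether this averaging, precisely because it has so few free parameters, lands closer to a "natural" feature direction than a model fine-tuned on the same contrastive data would.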
Thanks! If you find research that addresses that question, I’d be interested to know about it.