Thane Ruthenis comments on leogao’s Shortform

Thane Ruthenis 16 Nov 2025 6:18 UTC
31 points
0
> our recent paper
Aside: For me, this paper is potentially the most exciting interpretability result of the past several years (since SAEs). Scaling it to GPT-3 and beyond seems like a very promising direction. Great job!
- habryka 16 Nov 2025 8:37 UTC
  13 points
  −6
  I agree! I admit I am not optimistic, but I am still very glad to see this.