hrdkbhatnagar

Karma: 148

(Not) Explaining GPT-2-Small Forward Passes with Edge-Level Autoencoder Circuits

David Udell, hrdkbhatnagar and JacksonKaunismaa

22 Jul 2025 20:36 UTC

23 points

0 comments · 6 min read · LW link

Compositionality and Ambiguity: Latent Co-occurrence and Interpretable Subspaces

Matthew A. Clarke, hrdkbhatnagar and Joseph Bloom

20 Dec 2024 15:16 UTC

36 points

0 comments · 37 min read · LW link

Toy Models of Feature Absorption in SAEs

chanind, hrdkbhatnagar, TomasD and Joseph Bloom

7 Oct 2024 9:56 UTC

49 points

8 comments · 10 min read · LW link

[Paper] A is for Absorption: Studying Feature Splitting and Absorption in Sparse Autoencoders

chanind, TomasD, hrdkbhatnagar and Joseph Bloom

25 Sep 2024 9:31 UTC

74 points

19 comments · 3 min read · LW link

(arxiv.org)