
Hoagy

Karma: 745

Sparse Autoencoders Find Highly Interpretable Directions in Language Models

21 Sep 2023 15:30 UTC
122 points
5 comments, 5 min read

AutoInterpretation Finds Sparse Coding Beats Alternatives

Hoagy, 17 Jul 2023 1:41 UTC
50 points
1 comment, 7 min read

[Replication] Conjecture’s Sparse Coding in Small Transformers

16 Jun 2023 18:02 UTC
52 points
0 comments, 5 min read

[Replication] Conjecture’s Sparse Coding in Toy Models

2 Jun 2023 17:34 UTC
21 points
0 comments, 1 min read