
chanind

Karma: 348

Anthropic’s JumpReLU training method is really good

3 Oct 2025 15:23 UTC
39 points
0 comments · 2 min read · LW link

The “Sparsity vs Reconstruction Tradeoff” Illusion

26 Aug 2025 4:39 UTC
21 points
0 comments · 4 min read · LW link

L0 is not a neutral hyperparameter

19 Jul 2025 13:51 UTC
24 points
3 comments · 5 min read · LW link

Sparsity is the enemy of feature extraction (ft. absorption)

3 May 2025 10:13 UTC
32 points
0 comments · 6 min read · LW link

A Bunch of Matryoshka SAEs

4 Apr 2025 14:53 UTC
29 points
0 comments · 8 min read · LW link

Feature Hedging: Another way correlated features break SAEs

25 Mar 2025 14:33 UTC
23 points
0 comments · 18 min read · LW link

Broken Latents: Studying SAEs and Feature Co-occurrence in Toy Models

30 Dec 2024 22:50 UTC
24 points
3 comments · 15 min read · LW link

SAEBench: A Comprehensive Benchmark for Sparse Autoencoders

11 Dec 2024 6:30 UTC
82 points
6 comments · 2 min read · LW link
(www.neuronpedia.org)

Toy Models of Feature Absorption in SAEs

7 Oct 2024 9:56 UTC
49 points
8 comments · 10 min read · LW link

[Paper] A is for Absorption: Studying Feature Splitting and Absorption in Sparse Autoencoders

25 Sep 2024 9:31 UTC
73 points
16 comments · 3 min read · LW link
(arxiv.org)

Auto-matching hidden layers in Pytorch LLMs

chanind · 19 Feb 2024 12:40 UTC
2 points
0 comments · 3 min read · LW link