Arthur Conmy

Karma: 775

Interpretability

Views my own

At­ten­tion SAEs Scale to GPT-2 Small

3 Feb 2024 6:50 UTC
69 points
1 comment · 8 min read · LW link

Sparse Au­toen­coders Work on At­ten­tion Layer Outputs

16 Jan 2024 0:26 UTC
80 points
4 comments · 19 min read · LW link

My best guess at the im­por­tant tricks for train­ing 1L SAEs

21 Dec 2023 1:59 UTC
35 points
4 comments · 3 min read · LW link