These puzzles are great, thanks for making them!
Connor Kissane
Sparse Autoencoders Work on Attention Layer Outputs
Attention SAEs Scale to GPT-2 Small
Code for this token filtering can be found in the appendix and the exact token list is linked.
Maybe I just missed it, but I’m not seeing this. Is the code still available?
Amazing! We found your original library super useful for our Attention SAEs research, so thanks for making this!