RSS

Arthur Conmy

Karma: 1,139

Intepretability

Views my own

JumpReLU SAEs + Early Ac­cess to Gemma 2 SAEs

19 Jul 2024 16:10 UTC
44 points
8 comments1 min readLW link
(storage.googleapis.com)

SAEs (usu­ally) Trans­fer Between Base and Chat Models

18 Jul 2024 10:29 UTC
40 points
0 comments10 min readLW link

At­ten­tion Out­put SAEs Im­prove Cir­cuit Analysis

21 Jun 2024 12:56 UTC
31 points
0 comments19 min readLW link