Georg Lange

Karma: 60

SAEs Discover Meaningful Features in the IOI Task

5 Jun 2024 23:48 UTC
4 points
0 comments
9 min read

An Interpretability Illusion for Activation Patching of Arbitrary Subspaces

29 Aug 2023 1:04 UTC
75 points
4 comments
1 min read