Georg Lange

Karma: 125

Automating Interpretability with Agents

1 May 2026 2:59 UTC
9 points
0 comments, 10 min read

Cross-Layer Transcoders are incentivized to learn Unfaithful Circuits

2 Feb 2026 21:32 UTC
46 points
6 comments, 18 min read

SAEs Discover Meaningful Features in the IOI Task

5 Jun 2024 23:48 UTC
15 points
2 comments, 10 min read

An Interpretability Illusion for Activation Patching of Arbitrary Subspaces

29 Aug 2023 1:04 UTC
77 points
4 comments, 1 min read