Georg Lange
Karma: 73
SAEs Discover Meaningful Features in the IOI Task
Alex Makelov, Georg Lange and Neel Nanda
Jun 5, 2024, 11:48 PM
15 points, 2 comments, 10 min read
An Interpretability Illusion for Activation Patching of Arbitrary Subspaces
Georg Lange, Alex Makelov and Neel Nanda
Aug 29, 2023, 1:04 AM
77 points, 4 comments, 1 min read