Alex Makelov
Karma: 63
An Interpretability Illusion for Activation Patching of Arbitrary Subspaces
Georg Lange, Alex Makelov and Neel Nanda
29 Aug 2023 1:04 UTC
74 points, 3 comments, 1 min read