Archive
Sequences
About
Search
Log In
Questions
Events
Shortform
Alignment Forum
AF Comments
Home
Featured
All
Tags
Recent
Comments
Joseph Miller comments on
Towards Monosemanticity: Decomposing Language Models With Dictionary Learning
Joseph Miller
7 Oct 2023 0:45 UTC
6
points
2
Looking at their
interactive visualization
, I was surprised how clean random learned features are.
Back to top
Looking at their interactive visualization, I was surprised how clean random learned features are.