TomasD
Karma: 147

A Bunch of Matryoshka SAEs
chanind, TomasD and Adrià Garriga-alonso
4 Apr 2025 14:53 UTC, 29 points, 0 comments, 8 min read

Feature Hedging: Another way correlated features break SAEs
chanind, TomasD and Adrià Garriga-alonso
25 Mar 2025 14:33 UTC, 22 points, 0 comments, 18 min read

Toy Models of Feature Absorption in SAEs
chanind, hrdkbhatnagar, TomasD and Joseph Bloom
7 Oct 2024 9:56 UTC, 49 points, 8 comments, 10 min read

[Paper] A is for Absorption: Studying Feature Splitting and Absorption in Sparse Autoencoders
chanind, TomasD, hrdkbhatnagar and Joseph Bloom
25 Sep 2024 9:31 UTC, 73 points, 16 comments, 3 min read (arxiv.org)

TomasD’s Shortform
TomasD
14 Mar 2024 15:03 UTC, 1 point, 0 comments, 1 min read