jacob_drori
Karma: 565
jacob_drori’s Shortform
jacob_drori
1 Aug 2025 17:47 UTC, 7 points, 6 comments, 1 min read
[Research Note] Optimizing The Final Output Can Obfuscate CoT
lukemarks, jacob_drori, cloud and TurnTrout
30 Jul 2025 21:26 UTC, 196 points, 22 comments, 6 min read
SAE on activation differences
Santiago Aranguri, jacob_drori and Neel Nanda
30 Jun 2025 17:50 UTC, 44 points, 3 comments, 5 min read
Sparsely-connected Cross-layer Transcoders
jacob_drori
18 Jun 2025 17:13 UTC, 45 points, 3 comments, 12 min read
There is a globe in your LLM
jacob_drori
8 Oct 2024 0:43 UTC, 89 points, 4 comments, 1 min read
Domain-specific SAEs
jacob_drori
7 Oct 2024 20:15 UTC, 28 points, 2 comments, 5 min read
Open Source Automated Interpretability for Sparse Autoencoder Features (blog.eleuther.ai)
kh4dien, SrGonao, jacob_drori and Nora Belrose
30 Jul 2024 21:11 UTC, 67 points, 1 comment, 13 min read