kh4dien
Karma: 120
https://cadentj.com/
Steering Out-of-Distribution Generalization with Concept Ablation Fine-Tuning
kh4dien, Helena Casademunt, Adam Karvonen, Sam Marks, Senthooran Rajamanoharan and Neel Nanda
23 Jul 2025 14:57 UTC
78 points, 7 comments, 5 min read
LW link
Open Source Automated Interpretability for Sparse Autoencoder Features
kh4dien, SrGonao, jacob_drori and Nora Belrose
30 Jul 2024 21:11 UTC
67 points, 1 comment, 13 min read
LW link (blog.eleuther.ai)