RGRGRG
Karma: 138
Cross-Layer Transcoders are incentivized to learn Unfaithful Circuits
Georg Lange, RGRGRG, Kat Dearstyne and Kamal Maher
2 Feb 2026 21:32 UTC, 40 points, 6 comments, 18 min read
Alternative Models of Superposition
zroe1 and RGRGRG
11 Aug 2025 15:52 UTC, 20 points, 6 comments, 5 min read
Seeking Feedback on My Mechanistic Interpretability Research Agenda
RGRGRG
12 Sep 2023 18:45 UTC, 5 points, 1 comment, 3 min read
Thoughts about the Mechanistic Interpretability Challenge #2 (EIS VII #2)
RGRGRG
28 Jul 2023 20:44 UTC, 26 points, 5 comments, 20 min read
[Question] Best Ways to Try to Get Funding for Alignment Research?
RGRGRG
4 Apr 2023 6:35 UTC, 10 points, 6 comments, 1 min read