CallumMcDougall
Karma: 2,008
New Cause Area Proposal
CallumMcDougall
Apr 1, 2025, 7:12 AM
109
points
4
comments
1
min read
LW
link
Negative Results for SAEs On Downstream Tasks and Deprioritising SAE Research (GDM Mech Interp Team Progress Update #2)
lewis smith
,
Senthooran Rajamanoharan
,
Arthur Conmy
,
CallumMcDougall
,
Tom Lieberum
,
János Kramár
,
Rohin Shah
and
Neel Nanda
Mar 26, 2025, 7:07 PM
111
points
15
comments
29
min read
LW
link
(deepmindsafetyresearch.medium.com)
ARENA 5.0 - Call for Applicants
JamesH
,
James Fox
,
CallumMcDougall
,
Chloe Li
and
David Quarel
Jan 30, 2025, 1:18 PM
35
points
2
comments
6
min read
LW
link
Scaling Sparse Feature Circuit Finding to Gemma 9B
Diego Caples
,
Jatin Nainani
,
CallumMcDougall
and
rrenaud
Jan 10, 2025, 11:08 AM
86
points
11
comments
17
min read
LW
link
SAEBench: A Comprehensive Benchmark for Sparse Autoencoders
Can
,
Adam Karvonen
,
Johnny Lin
,
Curt Tigges
,
Joseph Bloom
,
chanind
,
Yeu-Tong Lau
,
Eoin Farrell
,
Arthur Conmy
,
CallumMcDougall
,
Kola Ayonrinde
,
Matthew Wearden
,
Sam Marks
and
Neel Nanda
Dec 11, 2024, 6:30 AM
82
points
6
comments
2
min read
LW
link
(www.neuronpedia.org)
AI Alignment Research Engineer Accelerator (ARENA): Call for applicants v4.0
James Fox
,
Chloe Li
,
JamesH
,
Gracie Green
and
CallumMcDougall
Jul 6, 2024, 11:34 AM
57
points
7
comments
6
min read
LW
link
How ARENA course material gets made
CallumMcDougall
Jul 2, 2024, 6:04 PM
41
points
2
comments
7
min read
LW
link
A Selection of Randomly Selected SAE Features
CallumMcDougall
and
Joseph Bloom
Apr 1, 2024, 9:09 AM
109
points
2
comments
4
min read
LW
link
SAE-VIS: Announcement Post
CallumMcDougall
and
Joseph Bloom
Mar 31, 2024, 3:30 PM
74
points
8
comments
1
min read
LW
link
Mech Interp Challenge: January—Deciphering the Caesar Cipher Model
CallumMcDougall
Jan 1, 2024, 6:03 PM
17
points
0
comments
3
min read
LW
link
Interpretability with Sparse Autoencoders (Colab exercises)
CallumMcDougall
Nov 29, 2023, 12:56 PM
76
points
9
comments
4
min read
LW
link
AI Alignment Research Engineer Accelerator (ARENA): call for applicants
CallumMcDougall
Nov 7, 2023, 9:43 AM
56
points
0
comments
LW
link
Mech Interp Challenge: November—Deciphering the Cumulative Sum Model
CallumMcDougall
Nov 2, 2023, 5:10 PM
18
points
2
comments
2
min read
LW
link
[Paper] All’s Fair In Love And Love: Copy Suppression in GPT-2 Small
CallumMcDougall
,
Arthur Conmy
,
Cody Rushing
,
Tom McGrath
and
Neel Nanda
Oct 13, 2023, 6:32 PM
82
points
4
comments
8
min read
LW
link
Mech Interp Challenge: October—Deciphering the Sorted List Model
CallumMcDougall
Oct 3, 2023, 10:57 AM
23
points
0
comments
3
min read
LW
link
ARENA 2.0 - Impact Report
CallumMcDougall
Sep 26, 2023, 5:13 PM
35
points
5
comments
13
min read
LW
link
Mech Interp Challenge: September—Deciphering the Addition Model
CallumMcDougall
Sep 13, 2023, 10:23 PM
35
points
0
comments
4
min read
LW
link
Mech Interp Challenge: August—Deciphering the First Unique Character Model
CallumMcDougall
Aug 9, 2023, 7:14 PM
36
points
1
comment
3
min read
LW
link
Computational Thread Art
CallumMcDougall
Aug 6, 2023, 9:42 PM
76
points
2
comments
6
min read
LW
link
Six (and a half) intuitions for SVD
CallumMcDougall
Jul 4, 2023, 7:23 PM
71
points
1
comment
1
min read
LW
link