Magdalena Wache
Karma: 566
The Local Interaction Basis: Identifying Computationally-Relevant and Sparsely Interacting Features in Neural Networks
Lucius Bushnaq, jake_mendel, Dan Braun, StefanHex, Nicholas Goldowsky-Dill, Kaarel, Avery, Joern Stoehler, debrevitatevitae, Magdalena Wache and Marius Hobbhahn
May 20, 2024, 5:53 PM
108 points, 4 comments, 3 min read
Interpretability Externalities Case Study—Hungry Hungry Hippos
Magdalena Wache
Sep 20, 2023, 2:42 PM
64 points, 22 comments, 2 min read
Technical AI Safety Research Landscape [Slides]
Magdalena Wache
Sep 18, 2023, 1:56 PM
49 points, 2 comments, 4 min read
AI Safety Europe Retreat 2023 Retrospective
Magdalena Wache
Apr 14, 2023, 9:05 AM
43 points, 0 comments, 2 min read
Finite Factored Sets in Pictures
Magdalena Wache
Dec 11, 2022, 6:49 PM
183 points, 35 comments, 12 min read