Nevan Wichers
Karma: 159
Inoculation prompting: Instructing models to misbehave at train-time can improve run-time behavior
Sam Marks, Nevan Wichers, Daniel Tan, Aram Ebtekar, Jozdien, David Africa, Alex Mallen and Fabien Roger
8 Oct 2025 22:02 UTC, 156 points, 37 comments, 2 min read
Visualizing neural network planning
Nevan Wichers, Victor Tao, Fazl and Riccardo Volpato
9 May 2024 6:40 UTC, 4 points, 0 comments, 5 min read
A Variance Indifferent Maximizer Alternative
Nevan Wichers
13 Feb 2020 9:06 UTC, 7 points, 1 comment, 4 min read