aksh-n
Karma: 27
An ML engineer and ethicist turned AI alignment researcher.
Training Deliberative Monitors for Black-Box Scheming Detection
aksh-n, adityasinha, Victor Gillioz, Simon Storf, Kilian Merkelbach, richbc, Axel Højmark and Marius Hobbhahn
4 Jun 2026 16:43 UTC, 33 points, 5 comments, 6 min read
Contextual Constitutional AI
aksh-n
28 Sep 2024 23:24 UTC, 16 points, 2 comments, 12 min read