Tuna
Karma: 34
Access to agent CoT makes monitors vulnerable to persuasion
Nikita Ostrovsky, Julija Bainiaksina, Tuna and Vika
25 Jul 2025 16:09 UTC, 18 points, 0 comments, 4 min read, LW link
Lessons from a year of university AI safety field building
yix, afterless, Parv Mahajan, Andersehen, Tuna and neverix
6 Jun 2025 14:35 UTC, 28 points, 3 comments, 7 min read, LW link