keshavs
Karma: 34
Introspection Adapters: Training LLMs to Report Their Learned Behaviors
keshavs, RowanWang, abhayesian, Sam Marks and SoerenMind
28 Apr 2026 19:02 UTC | 41 points | 1 comment | 12 min read | LW | link (alignment.anthropic.com)