Dillon Plunkett
Karma: 55
Tests of LLM introspection need to rule out causal bypassing
Adam Morris and Dillon Plunkett, 28 Nov 2025 17:42 UTC, 49 points, 6 comments, 4 min read
Self-interpretability: LLMs can describe complex internal processes that drive their decisions
Adam Morris and Dillon Plunkett, 14 Nov 2025 0:18 UTC, 12 points, 0 comments, 4 min read