Henk Tillman
Karma: 46
Investigating task-specific prompts and sparse autoencoders for activation monitoring
Henk Tillman · 30 Apr 2025 17:09 UTC · 23 points · 0 comments · 1 min read · LW link (arxiv.org)
Transformer Debugger
Henk Tillman · 12 Mar 2024 19:08 UTC · 26 points · 0 comments · 1 min read · LW link (github.com)