Andrew Mack
Karma: 333
Deep Causal Transcoding: A Framework for Mechanistically Eliciting Latent Behaviors in Language Models
Andrew Mack and TurnTrout, 3 Dec 2024 21:19 UTC
107 points, 8 comments, 41 min read, LW link
Mechanistically Eliciting Latent Behaviors in Language Models
Andrew Mack and TurnTrout, 30 Apr 2024 18:51 UTC
225 points, 44 comments, 45 min read, LW link, 1 review