Ruixuan Huang comments on "Subspace Rerouting: Using Mechanistic Interpretability to Craft Adversarial Attacks against Large Language Models"
Ruixuan Huang
22 Mar 2025 3:39 UTC
Great job! Consider reading our related paper:
https://arxiv.org/abs/2404.12038