IvanC

Karma: 10

Undergraduate CS student working on mechanistic interpretability and empirical AI safety. Currently focused on residual stream representations and AI alignment.

Single Direction vs Low-Rank Refusal in Small LLMs

IvanC, 2 Mar 2026 23:14 UTC
11 points
0 comments, 8 min read