Peter Lai
Karma: 35
Mechanistic Interpretability Enthusiast
Proof-of-Concept Debugger for a Small LLM
Peter Lai and StefanHex, 17 Mar 2025 22:27 UTC, 27 points, 0 comments, 11 min read
SAE regularization produces more interpretable models
Peter Lai and StefanHex, 28 Jan 2025 20:02 UTC, 21 points, 7 comments, 4 min read
Peter Lai’s Shortform
Peter Lai, 25 Jan 2025 19:41 UTC, 3 points, 0 comments, 1 min read