RowanWang
Karma: 282
https://rowankwang.com/
Modifying LLM Beliefs with Synthetic Document Finetuning
RowanWang, Johannes Treutlein, Avery, Ethan Perez, Fabien Roger and Sam Marks
Apr 24, 2025, 9:15 PM; 70 points; 12 comments; 2 min read (alignment.anthropic.com)
Some Lessons Learned from Studying Indirect Object Identification in GPT-2 small
RowanWang, Alexandre Variengien, Arthur Conmy, Buck and jsteinhardt
Oct 28, 2022, 11:55 PM; 101 points; 9 comments; 9 min read; 2 reviews (arxiv.org)
Gears-Level Mental Models of Transformer Interpretability
RowanWang
Mar 29, 2022, 8:09 PM; 73 points; 4 comments; 6 min read
Lessons After a Couple Months of Trying to Do ML Research
RowanWang
Mar 22, 2022, 11:45 PM; 70 points; 8 comments; 6 min read