David Udell

Karma: 2,592

Explaining GPT-2-Small Forward Passes with Edge-Level Autoencoder Circuits

Jul 22, 2025, 8:36 PM
23 points

8 votes

0 comments, 6 min read, LW link

Why Can’t We Hypothesize After the Fact?

David Udell
Feb 26, 2025, 10:41 PM
40 points

15 votes

3 comments, 2 min read, LW link

Causal Graphs of GPT-2-Small’s Residual Stream

David Udell
Jul 9, 2024, 10:06 PM
53 points

18 votes

7 comments, 7 min read, LW link

Sparse Coding, for Mechanistic Interpretability and Activation Engineering

David Udell
Sep 23, 2023, 7:16 PM
42 points

19 votes

7 comments, 34 min read, LW link

ActAdd: Steering Language Models without Optimization

Sep 6, 2023, 5:21 PM
105 points

31 votes

3 comments, 2 min read, LW link
(arxiv.org)

Steering GPT-2-XL by adding an activation vector

May 13, 2023, 6:42 PM
439 points

206 votes

98 comments, 50 min read, LW link, 1 review