Lee Sharkey

Karma: 1,088

Research engineer at Apollo Research (London).

My main research interests are mechanistic interpretability and inner alignment.

Gated Attention Blocks: Preliminary Progress toward Removing Attention Head Superposition

8 Apr 2024 11:14 UTC
33 points
4 comments · 15 min read · LW link

Sparsify: A mechanistic interpretability research agenda

Lee Sharkey · 3 Apr 2024 12:34 UTC
85 points
20 comments · 22 min read · LW link

Addressing Feature Suppression in SAEs

16 Feb 2024 18:32 UTC
80 points
3 comments · 10 min read · LW link

Theories of Change for AI Auditing

13 Nov 2023 19:33 UTC
59 points
0 comments · 18 min read · LW link
(www.apolloresearch.ai)

Announcing Apollo Research

30 May 2023 16:17 UTC
215 points
11 comments · 8 min read · LW link