CallumMcDougall

Karma: 1,525

Six (and a half) intuitions for KL divergence

CallumMcDougall

12 Oct 2022 21:07 UTC

154 points

25 comments, 10 min read, LW link, 1 review

(www.perfectlynormal.co.uk)

A Selection of Randomly Selected SAE Features

CallumMcDougall and Joseph Bloom

1 Apr 2024 9:09 UTC

106 points

2 comments, 4 min read, LW link

AI Alignment Research Engineer Accelerator (ARENA): call for applicants

CallumMcDougall

17 Apr 2023 20:30 UTC

100 points

9 comments, 7 min read, LW link

Induction heads—illustrated

CallumMcDougall

2 Jan 2023 15:35 UTC

92 points

8 comments, 3 min read, LW link

[Paper] All’s Fair In Love And Love: Copy Suppression in GPT-2 Small

CallumMcDougall, Arthur Conmy, starship006, Tom McGrath and Neel Nanda

13 Oct 2023 18:32 UTC

82 points

4 comments, 8 min read, LW link

An Analogy for Understanding Transformers

CallumMcDougall

13 May 2023 12:20 UTC

81 points

5 comments, 9 min read, LW link

Computational Thread Art

CallumMcDougall

6 Aug 2023 21:42 UTC

75 points

2 comments, 6 min read, LW link

SAE-VIS: Announcement Post

CallumMcDougall and Joseph Bloom

31 Mar 2024 15:30 UTC

73 points

8 comments, 1 min read, LW link

Project Intro: Selection Theorems for Modularity

CallumMcDougall, Avery and Lucius Bushnaq

4 Apr 2022 12:59 UTC

71 points

20 comments, 16 min read, LW link

Intro to Superposition & Sparse Autoencoders (Colab exercises)

CallumMcDougall

29 Nov 2023 12:56 UTC

67 points

8 comments, 3 min read, LW link

Six (and a half) intuitions for SVD

CallumMcDougall

4 Jul 2023 19:23 UTC

66 points

1 comment, 1 min read, LW link

Ten experiments in modularity, which we’d like you to run!

CallumMcDougall, Lucius Bushnaq and Avery

16 Jun 2022 9:17 UTC

62 points

3 comments, 9 min read, LW link

AI Alignment Research Engineer Accelerator (ARENA): call for applicants

CallumMcDougall

7 Nov 2023 9:43 UTC

56 points

0 comments, 1 min read, LW link

Theories of Modularity in the Biological Literature

CallumMcDougall, Avery and Lucius Bushnaq

4 Apr 2022 12:48 UTC

51 points

13 comments, 7 min read, LW link

AI Risk Intro 1: Advanced AI Might Be Very Bad

CallumMcDougall and L Rudolf L

11 Sep 2022 10:57 UTC

46 points

13 comments, 30 min read, LW link

What Is The True Name of Modularity?

CallumMcDougall, Lucius Bushnaq and Avery

1 Jul 2022 14:55 UTC

38 points

10 comments, 12 min read, LW link

The Natural Abstraction Hypothesis: Implications and Evidence

CallumMcDougall

14 Dec 2021 23:14 UTC

37 points

8 comments, 19 min read, LW link

Basin broadness depends on the size and number of orthogonal features

CallumMcDougall, Avery and Lucius Bushnaq

27 Aug 2022 17:29 UTC

36 points

21 comments, 6 min read, LW link

How I use Anki: expanding the scope of SRS

CallumMcDougall

12 Apr 2022 8:28 UTC

36 points

8 comments, 19 min read, LW link

Mech Interp Challenge: September—Deciphering the Addition Model

CallumMcDougall

13 Sep 2023 22:23 UTC

35 points

0 comments, 4 min read, LW link