CallumMcDougall

Karma: 1,677

AI Alignment Research Engineer Accelerator (ARENA): Call for applicants v4.0

6 Jul 2024 11:34 UTC
57 points
7 comments · 6 min read · LW link

How ARENA course material gets made

CallumMcDougall · 2 Jul 2024 18:04 UTC
41 points
2 comments · 7 min read · LW link

A Selection of Randomly Selected SAE Features

1 Apr 2024 9:09 UTC
109 points
2 comments · 4 min read · LW link

SAE-VIS: Announcement Post

31 Mar 2024 15:30 UTC
74 points
8 comments · 1 min read · LW link

Mech Interp Challenge: January - Deciphering the Caesar Cipher Model

CallumMcDougall · 1 Jan 2024 18:03 UTC
17 points
0 comments · 3 min read · LW link

Interpretability with Sparse Autoencoders (Colab exercises)

CallumMcDougall · 29 Nov 2023 12:56 UTC
74 points
9 comments · 4 min read · LW link

AI Alignment Research Engineer Accelerator (ARENA): call for applicants

CallumMcDougall · 7 Nov 2023 9:43 UTC
56 points
0 comments · 1 min read · LW link

Mech Interp Challenge: November - Deciphering the Cumulative Sum Model

CallumMcDougall · 2 Nov 2023 17:10 UTC
18 points
2 comments · 2 min read · LW link

[Paper] All’s Fair In Love And Love: Copy Suppression in GPT-2 Small

13 Oct 2023 18:32 UTC
82 points
4 comments · 8 min read · LW link

Mech Interp Challenge: October - Deciphering the Sorted List Model

CallumMcDougall · 3 Oct 2023 10:57 UTC
23 points
0 comments · 3 min read · LW link

ARENA 2.0 - Impact Report

CallumMcDougall · 26 Sep 2023 17:13 UTC
35 points
5 comments · 13 min read · LW link

Mech Interp Challenge: September - Deciphering the Addition Model

CallumMcDougall · 13 Sep 2023 22:23 UTC
35 points
0 comments · 4 min read · LW link

Mech Interp Challenge: August - Deciphering the First Unique Character Model

CallumMcDougall · 9 Aug 2023 19:14 UTC
36 points
1 comment · 3 min read · LW link

Computational Thread Art

CallumMcDougall · 6 Aug 2023 21:42 UTC
75 points
2 comments · 6 min read · LW link

Six (and a half) intuitions for SVD

CallumMcDougall · 4 Jul 2023 19:23 UTC
69 points
1 comment · 1 min read · LW link

An Analogy for Understanding Transformers

CallumMcDougall · 13 May 2023 12:20 UTC
89 points
6 comments · 9 min read · LW link

AI Alignment Research Engineer Accelerator (ARENA): call for applicants

CallumMcDougall · 17 Apr 2023 20:30 UTC
100 points
9 comments · 7 min read · LW link

Induction heads - illustrated

CallumMcDougall · 2 Jan 2023 15:35 UTC
109 points
9 comments · 3 min read · LW link

Six (and a half) intuitions for KL divergence

CallumMcDougall · 12 Oct 2022 21:07 UTC
162 points
25 comments · 10 min read · LW link · 1 review
(www.perfectlynormal.co.uk)

AI Risk Intro 2: Solving The Problem

22 Sep 2022 13:55 UTC
22 points
0 comments · 27 min read · LW link