RSS

Buck(Buck Shlegeris)

Karma: 3,202

Causal scrub­bing: re­sults on in­duc­tion heads

3 Dec 2022 0:59 UTC
32 points
0 comments17 min readLW link

Causal scrub­bing: re­sults on a paren bal­ance checker

3 Dec 2022 0:59 UTC
25 points
0 comments30 min readLW link

Causal scrub­bing: Appendix

3 Dec 2022 0:58 UTC
15 points
0 comments20 min readLW link

Causal Scrub­bing: a method for rigor­ously test­ing in­ter­pretabil­ity hy­pothe­ses [Red­wood Re­search]

3 Dec 2022 0:58 UTC
124 points
9 comments20 min readLW link

Multi-Com­po­nent Learn­ing and S-Curves

30 Nov 2022 1:37 UTC
57 points
24 comments7 min readLW link

Some Les­sons Learned from Study­ing Indi­rect Ob­ject Iden­ti­fi­ca­tion in GPT-2 small

28 Oct 2022 23:55 UTC
86 points
5 comments9 min readLW link
(arxiv.org)

Ap­ply to the Red­wood Re­search Mechanis­tic In­ter­pretabil­ity Ex­per­i­ment (REMIX), a re­search pro­gram in Berkeley

27 Oct 2022 1:32 UTC
134 points
14 comments12 min readLW link

Help out Red­wood Re­search’s in­ter­pretabil­ity team by find­ing heuris­tics im­ple­mented by GPT-2 small

12 Oct 2022 21:25 UTC
49 points
11 comments4 min readLW link

Poly­se­man­tic­ity and Ca­pac­ity in Neu­ral Networks

7 Oct 2022 17:51 UTC
78 points
9 comments3 min readLW link

Lan­guage mod­els seem to be much bet­ter than hu­mans at next-to­ken prediction

11 Aug 2022 17:45 UTC
152 points
56 comments13 min readLW link