Sid Black

Karma: 684

The Singular Value Decompositions of Transformer Weight Matrices are Highly Interpretable

28 Nov 2022 12:54 UTC
155 points
25 comments, 31 min read

Conjecture Second Hiring Round

23 Nov 2022 17:11 UTC
83 points
0 comments, 1 min read

Conjecture: a retrospective after 8 months of work

23 Nov 2022 17:10 UTC
179 points
9 comments, 8 min read

Current themes in mechanistic interpretability research

16 Nov 2022 14:14 UTC
82 points
3 comments, 12 min read

Interpreting Neural Networks through the Polytope Lens

23 Sep 2022 17:58 UTC
122 points
26 comments, 33 min read

Conjecture: Internal Infohazard Policy

29 Jul 2022 19:07 UTC
118 points
6 comments, 19 min read