RSS

Evan Anders

Karma: 100

Postdoc at KITP studying AI safety /​ Mech Interp. Former astrophysical fluid dynamicist. Website: https://​​evanhanders.bitbucket.io/​​

Sparse au­toen­coders find com­posed fea­tures in small toy mod­els

14 Mar 2024 18:00 UTC
28 points
10 comments15 min readLW link

Ex­am­in­ing Lan­guage Model Perfor­mance with Re­con­structed Ac­ti­va­tions us­ing Sparse Au­toen­coders

27 Feb 2024 2:43 UTC
39 points
16 comments15 min readLW link

How poly­se­man­tic can one neu­ron be? In­ves­ti­gat­ing fea­tures in TinyS­to­ries.

Evan Anders16 Jan 2024 19:10 UTC
12 points
0 comments8 min readLW link
(evanhanders.blog)

How does a toy 2 digit sub­trac­tion trans­former pre­dict the differ­ence?

Evan Anders22 Dec 2023 21:17 UTC
12 points
0 comments10 min readLW link
(evanhanders.blog)

How does a toy 2 digit sub­trac­tion trans­former pre­dict the sign of the out­put?

Evan Anders19 Dec 2023 18:56 UTC
14 points
0 comments8 min readLW link
(evanhanders.blog)