Kaarel

Karma: 674

kaarelh AT gmail DOT com

personal website

Finding the estimate of the value of a state in RL agents

3 Jun 2024 20:26 UTC
7 points
4 comments, 4 min read

Interpretability: Integrated Gradients is a decent attribution method

20 May 2024 17:55 UTC
22 points
7 comments, 6 min read

The Local Interaction Basis: Identifying Computationally-Relevant and Sparsely Interacting Features in Neural Networks

20 May 2024 17:53 UTC
105 points
4 comments, 3 min read

A starting point for making sense of task structure (in machine learning)

24 Feb 2024 1:51 UTC
45 points
2 comments, 12 min read

Toward A Mathematical Framework for Computation in Superposition

18 Jan 2024 21:06 UTC
193 points
17 comments, 73 min read

Grokking, memorization, and generalization — a discussion

29 Oct 2023 23:17 UTC
66 points
10 comments, 23 min read