Magdalena Wache

Karma: 533

The Local Interaction Basis: Identifying Computationally-Relevant and Sparsely Interacting Features in Neural Networks

Lucius Bushnaq, jake_mendel, Dan Braun, StefanHex, Nicholas Goldowsky-Dill, Kaarel, Avery, Joern Stoehler, debrevitatevitae, Magdalena Wache and Marius Hobbhahn

20 May 2024 17:53 UTC

101 points

2 comments, 3 min read

AI Safety Research Organization Incubation Program—Expression of Interest

Alexandra Bos, Magdalena Wache, Kay Kozaronek, Gabe and Catalyze Impact

21 Nov 2023 10:23 UTC

65 points

6 comments, 1 min read

Interpretability Externalities Case Study—Hungry Hungry Hippos

Magdalena Wache

20 Sep 2023 14:42 UTC

64 points

22 comments, 2 min read

Technical AI Safety Research Landscape [Slides]

Magdalena Wache

18 Sep 2023 13:56 UTC

41 points

0 comments, 4 min read

AI Safety Europe Retreat 2023 Retrospective

Magdalena Wache

14 Apr 2023 9:05 UTC

43 points

0 comments, 2 min read

Finite Factored Sets in Pictures

Magdalena Wache

11 Dec 2022 18:49 UTC

174 points

35 comments, 12 min read