Dan Braun

Karma: 551

Understanding strategic deception and deceptive alignment

25 Sep 2023 16:27 UTC
51 points
16 comments · 7 min read · LW link
(www.apolloresearch.ai)

Announcing Apollo Research

30 May 2023 16:17 UTC
215 points
11 comments · 8 min read · LW link

A small update to the Sparse Coding interim research report

30 Apr 2023 19:54 UTC
61 points
5 comments · 1 min read · LW link

Navigating public AI x-risk hype while pursuing technical solutions

Dan Braun · 19 Feb 2023 12:22 UTC
18 points
0 comments · 2 min read · LW link

[Interim research report] Taking features out of superposition with sparse autoencoders

13 Dec 2022 15:41 UTC
136 points
22 comments · 22 min read · LW link · 2 reviews

Interpreting Neural Networks through the Polytope Lens

23 Sep 2022 17:58 UTC
136 points
29 comments · 33 min read · LW link