RSS

Dan H

Karma: 1,310

$20 Million in NSF Grants for Safety Research

Dan H28 Feb 2023 4:44 UTC
153 points
12 comments1 min readLW link

There are no co­her­ence theorems

20 Feb 2023 21:25 UTC
74 points
94 comments19 min readLW link

[MLSN #8] Mechanis­tic in­ter­pretabil­ity, us­ing law to in­form AI al­ign­ment, scal­ing laws for proxy gaming

20 Feb 2023 15:54 UTC
20 points
0 comments4 min readLW link
(newsletter.mlsafety.org)

[MLSN #7]: an ex­am­ple of an emer­gent in­ter­nal optimizer

9 Jan 2023 19:39 UTC
28 points
0 comments6 min readLW link

[MLSN #6]: Trans­parency sur­vey, prov­able ro­bust­ness, ML mod­els that pre­dict the future

Dan H12 Oct 2022 20:56 UTC
26 points
0 comments6 min readLW link

[MLSN #5]: Prize Compilation

Dan H26 Sep 2022 21:55 UTC
14 points
1 comment2 min readLW link

An­nounc­ing the In­tro­duc­tion to ML Safety course

6 Aug 2022 2:46 UTC
73 points
6 comments7 min readLW link

$20K In Boun­ties for AI Safety Public Materials

5 Aug 2022 2:52 UTC
68 points
8 comments6 min readLW link