Dan H

Karma: 960

[MLSN #6]: Transparency survey, provable robustness, ML models that predict the future

Dan H · 12 Oct 2022 20:56 UTC
26 points
0 comments · 6 min read · LW link

[MLSN #5]: Prize Compilation

Dan H · 26 Sep 2022 21:55 UTC
14 points
1 comment · 2 min read · LW link

Announcing the Introduction to ML Safety course

6 Aug 2022 2:46 UTC
69 points
6 comments · 7 min read · LW link

$20K In Bounties for AI Safety Public Materials

5 Aug 2022 2:52 UTC
68 points
7 comments · 6 min read · LW link

NeurIPS ML Safety Workshop 2022

Dan H · 26 Jul 2022 15:28 UTC
72 points
2 comments · 1 min read · LW link
(neurips2022.mlsafety.org)

[Linkpost] Existential Risk Analysis in Empirical Research Papers

Dan H · 2 Jul 2022 0:09 UTC
40 points
0 comments · 1 min read · LW link
(arxiv.org)

Paper: Forecasting world events with neural nets

1 Jul 2022 19:40 UTC
39 points
3 comments · 4 min read · LW link

Open Problems in AI X-Risk [PAIS #5]

10 Jun 2022 2:08 UTC
50 points
3 comments · 36 min read · LW link

[MLSN #4]: Many New Interpretability Papers, Virtual Logit Matching, Rationalization Helps Robustness

Dan H · 3 Jun 2022 1:20 UTC
18 points
0 comments · 4 min read · LW link

Perform Tractable Research While Avoiding Capabilities Externalities [Pragmatic AI Safety #4]

30 May 2022 20:25 UTC
43 points
3 comments · 25 min read · LW link

Complex Systems for AI Safety [Pragmatic AI Safety #3]

24 May 2022 0:00 UTC
48 points
2 comments · 21 min read · LW link

Actionable-guidance and roadmap recommendations for the NIST AI Risk Management Framework

17 May 2022 15:26 UTC
25 points
0 comments · 3 min read · LW link

A Bird’s Eye View of the ML Field [Pragmatic AI Safety #2]

9 May 2022 17:18 UTC
125 points
5 comments · 35 min read · LW link

Introduction to Pragmatic AI Safety [Pragmatic AI Safety #1]

9 May 2022 17:06 UTC
69 points
1 comment · 6 min read · LW link

Introducing the ML Safety Scholars Program

4 May 2022 16:01 UTC
73 points
2 comments · 3 min read · LW link

[$20K in Prizes] AI Safety Arguments Competition

26 Apr 2022 16:13 UTC
74 points
543 comments · 3 min read · LW link