Adversarial Training

Tag. Last edit: 3 Jun 2022 1:30 UTC by Ruby

Adversarial training, importance sampling, and anti-adversarial training for AI whistleblowing

Buck, 2 Jun 2022 23:48 UTC
37 points
0 comments, 3 min read, LW link

AXRP Episode 17 - Training for Very High Reliability with Daniel Ziegler

DanielFilan, 21 Aug 2022 23:50 UTC
16 points
0 comments, 34 min read, LW link

Takeaways from our robust injury classifier project [Redwood Research]

dmz, 17 Sep 2022 3:55 UTC
137 points
10 comments, 6 min read, LW link

Oversight Leagues: The Training Game as a Feature

Paul Bricman, 9 Sep 2022 10:08 UTC
20 points
6 comments, 10 min read, LW link

EIS IX: Interpretability and Adversaries

scasper, 20 Feb 2023 18:25 UTC
29 points
5 comments, 8 min read, LW link

EIS XI: Moving Forward

scasper, 22 Feb 2023 19:05 UTC
15 points
2 comments, 9 min read, LW link

Latent Adversarial Training

Adam Jermyn, 29 Jun 2022 20:04 UTC
30 points
10 comments, 5 min read, LW link

EIS XII: Summary

scasper, 23 Feb 2023 17:45 UTC
12 points
0 comments, 6 min read, LW link