Nandi

Karma: 83

Robustness of Contrast-Consistent Search to Adversarial Prompting

1 Nov 2023 12:46 UTC
16 points
1 comment, 7 min read, LW link

Machine Unlearning Evaluations as Interpretability Benchmarks

23 Oct 2023 16:33 UTC
33 points
2 comments, 11 min read, LW link

Splitting Debate up into Two Subsystems

Nandi, 3 Jul 2020 20:11 UTC
13 points
5 comments, 4 min read, LW link

Acknowledging Human Preference Types to Support Value Learning

Nandi, 13 Nov 2018 18:57 UTC
34 points
4 comments, 9 min read, LW link