Kshitij Sachan

Karma: 292

Redwood Research

AI Control: Improving Safety Despite Intentional Subversion

13 Dec 2023 15:51 UTC
189 points
4 comments
10 min read

LLMs are (mostly) not helped by filler tokens

Kshitij Sachan
10 Aug 2023 0:48 UTC
64 points
35 comments
6 min read