Oam Patel

Karma: 95

Limits of Asking ELK if Models are Deceptive

Oam Patel, 15 Aug 2022 20:44 UTC
6 points
2 comments, 4 min read

A Toy Model of Gradient Hacking

Oam Patel, 20 Jun 2022 22:01 UTC
30 points
7 comments, 4 min read