Neel Nanda

Karma: 15,041

Models May Behave Worse When Eval Aware

11 Jun 2026 9:28 UTC
18 points
0 comments, 13 min read, LW link

Building Better Activation Oracles

4 Jun 2026 18:34 UTC
60 points
1 comment, 7 min read, LW link

Test your best methods on our hard CoT interp tasks

26 Mar 2026 19:24 UTC
58 points
2 comments, 19 min read, LW link