Rohin Shah

Karma: 16,146

Research Scientist at Google DeepMind. Creator of the Alignment Newsletter. http://rohinshah.com/

Chain of Thought Monitorability: A New and Fragile Opportunity for AI Safety

15 Jul 2025 16:23 UTC
166 points
32 comments · 1 min read · LW link
(bit.ly)

Evaluating and monitoring for AI scheming

10 Jul 2025 14:24 UTC
52 points
9 comments · 5 min read · LW link
(deepmindsafetyresearch.medium.com)