Mary Phuong

Karma: 475

Phantom Transfer and the Basic Science of Data Poisoning

15 Feb 2026 19:51 UTC
80 points
8 comments, 6 min read

When should we train against a scheming monitor?

Mary Phuong, 21 Jan 2026 20:48 UTC
24 points
4 comments, 5 min read

Subliminal Learning Across Models

26 Nov 2025 16:15 UTC
58 points
15 comments, 5 min read

Evaluating and monitoring for AI scheming

10 Jul 2025 14:24 UTC
53 points
10 comments, 5 min read
(deepmindsafetyresearch.medium.com)

Unfaithful Reasoning Can Fool Chain-of-Thought Monitoring

2 Jun 2025 19:08 UTC
78 points
17 comments, 3 min read

Threat Model Literature Review

1 Nov 2022 11:03 UTC
79 points
4 comments, 25 min read

Clarifying AI X-risk

1 Nov 2022 11:03 UTC
127 points
24 comments, 4 min read, 1 review