Mia Hopman

Karma: 152

Understanding when and why agents scheme

21 Mar 2026 20:33 UTC
37 points
0 comments · 4 min read · LW link

Current LLM agents need strong pressure to engage in scheming behavior

20 Nov 2025 20:45 UTC
23 points
0 comments · 11 min read · LW link

Prompt optimization can enable AI control research

23 Sep 2025 12:46 UTC
40 points
4 comments · 9 min read · LW link

Optimally Combining Probe Monitors and Black Box Monitors

27 Jul 2025 19:13 UTC
52 points
2 comments · 6 min read · LW link

Untrusted AIs can exploit feedback in control protocols

27 May 2025 16:41 UTC
30 points
0 comments · 16 min read · LW link