Tomek Korbak

Karma: 1,181

I work on monitoring agents at OpenAI

https://tomekkorbak.com/

Predicting LLM Safety Before Release by Simulating Deployment

16 Jun 2026 19:55 UTC
35 points
2 comments · 1 min read · LW link

Reasoning Models Struggle to Control Their Chains of Thought

5 Mar 2026 22:37 UTC
76 points
9 comments · 3 min read · LW link

Training Agents to Self-Report Misbehavior

25 Feb 2026 17:50 UTC
26 points
0 comments · 8 min read · LW link