Tomek Korbak

Karma: 1,045

Senior Research Scientist at UK AISI working on AI control

https://tomekkorbak.com/

Lessons from Studying Two-Hop Latent Reasoning

11 Sep 2025 17:53 UTC
68 points
16 comments · 2 min read
(arxiv.org)

If you can generate obfuscated chain-of-thought, can you monitor it?

4 Aug 2025 15:46 UTC
35 points
6 comments · 11 min read

Research Areas in AI Control (The Alignment Project by UK AISI)

1 Aug 2025 10:27 UTC
25 points
0 comments · 18 min read
(alignmentproject.aisi.gov.uk)

The Alignment Project by UK AISI

1 Aug 2025 9:52 UTC
29 points
0 comments · 2 min read
(alignmentproject.aisi.gov.uk)

Chain of Thought Monitorability: A New and Fragile Opportunity for AI Safety

15 Jul 2025 16:23 UTC
166 points
32 comments · 1 min read
(bit.ly)

How to evaluate control measures for LLM agents? A trajectory from today to superintelligence

14 Apr 2025 16:45 UTC
29 points
1 comment · 2 min read