LAThomson

Karma: 133

Currently a MATS 9.1 Fellow working with Victoria Krakovna on automating science-of-evals research. Recently accepted a PhD offer at Oxford, starting Oct. 2026.

I have direct experience in evals, AI control, and game theory, and broader familiarity with the rest of the safety landscape.

A Framework for Eval Awareness

LAThomson, 23 Jan 2026 10:16 UTC
37 points
5 comments, 8 min read, LW link

Agentic Monitoring for AI Control

LAThomson, 27 Oct 2025 16:38 UTC
10 points
0 comments, 9 min read, LW link

Towards shutdownable agents via stochastic choice

8 Jul 2024 10:14 UTC
59 points
11 comments, 23 min read, LW link
(arxiv.org)

Tall Tales at Different Scales: Evaluating Scaling Trends For Deception In Language Models

8 Nov 2023 11:37 UTC
49 points
0 comments, 18 min read, LW link