LAThomson

Karma: 97

Independent AI safety researcher with experience in AI control, game theory, and LLM evals. Recent Computer Science and Philosophy graduate from Oxford. Avid musician too!

Agentic Monitoring for AI Control

LAThomson · 27 Oct 2025 16:38 UTC
9 points
0 comments · 9 min read · LW link

Towards shutdownable agents via stochastic choice

8 Jul 2024 10:14 UTC
59 points
11 comments · 23 min read · LW link
(arxiv.org)

Tall Tales at Different Scales: Evaluating Scaling Trends For Deception In Language Models

8 Nov 2023 11:37 UTC
49 points
0 comments · 18 min read · LW link