Thomas Read

Karma: 283

Thomas Read’s Shortform

Thomas Read

11 Jun 2026 16:34 UTC

4 points

1 comment · 1 min read · LW link

Loss of Oversight: How AI Systems May Become Harder to Audit, Monitor, and Investigate

Jordan Taylor, Max H, Ed Fage, Thomas Read and Joseph Bloom

21 May 2026 14:52 UTC

83 points

0 comments · 6 min read · LW link

(www.aisi.gov.uk)

Reproducing steering against evaluation awareness in a large open-weight model

Thomas Read, Bronson Schoen, Santiago Aranguri and Joseph Bloom

10 Apr 2026 10:45 UTC

89 points

17 comments · 15 min read · LW link

We found an open weight model that games alignment honeypots

Thomas Read and Joseph Bloom

16 Mar 2026 12:57 UTC

79 points

2 comments · 10 min read · LW link

[Research sprint] Single-model crosscoder feature ablation and steering

Thomas Read

6 Apr 2025 14:42 UTC

11 points

0 comments · 12 min read · LW link

[Replication] Crosscoder-based Stage-Wise Model Diffing

Anna Soligo, Thomas Read, Oliver Clive-Griffin, dmanningcoe, Chun Hei Yip, rajashree and Jason Gross

22 Mar 2025 18:35 UTC

25 points

0 comments · 7 min read · LW link

Birmingham ACX Everywhere meetup 2022

Thomas Read

23 Aug 2022 18:57 UTC

2 points

0 comments · 1 min read · LW link

Coventry ACX Schelling Meetup

Thomas Read

17 Apr 2022 9:05 UTC

2 points

0 comments · 1 min read · LW link