Thomas Read

Karma: 107

We found an open weight model that games alignment honeypots

16 Mar 2026 12:57 UTC
57 points
0 comments, 10 min read

[Research sprint] Single-model crosscoder feature ablation and steering

Thomas Read, 6 Apr 2025 14:42 UTC
10 points
0 comments, 12 min read

[Replication] Crosscoder-based Stage-Wise Model Diffing

22 Mar 2025 18:35 UTC
25 points
0 comments, 7 min read

Birmingham ACX Everywhere meetup 2022

Thomas Read, 23 Aug 2022 18:57 UTC
2 points
0 comments, 1 min read

Coventry ACX Schelling Meetup

Thomas Read, 17 Apr 2022 9:05 UTC
2 points
0 comments, 1 min read