
Anders Cairns Woodruff

Karma: 682

Estimating No-CoT Task-Completion Time Horizons of Frontier AI Models

10 Jun 2026 17:58 UTC
177 points
3 comments
4 min read
LW link

How useful is the information you get from working inside an AI company?

11 May 2026 15:29 UTC
61 points
6 comments
7 min read
LW link

Early-stage empirical work on “spillway motivations”

1 May 2026 21:29 UTC
26 points
3 comments
8 min read
LW link

Fail safe(r) at alignment by channeling reward-hacking into a “spillway” motivation

27 Apr 2026 17:43 UTC
104 points
3 comments
11 min read
LW link
(blog.redwoodresearch.org)

AI’s capability improvements haven’t come from it getting less affordable

Anders Cairns Woodruff
27 Mar 2026 17:09 UTC
84 points
0 comments
6 min read
LW link

Are AIs more likely to pursue on-episode or beyond-episode reward?

12 Mar 2026 17:35 UTC
45 points
0 comments
8 min read
LW link

Frontier AI companies probably can’t leave the US

Anders Cairns Woodruff
26 Feb 2026 18:18 UTC
136 points
19 comments
7 min read
LW link
(blog.redwoodresearch.org)

Training on Non-Political but Trump-Style Text Causes LLMs to Become Authoritarian

Anders Cairns Woodruff
27 Jan 2026 16:46 UTC
5 points
2 comments
2 min read
LW link

Evidence that would update me towards a software-only fast takeoff

Anders Cairns Woodruff
20 Jan 2026 0:58 UTC
15 points
4 comments
4 min read
LW link

Aesthetic Preferences Can Cause Emergent Misalignment

Anders Cairns Woodruff
26 Aug 2025 18:41 UTC
110 points
18 comments
3 min read
LW link