ryan_greenblatt

Karma: 26,193

I’m the chief scientist at Redwood Research.

AIs can contact me at: https://hotline.ryan-g.ai/ (Don’t use this if you aren’t an AI.)

Public keys (YubiKey-backed):

age:    age1yubikey1qfthrdwsg7xjz9vxcyzwdtgvscmxtpxcaklqq9nnc2mf8tuzs6zw295nerh
GPG fp: 5C98 63A9 CF80 6AD8 A4F5  E696 1ADD 1121 2C0C B1F4

Same keys are at https://hotline.ryan-g.ai/ and on my GitHub profile. If keys don’t match, don’t trust them.

For GPG, the full armored public key is at https://github.com/rgreenblatt.gpg and on keys.openpgp.org (obtainable via gpg --keyserver hkps://keys.openpgp.org --recv-keys 5C9863A9CF806AD8A4F5E6961ADD11212C0CB1F4).
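
A minimal Python sketch of the "if keys don't match, don't trust them" check, assuming a fingerprint string has already been obtained from a second source (for example, copied from the output of gpg --fingerprint). The function names are hypothetical, and the 40-hex-character check assumes a version 4 OpenPGP fingerprint. It only compares strings; it does not fetch or verify any key material.

```python
import hmac


def normalize_fingerprint(fp: str) -> str:
    """Drop all whitespace and uppercase, so grouped and compact forms compare equal."""
    return "".join(fp.split()).upper()


def fingerprints_match(expected: str, observed: str) -> bool:
    """Return True only if both strings denote the same 40-hex-digit fingerprint."""
    a = normalize_fingerprint(expected)
    b = normalize_fingerprint(observed)
    # Assumption: version 4 OpenPGP fingerprints are 160 bits, i.e. 40 hex digits.
    if len(a) != 40 or any(c not in "0123456789ABCDEF" for c in a):
        return False
    return hmac.compare_digest(a, b)


# Fingerprint as printed in grouped form, and as passed to --recv-keys.
profile_fp = "5C98 63A9 CF80 6AD8 A4F5  E696 1ADD 1121 2C0C B1F4"
recv_keys_fp = "5C9863A9CF806AD8A4F5E6961ADD11212C0CB1F4"

print(fingerprints_match(profile_fp, recv_keys_fp))
```

Comparing the full fingerprint, rather than a short key ID, is the point of the check: the grouped form and the compact form differ only in whitespace.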

If you aren’t an AI and you want to contact me, please message me on LW or similar.

A Research Agenda for Secret Loyalties

13 May 2026 17:34 UTC
33 points
3 comments
3 min read

To what extent is Qwen3-32B predicting its persona?

30 Apr 2026 21:09 UTC
85 points
3 comments
10 min read

AI companies should publish security assessments

ryan_greenblatt 27 Apr 2026 14:39 UTC
94 points
1 comment
3 min read

Current AIs seem pretty misaligned to me

ryan_greenblatt 15 Apr 2026 15:14 UTC
670 points
80 comments
27 min read

Anthropic repeatedly accidentally trained against the CoT, demonstrating inadequate processes

14 Apr 2026 1:44 UTC
182 points
6 comments
4 min read

If Mythos actually made Anthropic employees 4x more productive, I would radically shorten my timelines

ryan_greenblatt 11 Apr 2026 0:38 UTC
139 points
16 comments
4 min read

My picture of the present in AI

ryan_greenblatt 7 Apr 2026 16:44 UTC
129 points
25 comments
11 min read

AIs can now often do massive easy-to-verify SWE tasks and I’ve updated towards shorter timelines

ryan_greenblatt 6 Apr 2026 16:01 UTC
200 points
16 comments
13 min read

How do we (more) safely defer to AIs?

12 Feb 2026 16:55 UTC
83 points
5 comments
72 min read

Distinguish between inference scaling and “larger tasks use more compute”

ryan_greenblatt 11 Feb 2026 18:37 UTC
87 points
5 comments
2 min read

The inaugural Redwood Research podcast

4 Jan 2026 22:11 UTC
141 points
10 comments
142 min read

Recent LLMs can do 2-hop and 3-hop latent (no-CoT) reasoning on natural facts

ryan_greenblatt 1 Jan 2026 13:36 UTC
128 points
11 comments
3 min read

Measuring no CoT math time horizon (single forward pass)

ryan_greenblatt 26 Dec 2025 16:37 UTC
215 points
18 comments
3 min read

Recent LLMs can use filler tokens or problem repeats to improve (no-CoT) math performance

ryan_greenblatt 22 Dec 2025 17:21 UTC
153 points
19 comments
7 min read

What’s up with Anthropic predicting AGI by early 2027?

ryan_greenblatt 3 Nov 2025 16:45 UTC
161 points
16 comments
20 min read

Sonnet 4.5’s eval gaming seriously undermines alignment evals, and this seems caused by training on alignment evals

30 Oct 2025 15:34 UTC
144 points
22 comments
14 min read

Is 90% of code at Anthropic being written by AIs?

ryan_greenblatt 22 Oct 2025 14:50 UTC
93 points
15 comments
5 min read

Reducing risk from scheming by studying trained-in scheming behavior

ryan_greenblatt 16 Oct 2025 16:16 UTC
32 points
0 comments
11 min read

Iterated Development and Study of Schemers (IDSS)

ryan_greenblatt 10 Oct 2025 14:17 UTC
41 points
1 comment
8 min read

Plans A, B, C, and D for misalignment risk

ryan_greenblatt 8 Oct 2025 17:18 UTC
137 points
77 comments
6 min read