Buck

Karma: 18,321

CEO at Redwood Research.

AI safety is a highly collaborative field—almost all the points I make were either explained to me by someone else, or developed in conversation with other people. I’m saying this here because it would feel repetitive to say “these ideas were developed in collaboration with various people” in all my comments, but I want to have it on the record that the ideas I present were almost entirely not developed by me in isolation.

Please contact me via email (bshlegeris@gmail.com) instead of messaging me on LessWrong.

If we are ever arguing on LessWrong and you feel like it’s kind of heated and would go better if we just talked about it verbally, please feel free to contact me and I’ll probably be willing to call to discuss briefly.

The OpenAI/Huggingface incident | Redwood Research podcast episode 2

ryan_greenblatt and Buck

23 Jul 2026 17:56 UTC

83 points

0 comments, 2 min read, LW link

Efficient tradeoffs and the safety-usefulness tradeoff model

Buck

8 Jun 2026 20:28 UTC

44 points

1 comment, 8 min read, LW link

Notes on axes of variation in third-party risk assessment

Buck

31 May 2026 20:48 UTC

39 points

2 comments, 10 min read, LW link

How useful is the information you get from working inside an AI company?

Buck and Anders Cairns Woodruff

11 May 2026 15:29 UTC

61 points

7 comments, 7 min read, LW link

A review of “Investigating the consequences of accidentally grading CoT during RL”

Buck

7 May 2026 18:06 UTC

76 points

1 comment, 8 min read, LW link

Announcing ControlConf 2026

Buck

26 Feb 2026 2:23 UTC

82 points

4 comments, 2 min read, LW link

The inaugural Redwood Research podcast

Buck and ryan_greenblatt

4 Jan 2026 22:11 UTC

146 points

10 comments, 142 min read, LW link

The behavioral selection model for predicting AI motivations

Alex Mallen and Buck

4 Dec 2025 18:46 UTC

209 points

31 comments, 16 min read, LW link

Rogue internal deployments via external APIs

Fabien Roger and Buck

15 Oct 2025 19:34 UTC

34 points

4 comments, 6 min read, LW link

The Thinking Machines Tinker API is good news for AI control and security

Buck

9 Oct 2025 15:22 UTC

92 points

10 comments, 6 min read, LW link

Christian homeschoolers in the year 3000

Buck

17 Sep 2025 14:44 UTC

209 points

66 comments, 7 min read, LW link

I enjoyed most of IABIED

Buck

17 Sep 2025 4:34 UTC

211 points

46 comments, 8 min read, LW link

An epistemic advantage of working as a moderate

Buck

20 Aug 2025 17:47 UTC

216 points

95 comments, 4 min read, LW link

Four places where you can put LLM monitoring

Fabien Roger and Buck

9 Aug 2025 23:10 UTC

49 points

0 comments, 7 min read, LW link

Research Areas in AI Control (The Alignment Project by UK AISI)

Julian Stastny, Tomek Korbak, Mojmir, Buck and Alan Cooney

1 Aug 2025 10:27 UTC

25 points

0 comments, 18 min read, LW link

(alignmentproject.aisi.gov.uk)

Why it’s hard to make settings for high-stakes control research

Buck

18 Jul 2025 16:33 UTC

50 points

6 comments, 4 min read, LW link

Recent Redwood Research project proposals

ryan_greenblatt, Buck, Julian Stastny, joshc, Alex Mallen, Adam Kaufman, Tyler Tracy, Aryan Bhatt and Joey Yudelson

14 Jul 2025 22:27 UTC

99 points

0 comments, 3 min read, LW link

Lessons from the Iraq War for AI policy

Buck

10 Jul 2025 18:52 UTC

202 points

25 comments, 4 min read, LW link

What’s worse, spies or schemers?

Buck and Julian Stastny

9 Jul 2025 14:37 UTC

51 points

2 comments, 5 min read, LW link

How much novel security-critical infrastructure do you need during the singularity?

Buck

4 Jul 2025 16:54 UTC

58 points

7 comments, 5 min read, LW link