Buck

Karma: 17,840

CEO at Redwood Research.

AI safety is a highly collaborative field—almost all the points I make were either explained to me by someone else, or developed in conversation with other people. I’m saying this here because it would feel repetitive to say “these ideas were developed in collaboration with various people” in all my comments, but I want to have it on the record that the ideas I present were almost entirely not developed by me in isolation.

Please contact me via email (bshlegeris@gmail.com) instead of messaging me on LessWrong.

If we are ever arguing on LessWrong and you feel like it’s kind of heated and would go better if we just talked about it verbally, please feel free to contact me and I’ll probably be willing to call to discuss briefly.

How useful is the information you get from working inside an AI company?

11 May 2026 15:29 UTC
57 points
6 comments, 7 min read

A review of “Investigating the consequences of accidentally grading CoT during RL”

Buck, 7 May 2026 18:06 UTC
76 points
1 comment, 8 min read

Announcing ControlConf 2026

Buck, 26 Feb 2026 2:23 UTC
82 points
4 comments, 2 min read

The inaugural Redwood Research podcast

4 Jan 2026 22:11 UTC
141 points
10 comments, 142 min read

The behavioral selection model for predicting AI motivations

4 Dec 2025 18:46 UTC
201 points
30 comments, 16 min read

Rogue internal deployments via external APIs

15 Oct 2025 19:34 UTC
34 points
4 comments, 6 min read

The Thinking Machines Tinker API is good news for AI control and security

Buck, 9 Oct 2025 15:22 UTC
92 points
10 comments, 6 min read

Christian homeschoolers in the year 3000

Buck, 17 Sep 2025 14:44 UTC
207 points
65 comments, 7 min read

I enjoyed most of IABIED

Buck, 17 Sep 2025 4:34 UTC
210 points
46 comments, 8 min read

An epistemic advantage of working as a moderate

Buck, 20 Aug 2025 17:47 UTC
216 points
95 comments, 4 min read

Four places where you can put LLM monitoring

9 Aug 2025 23:10 UTC
49 points
0 comments, 7 min read

Research Areas in AI Control (The Alignment Project by UK AISI)

1 Aug 2025 10:27 UTC
25 points
0 comments, 18 min read
(alignmentproject.aisi.gov.uk)

Why it’s hard to make settings for high-stakes control research

Buck, 18 Jul 2025 16:33 UTC
49 points
6 comments, 4 min read

Recent Redwood Research project proposals

14 Jul 2025 22:27 UTC
93 points
0 comments, 3 min read

Lessons from the Iraq War for AI policy

Buck, 10 Jul 2025 18:52 UTC
201 points
25 comments, 4 min read

What’s worse, spies or schemers?

9 Jul 2025 14:37 UTC
51 points
2 comments, 5 min read

How much novel security-critical infrastructure do you need during the singularity?

Buck, 4 Jul 2025 16:54 UTC
57 points
7 comments, 5 min read

There are two fundamentally different constraints on schemers

Buck, 2 Jul 2025 15:51 UTC
63 points
0 comments, 4 min read

Comparing risk from internally-deployed AI to insider and outsider threats from humans

Buck, 23 Jun 2025 17:47 UTC
150 points
22 comments, 3 min read

Making deals with early schemers

20 Jun 2025 18:21 UTC
129 points
41 comments, 15 min read