Buck

Karma: 18,041

CEO at Redwood Research.

AI safety is a highly collaborative field—almost all the points I make were either explained to me by someone else, or developed in conversation with other people. I’m saying this here because it would feel repetitive to say “these ideas were developed in collaboration with various people” in all my comments, but I want to have it on the record that the ideas I present were almost entirely not developed by me in isolation.

Please contact me via email (bshlegeris@gmail.com) instead of messaging me on LessWrong.

If we are ever arguing on LessWrong and you feel like it’s kind of heated and would go better if we just talked about it verbally, please feel free to contact me and I’ll probably be willing to call to discuss briefly.

Efficient tradeoffs and the safety-usefulness tradeoff model

Buck, 8 Jun 2026 20:28 UTC
42 points
0 comments, 8 min read

Notes on axes of variation in third-party risk assessment

Buck, 31 May 2026 20:48 UTC
36 points
2 comments, 10 min read

How useful is the information you get from working inside an AI company?

11 May 2026 15:29 UTC
61 points
6 comments, 7 min read

A review of “Investigating the consequences of accidentally grading CoT during RL”

Buck, 7 May 2026 18:06 UTC
76 points
1 comment, 8 min read