RSS

Buck

Karma: 15,308

CEO at Redwood Research.

AI safety is a highly collaborative field—almost all the points I make were either explained to me by someone else, or developed in conversation with other people. I’m saying this here because it would feel repetitive to say “these ideas were developed in collaboration with various people” in all my comments, but I want to have it on the record that the ideas I present were almost entirely not developed by me in isolation.

Please contact me via email (bshlegeris@gmail.com) instead of messaging me on LessWrong.

If we are ever arguing on LessWrong and you feel like it’s kind of heated and would go better if we just talked about it verbally, please feel free to contact me and I’ll probably be willing to call to discuss briefly.

Rogue in­ter­nal de­ploy­ments via ex­ter­nal APIs

15 Oct 2025 19:34 UTC
24 points
0 comments6 min readLW link

The Think­ing Machines Tinker API is good news for AI con­trol and security

Buck9 Oct 2025 15:22 UTC
91 points
10 comments6 min readLW link