CEO at Redwood Research.
AI safety is a highly collaborative field—almost all the points I make were either explained to me by someone else or developed in conversation with other people. I'm noting this here because it would feel repetitive to say "these ideas were developed in collaboration with various people" in every comment, but I want it on the record that the ideas I present were almost entirely not developed by me in isolation.
Please contact me via email (bshlegeris@gmail.com) instead of messaging me on LessWrong.
If we're ever arguing on LessWrong and you feel like it's getting heated and would go better as a spoken conversation, please feel free to contact me—I'll probably be willing to hop on a brief call to discuss.
I basically agree with Zach: based on public information, it seems like it would be really hard for them to be robust to this, and it seems implausible that they have justified confidence in such robustness.
I agree that he doesn't make the argument in much depth. Obviously, I think it'd be great if someone made the argument in more detail. I still think Zach's point is a positive contribution, even though it isn't that detailed.