Strongly upvoted. Alignment researchers often feel so compelled to quickly contribute to decreasing x-risk that they end up studying non-robust categories that won’t generalize very far, and sometimes actively make the field more confused. I wish that most people doing this were just trying to do the best science they could instead.