While we’re on the topic, I am kinda worried about Anthropic employees who might be talking to Claude all day and falling into a trap. (Thinking of Amanda Askell in particular, whose day job is basically this.)
I’ve been worried about this type of thing for a long time, but still didn’t foresee or warn people that AI company employees, and specifically alignment/safety workers, could be one of the first victims (which seems really obvious in retrospect). Yet another piece of evidence for how strategically incompetent humans are.