making sure there are really high standards for safety and that there isn’t going to be danger [from] what these AIs are doing
Ah yes, a great description of Anthropic’s safety actions. I don’t think anyone serious at Anthropic believes that they “made sure there isn’t going to be danger [from] what these AIs are doing”. Indeed, many (most?) of their safety people assign double-digit probabilities to catastrophic outcomes from advanced AI systems.
I do think this was a predictable and quite bad consequence of Dario’s essay (as well as his other essays, which heavily downplay or completely omit any discussion of risks). My guess is it will contribute majorly to reckless racing while giving people a false impression of how good we are doing on actually making things safe.
I think the fuller context,

Anthropic has put WAY more effort into safety, way way more effort into making sure there are really high standards for safety and that there isn’t going to be danger [from] what these AIs are doing
implies it’s just that the amount of effort is larger than at other companies (which I agree with), and not that the YouTuber believes they’ve solved alignment or are doing enough; see:
but he’s also a realist and is like “AI is going to really potentially fuck up our world”
and
But he’s very realistic. There is a lot of bad shit that is going to happen with AI. I’m not denying that at all.
So I’m not confident that it’s “giving people a false impression of how good we are doing on actually making things safe” in this case.
I do know DougDoug has recommended Anthropic’s Alignment Faking paper to another YouTuber, and that is more of a “stating a problem” paper than one claiming they’ve solved it.