I think the mundane answer is that it’s an anecdote buried in the ‘safety behaviors’ section of a capabilities paper from one of the less famous (relatively speaking) AI companies. Most such sections are boilerplate, and, accordingly, most readers gloss over them.
I agree that must be the answer. But surely at least one person could’ve seen it and genuinely processed its implications, no? Or, at the very least, the researchers themselves could’ve shared it with the world.
It makes me wonder what other secrets may be hiding in unpopular research papers, waiting to be mined.