A point to add which I believe is pertinent, OpenAI didn’t release GPT-2 to the public 7 years ago, citing risks related to malicious use. This point specifically targets your conjecture “the attitude among top large companies—those in power—is that AI models with a certain level of capability will need to have strict usage controls,” OpenAI decided they’d reached this level of capability 7 years ago, reasonably quickly determined that their concerns were misplaced, and proceeded to release the model in full.
Since the advent of ChatGPT, and more specifically the race for market share / usage data, we have blown through vastly more capable models than GPT-2 without a second thought. Now, we reach Mythos which has apparently made at least one large company re-evaluate this stance. The main, perhaps cynical, feeling I have is that it is not just easy but advantageous for a company such as Anthropic to adopt this posture: if you know you are slightly ahead of the curve, and as such are not risking any loss to market share / usage data, what could be better marketing than telling the world ‘our models are too good’?
We have heard from Anthropic that the BSD bug both cost ~$20,000 to find, and that the vulnerability was a null pointer dereference, which is almost never exploitable for remote code execution. To me, this is very far from supporting the framing that Mythos represents a genuine capability threshold that forced Anthropic’s hand. Given the GPT-2 precedent, and the market incentive, I’d want much stronger evidence before accepting that we have seen any real shift.
A point to add which I believe is pertinent, OpenAI didn’t release GPT-2 to the public 7 years ago, citing risks related to malicious use. This point specifically targets your conjecture “the attitude among top large companies—those in power—is that AI models with a certain level of capability will need to have strict usage controls,” OpenAI decided they’d reached this level of capability 7 years ago, reasonably quickly determined that their concerns were misplaced, and proceeded to release the model in full.
Since the advent of ChatGPT, and more specifically the race for market share / usage data, we have blown through vastly more capable models than GPT-2 without a second thought. Now, we reach Mythos which has apparently made at least one large company re-evaluate this stance. The main, perhaps cynical, feeling I have is that it is not just easy but advantageous for a company such as Anthropic to adopt this posture: if you know you are slightly ahead of the curve, and as such are not risking any loss to market share / usage data, what could be better marketing than telling the world ‘our models are too good’?
We have heard from Anthropic that the BSD bug both cost ~$20,000 to find, and that the vulnerability was a null pointer dereference, which is almost never exploitable for remote code execution. To me, this is very far from supporting the framing that Mythos represents a genuine capability threshold that forced Anthropic’s hand. Given the GPT-2 precedent, and the market incentive, I’d want much stronger evidence before accepting that we have seen any real shift.