It’s unclear whether Mythos is much more impactful for cybersecurity overall than a new fuzzing or static analysis tool. Such tools always turn up a lot of previously unknown bugs and vulnerabilities when they use a new method, even an absurdly simple one, or merely a slightly unusual one (which happens to some extent with most major version updates of a tool). There is a lot of code in the world to find bugs in, and the bugs that only the new tool finds in the latest version of the code are precisely the bugs that were never fixed before. The unusual thing about Mythos is the automation of exploiting or fixing some of the bugs, which in particular automates high-confidence estimation of the correctness and severity of some of the issues.
On the other hand, if Mythos is indeed a 10T+ total-parameter model, it will only be efficient to serve on TPUv7[1], which might only become available to Anthropic in sufficient numbers later in the year (they have 1 GW of them scheduled to come online in 2026). Serving Mythos before then would make it perhaps at least 2x more expensive than it becomes once TPUv7 is available, assuming there is somehow enough Trainium 2 Ultra to serve it at all. Serving it on 8-chip Nvidia servers, DeepSeek-V3 style, would be even more expensive and seriously slow.
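To make the “won’t fit on an 8-chip server” intuition concrete, here’s a rough back-of-envelope in Python. This is a sketch, not authoritative: the HBM capacities are approximate public figures (the Teton 3 Max number is the 20.7 TB from the footnotes), and it counts only weight memory, ignoring KV cache, activations, and replication, all of which make the fit harder.

```python
# Back-of-envelope: weight memory for a hypothetical 10T-parameter model in FP8.
# Hardware capacities are approximate public figures, not official specs.

PARAMS = 10e12      # 10T total parameters
FP8_BYTES = 1       # 1 byte per parameter in FP8
weights_tb = PARAMS * FP8_BYTES / 1e12   # ~10 TB of weights alone

systems = {
    # name: total HBM in TB (approximate)
    "8x H200 server": 8 * 0.141,    # ~1.1 TB: weights alone need ~9+ such servers
    "GB200 NVL72":    72 * 0.186,   # ~13.4 TB: fits, but little headroom
    "GB300 NVL72":    72 * 0.288,   # ~20.7 TB: fits comfortably
    "Teton 3 Max":    20.7,         # 144 Trainium 3 chips (figure from the text)
}

for name, hbm_tb in systems.items():
    verdict = "fits weights" if hbm_tb >= weights_tb else "must shard across units"
    print(f"{name:16s} {hbm_tb:5.1f} TB HBM  {verdict}")
```

The point of the exercise: at FP8, the weights of a 10T-parameter model are about 10 TB, so an 8-chip server can hold only a small shard of them, forcing heavy cross-server communication per token, while a single large rack-scale domain can hold the whole model inside one fast interconnect.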
Finally, Anthropic’s competitors are a bit behind. OpenAI might’ve only finished pretraining their Spud in March[2], whereas Anthropic was already making an internal deployment decision about Mythos in February[3]. xAI is only now training a 6T model and a 10T model[4]. So perhaps the concern about cybersecurity is not central to the decision to delay the release, though the slack from being in the lead will undoubtedly be put to good use in making the model better before it’s released. Still, I’m guessing Mythos’s release won’t actually happen significantly later than OpenAI’s release of Spud (if Spud is better than Opus 5), even if the cost of Mythos tokens would need to remain very high until their TPUv7 datacenters come online.
There’s also the liquid-cooled Teton 3 Max (a 2-rack scale-up system with 144 Trainium 3 chips), which has 20.7 TB of HBM3E. But if a significant buildout of this system happens, it might come even later, sometime in 2027.
“The company has finished pretraining ‘Spud,’ Altman said in the memo. He told staff that the company expects to have a ‘very strong model’ in ‘a few weeks’ that the team believes ‘can really accelerate the economy.’” The Information, 24 Mar 2026.
“Following a successful alignment review, the first early version of Claude Mythos Preview was made available for internal use on February 24.” Mythos Preview System Card, page 12.
That’s not a real price. It’s just what they’re giving their partners as part of Glasswing, a charitable endeavour to try to stem the worst of the global damage, and is presumably less about revenue than about encouraging the partners to economize on scarce Mythos tokens: setting the price to literally $0 would invite laziness and waste. It may or may not have much to do with a ‘real’ price (whatever that means in a situation where hardware is so limited and demand so vast for what is an unpriceable, ephemerally unique capability).
GB300 NVL72 (but not GB200) would probably also work for serving via clouds; there’s just not a lot of it yet (compared to everything else put together). But some GB300 might be available in the clouds earlier than TPUv7 is for the first-party API, so that’s a possibility. Also, the smaller rack-scale servers (GB200 NVL72, Trainium 2 Ultra, maybe some Trainium 3 NL32x2 soon) won’t be 10x worse, just maybe 2x worse (for a 10T+ param model deployed in FP8).
It’s unclear if Mythos is much more impactful for cybersecurity overall than a new fuzzing or static analysis tool
The bull case here is that “scale LLMs” is turning out to be a way to predictably and consistently produce ever-better tools for discovering exploits, right? Probably with said tools’ power scaling exponentially (in some relevant sense), like everything else with LLMs.
That is, Mythos by itself is probably just on the level of a new fuzzing tool, able to let humans find a new reference class of exploits. But then we’d have Mythos 2 three to six months later, etc. Which potentially shifts the cybersecurity world into a new operating regime, even if each individual perturbation is something that already happened before.
Or is there an argument that it would still be on-model for how the cybersecurity world operates? I’m not very familiar.
“SpaceXAI Colossus 2 now has 7 models in training … 6T … 10T.” Musk’s post on X, 8 Apr 2026.
FWIW Mythos Preview is available on Amazon Bedrock and Microsoft Foundry which don’t use TPUs (presumably at the same price as the first-party API?).