What are the major indicators of their lead? Is this view partly based on Project Glasswing and the published examples of vulnerabilities that Mythos Preview has found?
TBC, I don’t think they have a big lead. I just mean they have barely overtaken OpenAI in my estimation. A major indicator is that they seem to have caught up with, or even slightly surpassed, OpenAI on revenue run rate. Also, they’ve done everything with less compute and fewer employees than OpenAI. Accomplishing as much or more with less suggests they have an inherent quality/taste/talent advantage. There are various other things as well, e.g. Mythos. (But we’ll see how good Spud is.)
There are very significant efficiency gains from larger scale-up world sizes. It’s 2-3x faster generation per request (and so 2-3x more training steps in RLVR), or 2-3x higher throughput per chip at the same speed per request (which is like having 2-3x more chips), for the same chip but with different scale-up world size (8xB200 vs. GB200 NVL72).
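As a rough sketch of the arithmetic above (the fleet size and step count are hypothetical placeholders; only the 2-3x multipliers come from the comment), a per-chip throughput gain can be read as an equivalent number of chips at the smaller world size, and a per-request speedup as more sequential RLVR steps in the same wall-clock time:

```python
def effective_chips(num_chips: int, throughput_gain: float) -> float:
    """Chips at the smaller scale-up world size needed to match
    num_chips running at the larger one, given a per-chip throughput gain."""
    return num_chips * throughput_gain


def rlvr_steps(baseline_steps: int, generation_speedup: float) -> float:
    """Training steps that fit in the same wall-clock time if generation
    per request is generation_speedup times faster (assumes generation
    dominates step time, which is a simplification)."""
    return baseline_steps * generation_speedup


if __name__ == "__main__":
    fleet = 10_000   # hypothetical chip count, not a real figure
    steps = 1_000    # hypothetical baseline number of RLVR steps
    for gain in (2.0, 3.0):  # the 2-3x range claimed in the comment
        print(f"{gain:.0f}x gain: {fleet} chips act like "
              f"{effective_chips(fleet, gain):.0f}; "
              f"{steps} steps become {rlvr_steps(steps, gain):.0f}")
```

The two gains are alternatives rather than something to multiply together: the comment describes either faster generation per request or higher throughput per chip at the same speed.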
So Anthropic’s access to Trainium 2 Ultra racks plausibly gave them more access to compute in some regimes (such as experimenting with RLVR on larger models) than OpenAI had with their 8-chip Nvidia servers, at least starting in late 2025. At a meaningful scale for R&D, this probably started months before they got to flagship model inference scale and reduced prices for Opus 4. (Though your point is probably more about what happened prior to late or even not-late 2025.)