There are new Huawei Ascend 910C CloudMatrix 384 systems that form scale-up worlds comparable to GB200 NVL72, which is key to running long reasoning inference for large models much faster and cheaper than is possible on systems with significantly smaller world sizes, like the current H100/H200 NVL8 (it also makes training easier, though that matters less unless RL training really does scale to the moon).
Apparently TSMC produced ~2.1M compute dies for these systems in 2024-2025, which comes to ~1.1M chips (two dies per chip), and an Ascend 910C chip delivers 0.8e15 dense BF16 FLOP/s (compared to 2.5e15 for a GB200 chip). So the compute is about the same as that of ~350K GB200 chips (not dies or superchips), which is close to the 400K-500K GB200 chips that will be installed at the Abilene site of Crusoe/Stargate/OpenAI in 2026. There also seems to be potential to produce millions more without TSMC.
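A minimal sketch of the equivalence arithmetic, using only the figures above (the dual-die packaging is the assumption that turns ~2.1M dies into ~1.1M chips):

```python
# Back-of-the-envelope check of the GB200-equivalence claim above.
ASCEND_910C_DENSE_BF16 = 0.8e15   # FLOP/s per Ascend 910C chip
GB200_DENSE_BF16 = 2.5e15         # FLOP/s per GB200 chip
DIES_PRODUCED = 2.1e6             # compute dies from TSMC, 2024-2025
DIES_PER_CHIP = 2                 # assumed dual-die packaging per 910C chip

ascend_chips = DIES_PRODUCED / DIES_PER_CHIP          # ~1.05M chips
total_flops = ascend_chips * ASCEND_910C_DENSE_BF16   # ~8.4e20 FLOP/s
gb200_equivalent = total_flops / GB200_DENSE_BF16     # ~340K GB200 chips

print(f"~{ascend_chips / 1e6:.2f}M Ascend 910C chips")
print(f"~{gb200_equivalent / 1e3:.0f}K GB200-chip equivalents of dense BF16 compute")
```

This lands at roughly 340K GB200-chip equivalents, consistent with the ~350K figure above.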
These systems are 2.3x less power-efficient per FLOP/s than GB200 NVL72. They use a 7nm process instead of Blackwell's 4nm, the scale-up network uses optical transceivers instead of copper, and more chips are needed for the same compute, so they are probably significantly more expensive per FLOP/s. But if there is enough funding and the 2.1M compute dies from TSMC are used to build a single training/inference system (about 2.5 GW), there is in principle some potential for parity between the US and China at the level of a single frontier AI company for late-2026 compute (with no direct implications for 2027+ compute; in particular, the Nvidia Rubin buildout will begin around that time).
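As a rough cross-check of the ~2.5 GW figure, a sketch that combines the 2.3x efficiency penalty with an assumed all-in power draw per GB200 chip (that baseline is a placeholder I'm supplying, not a figure from the source):

```python
# Rough sanity check of the ~2.5 GW estimate for a single system built from all the dies.
ASCEND_910C_DENSE_BF16 = 0.8e15   # FLOP/s per Ascend 910C chip
GB200_DENSE_BF16 = 2.5e15         # FLOP/s per GB200 chip
ASCEND_CHIPS = 1.05e6             # ~2.1M dies at two dies per chip

GB200_ALL_IN_W = 3000.0           # ASSUMED all-in watts per GB200 chip (rack + overhead)
EFFICIENCY_PENALTY = 2.3          # CloudMatrix 384 watts per FLOP/s vs GB200 NVL72

gb200_w_per_flops = GB200_ALL_IN_W / GB200_DENSE_BF16
ascend_w_per_flops = gb200_w_per_flops * EFFICIENCY_PENALTY
ascend_chip_w = ascend_w_per_flops * ASCEND_910C_DENSE_BF16   # ~2.2 kW per chip
total_gw = ascend_chip_w * ASCEND_CHIPS / 1e9                 # ~2.3 GW

print(f"~{ascend_chip_w / 1e3:.1f} kW per Ascend 910C chip all-in (under the assumed baseline)")
print(f"~{total_gw:.1f} GW for all ~1.05M chips")
```

Under these assumptions the total comes out around 2.3 GW, in the same ballpark as the ~2.5 GW quoted above; the result is sensitive to the assumed GB200 all-in power.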
(The relevance is that whatever the plans are, they need to be grounded in what's technically feasible, and this piece of news changed my mind about what might be technically feasible in 2026 on short notice. The key facts are systems with a large scale-up world size and enough compute dies to match the compute of the Abilene site in 2026, neither of which was obviously possible without more catch-up time, by which point US training systems would've already moved on to an even greater scale.)