The reason Rubin NVL576 probably won’t help as much as the current transition from Hopper is that Blackwell NVL72 is already ~sufficient for the model sizes that are compute-optimal to train on $30bn Blackwell training systems. Rubin NVL144 training systems probably won’t significantly leapfrog that scale before Rubin NVL576 comes out, unless reliable agents arrive in 2026-2027 and funding goes through the roof.
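As a rough illustration of what “compute-optimal to train on a $30bn Blackwell training system” means, here is a Chinchilla-style back-of-envelope sketch. Every numeric input below (rack price, per-rack throughput, utilization, run length, tokens-per-parameter ratio) is my own illustrative assumption, not a figure from the discussion above.

```python
# Back-of-envelope sketch of the compute-optimal model size for a $30bn
# Blackwell training system. All parameter values are illustrative assumptions.

SYSTEM_COST_USD     = 30e9                 # the $30bn training system discussed above
COST_PER_RACK_USD   = 4e6                  # assumed all-in cost of one GB200 NVL72 rack
BF16_FLOPS_PER_RACK = 160e15               # assumed dense BF16 throughput per rack, FLOP/s
UTILIZATION         = 0.4                  # assumed sustained utilization over the run
TRAINING_SECONDS    = 4 * 30 * 24 * 3600   # assumed ~4-month training run
TOKENS_PER_PARAM    = 20                   # Chinchilla-style ratio, D ≈ 20·N

racks   = SYSTEM_COST_USD / COST_PER_RACK_USD
compute = racks * BF16_FLOPS_PER_RACK * UTILIZATION * TRAINING_SECONDS  # total FLOPs

# Chinchilla accounting: C ≈ 6·N·D with D = TOKENS_PER_PARAM·N,
# so N ≈ sqrt(C / (6 · TOKENS_PER_PARAM)).
params = (compute / (6 * TOKENS_PER_PARAM)) ** 0.5
print(f"{racks:.0f} racks, ~{compute:.1e} FLOPs, compute-optimal size ~{params:.1e} params")
```

With these assumed inputs the sketch lands at a few times 10^27 FLOPs and a compute-optimal model size of several trillion parameters, the scale at which the NVL72 domain is already roughly sufficient.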
“when we get 576 (194 gpus)”
The terminology Huang was advocating at GTC 2025 (at 1:28:04) is to use “GPU” for compute dies rather than chips/packages; in those terms a Rubin NVL576 rack has 144 chips but 576 GPUs, rather than 144 GPUs. Even though this usage is contentious, “compute die” and “chip/package” remain less ambiguous terms than “GPU”.
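A minimal sketch of the two counting conventions, assuming the die-per-package figures from Nvidia’s GTC 2025 roadmap (2 compute dies per Blackwell or Rubin package, 4 per Rubin Ultra package); these per-package counts are my assumptions, not something stated above.

```python
# Counting a rack's "GPUs" by packages vs. by compute dies (Huang's GTC 2025 usage).
RACKS = {
    # name: (chips/packages per rack, assumed compute dies per package)
    "Blackwell NVL72":    (72, 2),
    "Rubin NVL144":       (72, 2),
    "Rubin Ultra NVL576": (144, 4),
}

for name, (packages, dies_per_package) in RACKS.items():
    dies = packages * dies_per_package
    print(f"{name}: {packages} chips/packages, {dies} compute dies "
          f"({dies} GPUs in the die-counting sense)")
```

Under these assumptions the die count is what the NVL576 and NVL144 names refer to, while Blackwell NVL72 is still named by its 72 packages, which is exactly why the bare word “GPU” is ambiguous across the two generations.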