Well yes, but that is just because they are whitelisting it to work with NVLink-72 switches. There is no reason a Hopper GPU could not interface with NVLink-72 if Nvidia didn’t artificially limit it.
Additionally, by saying

>can’t be repeated even with Rubin Ultra NVL576
I think they are indicating there is something else improving besides world size increases, as this improvement would not exist even two GPU generations from now, when we get 576 (144 gpus) worth of mono-addressable pooled vram, and the giant world / model-head sizes it will enable.
The reason Rubin NVL576 probably won’t help as much as the current transition from Hopper is that Blackwell NVL72 is already ~sufficient for the model sizes that are compute optimal to train on $30bn Blackwell training systems (which Rubin NVL144 training systems probably won’t significantly leapfrog before Rubin NVL576 comes out, unless there are reliable agents in 2026-2027 and funding goes through the roof).
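A minimal back-of-envelope sketch of the "~sufficient" claim, where every constant (cost per GPU, FLOP/s, utilization, run length, HBM capacity) is an assumption of mine for illustration, not a figure from the comment above:

```python
# Rough sketch: how big a compute-optimal model a ~$30bn Blackwell training system
# implies, and whether its weights fit in one NVL72 domain's pooled HBM.
# All constants below are adjustable assumptions, not sourced figures.

capex_usd        = 30e9              # assumed $30bn training-system budget
cost_per_gpu_usd = 60e3              # assumed all-in cost per Blackwell GPU (chip + rack/network/DC share)
flops_per_gpu    = 2.5e15            # assumed dense BF16 FLOP/s per GPU
mfu              = 0.4               # assumed sustained model FLOPs utilization
train_seconds    = 120 * 24 * 3600   # assumed ~4-month training run

n_gpus        = capex_usd / cost_per_gpu_usd
training_flop = n_gpus * flops_per_gpu * mfu * train_seconds

# Chinchilla-style heuristic: C ~ 6*N*D with D ~ 20*N, so C ~ 120*N^2.
n_params = (training_flop / 120) ** 0.5

# One NVL72 domain: 72 GPUs at an assumed 192 GB HBM each, pooled over NVLink.
nvl72_hbm_bytes = 72 * 192e9
weight_bytes    = 2 * n_params       # BF16 weights, 2 bytes per parameter

print(f"{n_gpus:.0f} GPUs, ~{training_flop:.1e} FLOP of training compute")
print(f"compute-optimal size ~{n_params/1e12:.1f}T params")
print(f"weights ~{weight_bytes/1e12:.1f} TB vs ~{nvl72_hbm_bytes/1e12:.1f} TB pooled HBM in one NVL72 domain")
```

Under these assumptions the compute-optimal model's BF16 weights already roughly fit in a single NVL72 domain's pooled HBM, which is the sense in which NVL72 is "~sufficient" and a 576-wide domain wouldn't unlock a comparable jump; different cost or utilization guesses shift the numbers but not by orders of magnitude.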
>when we get 576 (144 gpus)
The terminology Huang was advocating for at GTC 2025 (at 1:28:04) is to use “GPU” to refer to compute dies rather than chips/packages, and in these terms a Rubin NVL576 rack has 144 chips and 576 GPUs, rather than 144 GPUs. Even though this seems contentious, the terms compute die and chip/package remain less ambiguous than “GPU”.