CloudMatrix was not, but Huawei Ascend has been around for a long time, and was used to train LLMs as far back as 2022. I didn't realize AI 2027 predated CloudMatrix, but I still think ignoring China in the Compute Production forecast was unjustified.
A central premise of AI-2027 is takeoff via very fast large reasoning models, which necessarily means a lot of compute specifically in the form of large scale-up world systems. Compute with smaller scale-up worlds (such as H100s) can be used to pretrain large models, but not to run inference for large reasoning models at high speed, and not to train large reasoning models with RLVR if that ends up needing pretraining-scale amounts of GPU-time.
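A back-of-the-envelope sketch of why scale-up world size bounds reasoning-model speed: decoding is memory-bandwidth-bound, and a sequence's serial token rate is capped by the aggregate HBM bandwidth of the single scale-up domain its weights stream through per token. The model size below is a hypothetical GPT-4.5-scale stand-in, and the bandwidth figures are approximate spec-sheet values:

```python
# Decoding one token streams the model's weights from HBM, so the serial
# (per-sequence) token rate is capped by the HBM bandwidth of one scale-up
# domain: pipelining across more nodes adds capacity, not per-sequence speed.
# Illustrative numbers only; the model is a hypothetical ~2T-parameter dense
# stand-in for "GPT-4.5-thinking".

model_bytes = 2e12 * 2  # ~2T params in BF16 (2 bytes/param), assumed

def serial_tokens_per_s(domain_chips: int, hbm_bw_per_chip: float) -> float:
    """Ideal bandwidth-bound decode rate for one sequence, with weights
    sharded tensor-parallel across a single scale-up domain."""
    return domain_chips * hbm_bw_per_chip / model_bytes

# 8x H100 node (~3.35e12 B/s HBM3 per GPU):
print(f"8x H100 node: {serial_tokens_per_s(8, 3.35e12):6.1f} tok/s")   # ~6.7
# GB200 NVL72 (72 GPUs, ~8e12 B/s HBM3e per GPU, spec-sheet figure):
print(f"GB200 NVL72:  {serial_tokens_per_s(72, 8e12):6.1f} tok/s")     # ~144
```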
Before CloudMatrix 384, China had all the ingredients except chip/HBM manufacturing capability, large scale-up world systems, and possibly feeling the AGI (I'm not sure companies like Alibaba wouldn't go full Google/DeepMind on AGI if they had the compute). Now that they have large scale-up world systems, there are fewer missing ingredients for entering the AGI race in earnest. This is no small thing: GB200 NVL72 is essentially the first and only modern large scale-up world size system for AI that uses an all-to-all topology (sufficient for fast inference or reasoning training of a reasoning model at the scale of a hypothetical GPT-4.5-thinking). The only other alternative is Google's TPUs, which use a 3D torus topology that constrains applications somewhat but seems sufficient for AI. The new Gemini 2.5 report says it was trained on TPU-v5p: strong systems built out of relatively weak 0.5e15 BF16 FLOP/s chips, each about 2x slower than an H100.
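For concreteness, a rough spec-sheet comparison of these scale-up domains (the per-chip dense BF16 figures and maximum domain sizes are public numbers, but treat all of them as approximate):

```python
# Approximate scale-up-domain comparison: chips per domain, dense BF16
# FLOP/s per chip, and intra-domain topology. Spec-sheet figures, not
# measurements; GB200 per-GPU throughput in particular is an approximation.
systems = {
    "GB200 NVL72":  (72,   2.5e15,  "all-to-all (NVLink)"),
    "TPU v5p pod":  (8960, 0.46e15, "3D torus (ICI)"),     # the ~0.5e15 above
    "8x H100 node": (8,    0.99e15, "all-to-all, but tiny"),
}
for name, (chips, flops, topo) in systems.items():
    print(f"{name:13s} {topo:22s} {chips * flops:.2e} FLOP/s per domain")
```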
Unlike any country other than the USA, China has manufacturing of everything else down, has enough AI researchers, and has the potential to quickly fund and execute construction of sufficiently large training systems, if only they had the chips. And in principle they can produce 7nm chips, just not at a low enough defect density that the yield for the reticle-sized AI compute dies is good enough to ramp their production. (The situation with HBM might be even worse, but then the export controls on HBM remain more porous.)
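The yield point can be made concrete with the standard Poisson yield approximation; the defect densities below are illustrative assumptions, not anyone's actual fab data:

```python
import math

# First-order Poisson yield model: P(die has zero killer defects)
# = exp(-die_area * defect_density). Reticle-sized AI dies are ~8 cm^2
# (H100 is ~8.1 cm^2), so yield collapses fast as defect density rises.
DIE_AREA_CM2 = 8.0

for d0 in (0.05, 0.1, 0.3, 0.5):  # defects per cm^2, illustrative values
    print(f"D0 = {d0:4.2f}/cm^2 -> die yield ~ {math.exp(-DIE_AREA_CM2 * d0):5.1%}")
# 0.05 -> ~67%, 0.10 -> ~45%, 0.30 -> ~9%, 0.50 -> ~2%
```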
So the point about not having the capability to produce the chips remains crucial and keeps China out of the AGI race that follows AI-2027 rules, unless export controls sufficiently loosen or fail. And without CloudMatrix 384, even obtaining a lot of chips wouldn't have helped with large reasoning models.
(This framing is mostly only relevant on the AI-2027 timeline, since without AGI a few years down the line, either 7nm chips become too power-inefficient to matter compared to the USA's hypothetical future 5GW+ training systems built out of 1nm chips, or, under the pressure of chips-but-not-tools export controls, China sufficiently progresses in domestic chip manufacturing that they move on to being able to produce their own 5nm and then 3nm chips without being left too far behind in power-efficiency.)
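To gesture at the magnitudes, a toy sketch of the power-limited regime with made-up per-chip numbers, where only the perf/W ratio carries the argument:

```python
# In a fixed-power buildout, total compute scales with FLOP/s per watt, so
# a process-node efficiency gap directly shrinks the effective training
# system. Every number here is an assumed stand-in, not a real chip spec.
SITE_POWER_W = 5e9   # the hypothetical 5GW+ system
OVERHEAD = 1.5       # assumed site power / accelerator power (cooling etc.)

def site_flops(chip_watts: float, chip_flops: float) -> float:
    chips = SITE_POWER_W / OVERHEAD / chip_watts
    return chips * chip_flops

print(f"7nm-class chip: {site_flops(600, 1e15):.2e} FLOP/s")  # assumed specs
print(f"1nm-class chip: {site_flops(600, 5e15):.2e} FLOP/s")  # assumed 5x perf/W
```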
> in principle they can produce 7nm chips, just not at a low enough defect density that the yield for the reticle-sized AI compute dies is good enough to ramp their production.
Unfortunately, I doubt that China will fail to mitigate the effects of defective chips. Adding noise to the weights is already used, for example, to uncover sandbagging, and a model that tolerates deliberately injected noise should also tolerate the errors introduced by partially defective dies.
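A minimal sketch of that weight-noise technique, assuming a PyTorch model; the model and noise scale below are placeholders:

```python
import torch

def add_weight_noise(model: torch.nn.Module, rel_std: float = 0.01) -> None:
    """Perturb each parameter with Gaussian noise scaled to that
    parameter's own standard deviation."""
    with torch.no_grad():
        for p in model.parameters():
            p.add_(torch.randn_like(p) * p.std() * rel_std)

model = torch.nn.Linear(16, 16)  # placeholder for a real checkpoint
add_weight_noise(model, rel_std=0.01)
# Evaluate before/after: a capability that *improves* under noise is
# evidence of sandbagging; graceful degradation under noise suggests
# tolerance to small hardware-induced weight errors.
```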
> China sufficiently progresses in domestic chip manufacturing that they move on to being able to produce their own 5nm and then 3nm chips without being left too far behind in power-efficiency.
Xiaomi is already asking us to hold its beer while it tries to produce 3nm chips. The hopeful case for the USA is that China ends up obtaining such chips at an insufficient rate.
> unless export controls sufficiently loosen or fail
China is already likely to openly buy NVIDIA-produced chips, or to undermine the USA's project by invading Taiwan. If I make the flawed assumption that China has no smuggled chips and forever increases its compute production fivefold per year (while the USA's grows 1.5x every four months[1]), then, as I tried to show in my severely downvoted post, the USA is unlikely to keep its lead by slowing down. What about a world without these flawed assumptions?
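Compounding those two growth rates directly (with an arbitrary 10x USA head start, purely for illustration):

```python
# Compounding the stated growth rates: China 5x per year, USA 1.5x per four
# months (1.5**3 ~= 3.375x per year). The starting ratio (USA ahead 10x) is
# an arbitrary illustrative head start, not an estimate.
usa, china = 10.0, 1.0
for year in range(1, 7):
    usa *= 1.5 ** 3
    china *= 5.0
    print(f"year {year}: USA {usa:10.1f}  China {china:10.1f}  ratio {usa / china:.2f}")
# Under these (flawed, as noted) assumptions, China overtakes in ~6 years.
```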
[1] The latter assumption is almost verbatim lifted from the AI-2027 forecast and didn't take into account the USA's potential weakness.