I suspect DeepSeek is unusually vulnerable to the problem of switching hardware, because as I see it their cost advantage fundamentally comes down to having invested a lot of effort in low-level performance optimization to reduce training and inference costs.
Switching the underlying hardware throws away all of that work. Further, I don't expect Huawei's chips to be as easy to optimize as Nvidia's H-series: the H-series is built largely the way Nvidia has always built its GPUs and is programmed through CUDA, while Huawei's Ascend is supposed to be an entirely new architecture. Lots of people know CUDA; only Huawei's people know how Ascend's memory subsystem works.
If I am right, they got hurt by bad timing this round in the same way they benefited from good timing last round.