The SIMD vector approach works well for many independent dot products, but it is not the same thing as a systolic array for matrix-by-matrix (rank-2 by rank-2) multiplication.
If the job were to multiply two 1024x1024 matrices, a systolic array of 256x256 MACs would be a good choice: it would run sixteen passes (the 1024x1024 output splits into a 4x4 grid of 256x256 tiles), each pass multiplying a 256x1024 strip by a 1024x256 strip in roughly 1024+256 steps.
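A rough sketch of that tiling, in plain Python rather than hardware: each call to `matmul_tile` stands in for one pass of the systolic array, producing one output tile from a row strip of A and a column strip of B. The sizes are scaled down (N=8 for 1024, T=2 for 256) so it runs quickly; the pass count (N/T)^2 = 16 matches the full-size case.

```python
N, T = 8, 2  # scaled-down stand-ins for 1024 (matrix size) and 256 (MAC array size)

def matmul_tile(A, B, ri, ci, n, t):
    """One 'systolic array pass': a t-by-n row strip of A times an
    n-by-t column strip of B gives one t-by-t output tile."""
    return [[sum(A[ri + i][k] * B[k][ci + j] for k in range(n))
             for j in range(t)] for i in range(t)]

def blocked_matmul(A, B, n, t):
    C = [[0] * n for _ in range(n)]
    passes = 0
    for ri in range(0, n, t):        # tile rows of the output
        for ci in range(0, n, t):    # tile columns of the output
            tile = matmul_tile(A, B, ri, ci, n, t)
            for i in range(t):
                for j in range(t):
                    C[ri + i][ci + j] = tile[i][j]
            passes += 1              # one systolic-array pass per tile
    return C, passes

A = [[(i + j) % 5 for j in range(N)] for i in range(N)]
B = [[(i * j) % 7 for j in range(N)] for i in range(N)]
C, passes = blocked_matmul(A, B, N, T)
print(passes)  # 16 passes, just as 1024x1024 needs 16 passes on a 256x256 array
```

The point of the decomposition is that the array size fixes the tile shape, while the step count per pass (k + t, here 1024 + 256) comes from streaming the shared dimension through the array plus draining the pipeline.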
Because the SIMD approach handles 2D-by-2D matrix multiplication poorly, NVIDIA introduced Tensor Cores in the Volta architecture.
Article about it:
https://www.anandtech.com/show/12673/titan-v-deep-learning-deep-dive/3