Because the SIMD approach is a poor fit for multiplying one 2D matrix by another, NVIDIA introduced Tensor Cores in the Volta architecture.
An article about them:
https://www.anandtech.com/show/12673/titan-v-deep-learning-deep-dive/3
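
To make the idea concrete: on Volta, each Tensor Core performs a matrix fused multiply-add on 4x4 tiles, D = A x B + C, as a single hardware operation rather than as many separate vector instructions. The sketch below only models that arithmetic in plain Python. The name `mma_4x4` is hypothetical (it is not an NVIDIA API), and the FP16 input / FP32 accumulate precision behaviour of the real hardware is not modelled.

```python
# Illustrative only: models the arithmetic D = A x B + C on one 4x4 tile.
# mma_4x4 is a hypothetical name, not an NVIDIA API; precision is not modelled.
TILE = 4

def mma_4x4(a, b, c):
    """Return D = A x B + C for 4x4 matrices given as nested lists."""
    d = [[0.0] * TILE for _ in range(TILE)]
    for i in range(TILE):
        for j in range(TILE):
            acc = c[i][j]  # start from the accumulator tile C
            for k in range(TILE):
                acc += a[i][k] * b[k][j]
            d[i][j] = acc
    return d

# Small usage example: A is the identity, so D should equal B + C.
identity = [[1.0 if i == j else 0.0 for j in range(TILE)] for i in range(TILE)]
b = [[float(i * TILE + j) for j in range(TILE)] for i in range(TILE)]
c = [[1.0] * TILE for _ in range(TILE)]
d = mma_4x4(identity, b, c)
```

A larger matrix product would be split into such tiles, with each tile product accumulated into the matching tile of the result.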