quantization
Quantization advances actually go hand in hand with hardware development: check the columns on the right in https://en.wikipedia.org/wiki/Nvidia_DGX#Accelerators (a GPU from 2018 is pretty useless for running inference on an 8-bit quant).
UPD: Actually, this point was already made in the comments yesterday, in different wording!