There are a few reasons why “flops per dollar” may not be the best measure.
1) One may confuse GPU performance with AI-related hardware performance. However, long-term trends in GPU performance are irrelevant, as GPUs began to be used in AI research only in 2012, which resulted in an immediate 50-fold jump in performance compared with CPUs.
2) Moreover, AI hardware began jumping from classical GPUs to tensor cores in 2016, and this provided more than an order of magnitude increase in inference performance in Google’s TPU1.
(2014 NVIDIA K80: 30 gflops/watt (Smith, 2014) vs. 2016 Google TPU1: 575-2300 gflops/watt, but inference only, link)
In other words, slow progress in GPUs should not be taken as evidence of slow growth in AI hardware, as AI computing often jumps from one type of hardware to another, with a gain of 1-2 orders of magnitude in performance at each jump. The next jump will probably be to spiking neural net chips.
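As a rough check of the size of that jump, the quoted efficiency figures can be turned into a ratio and a number of orders of magnitude. This is only a sketch: the `gain` helper and the constant names are made up for illustration, and the inputs are simply the numbers cited above.

```python
import math

# Figures quoted in the text, in gflops per watt.
K80_GFLOPS_PER_WATT = 30.0               # NVIDIA K80, 2014
TPU1_GFLOPS_PER_WATT = (575.0, 2300.0)   # Google TPU1, 2016, inference only


def gain(new, old):
    """Return (ratio, orders of magnitude) of `new` relative to `old`."""
    ratio = new / old
    return ratio, math.log10(ratio)


for tpu in TPU1_GFLOPS_PER_WATT:
    ratio, orders = gain(tpu, K80_GFLOPS_PER_WATT)
    print(f"TPU1 at {tpu:.0f} gflops/watt vs K80: "
          f"{ratio:.1f}x, about {orders:.1f} orders of magnitude")
```

Both ends of the quoted TPU1 range come out between one and two orders of magnitude above the K80, which matches the claim about the size of each jump. The comparison is for inference only.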
Also, flops per dollar is the wrong measure of the total cost of ownership of AI-related hardware.
The biggest cost is not the hardware itself but electricity, data-center usage and the salaries of human AI scientists.
3) Most contemporary TPUs are optimised for lower energy consumption, as energy is the biggest part of their cost of ownership, and the energy cost of AI-related computing (in dollar-flops) declines quicker than the price of the hardware itself (in flops-dollar). For example, an AI accelerator for convolutional neural networks planned for 2019 delivers 24 TeraOPs/watt (Gyrphalcon, 2018), which is about 3 orders of magnitude better than the NVIDIA K80 from 2014, which had 30 gigaflops per watt.
4) Also, one should calculate the total cost of using the data center, not only the price of the kWh consumed by a GPU. “The average annual data center cost per kW ranges from $5,467 for data centers larger than 50,000 square feet to $26,495 for facilities that are between 500 and 5,000 square feet in size” (link). This is 6-30 times more than the price of the electricity itself, and the cost is smaller for larger data centers. This means that larger computers are cheaper to own, but this property can’t be measured by flops-per-dollar analysis.
5) The biggest part of the cost of ownership now is AI scientists’ salaries. A team of good scientists could cost a few million dollars a year, let’s say 5 million. This means that every additional day of computation costs more than ten thousand dollars in “idle” human power alone, and thus the total cost of AI research is smaller if very powerful computers are used. That is why there are many attempts to scale up the size of computers for AI. For example, Sony in November 2018 found a way to train ResNet-50 on a massive GPU cluster in only 4 minutes.
6) The “dollar-flops” measure cannot capture an important capability: parallelization. Early GPUs were limited in data-exchange capabilities, and could train only nets with a few million parameters using their dozen gigabytes of memory. Now progress in programming and hardware allows thousands of GPUs to be connected into one system, and neural nets with hundreds of millions of parameters to be trained.
7) “Dollar-flops” also doesn’t capture the progress of the “intellect stack”, that is, the vertical integration between high-level programming languages, low-level assembler-like languages and the hardware, which is an important part of NVIDIA’s success. It includes CUDA, Keras, libraries, YouTube tutorials, you name it.
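The arithmetic behind points 3 to 5 can be checked with a short sketch. Two inputs are assumptions rather than figures from the text: an electricity price of $0.10 per kWh (the text does not state which price it used) and a 365-day year for spreading the team cost. Treating one "op" as comparable to one "flop" in point 3 is also only a magnitude-level assumption. All names are made up for illustration.

```python
import math

HOURS_PER_YEAR = 24 * 365  # a constant 1 kW load uses this many kWh per year

# Point 3: energy efficiency, using the figures quoted in the text.
# Assumption: ops and flops are treated as comparable for a magnitude check.
ACCELERATOR_GOPS_PER_WATT = 24_000.0  # 24 TeraOPs/watt
K80_GFLOPS_PER_WATT = 30.0
efficiency_ratio = ACCELERATOR_GOPS_PER_WATT / K80_GFLOPS_PER_WATT
efficiency_orders = math.log10(efficiency_ratio)

# Point 4: full data-center cost per kW-year vs. the electricity alone.
ASSUMED_PRICE_PER_KWH = 0.10  # USD, hypothetical; not given in the text
electricity_per_kw_year = ASSUMED_PRICE_PER_KWH * HOURS_PER_YEAR
DATA_CENTER_COST_PER_KW_YEAR = {
    "larger than 50,000 sq ft": 5_467.0,
    "500 to 5,000 sq ft": 26_495.0,
}
cost_multiples = {
    size: cost / electricity_per_kw_year
    for size, cost in DATA_CENTER_COST_PER_KW_YEAR.items()
}

# Point 5: cost of one extra day of waiting for a result.
TEAM_COST_PER_YEAR = 5_000_000.0  # USD, the text's "let's say" figure
idle_cost_per_day = TEAM_COST_PER_YEAR / 365  # assumes a 365-day year

print(f"efficiency gain: {efficiency_ratio:.0f}x "
      f"(about {efficiency_orders:.1f} orders of magnitude)")
for size, multiple in cost_multiples.items():
    print(f"data center {size}: {multiple:.1f}x the electricity cost")
print(f"idle team cost per day: ${idle_cost_per_day:,.0f}")
```

Under these assumptions the data-center multiples land near the 6-30 range given in point 4, and a day of waiting costs the team on the order of ten thousand dollars. A different electricity price or a working-day calendar would shift the numbers but not the direction of the argument.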
In other words, “flops per dollar” is not a good proxy for AI hardware growth, and using it may lead to underestimating how fast AI is approaching.