Because it’s what they can get. A factor of two or more in compute is plausibly less important than a delay of a year.
This may or may not be the case, but the argument for why it can’t be very different fails.
Well, within reason that can happen. I am not saying the metric is going to be perfect. But it’s probably a decent first-order approximation, because that logic can’t stretch forever: if it were a factor of 10 instead of a factor of 2, the trade-off would probably not be worth it.
Data. Find out the answer.
https://www.wevolver.com/article/tpu-vs-gpu-a-comprehensive-technical-comparison
Looks like they are within 2x of the H200s, albeit with some complexity in the details.
Thanks! I guess my original statement came off a bit too strong, but what I meant is that while there is a frontier for trade-offs (maybe the GPUs’ greater flexibility is worth the 2x energy cost?), I didn’t expect the gap to be orders of magnitude. That’s good enough for me, with the understanding that any such estimates will never be particularly accurate anyway and just give us a rough idea of how much compute these companies are actually fielding. What they squeeze out of that will depend on a bunch of other details anyway, so scale is the best we can guess.