My perspective is a bit different. My impression is that, for any algorithm whatsoever, an ASIC tailored to that algorithm will run it much better than a general-purpose chip can.
Why do you think that? ASICs seem to benefit primarily from hardwiring control flow and removing overhead. The more control flow, the less the ASIC helps. Cryptocurrencies, starting with memory-hard PoWs, have been experimenting with ASIC resistance for a long time now. As I understand it, ASIC-resistance has succeeded in the sense that despite enormous financial incentives and over half a decade, the best ASICs for PoWs designed to be ASIC-resistant typically have a small constant factor improvement like 3x, and nothing remotely like the 10,000x speedup you might get from CPU video codec → ASIC. You can also point to lots of AI algorithms which people don’t bother putting on GPUs because they lack the intrinsic parallelism & are control-flow heavy.
I don’t think I disagree much. When I said “much better” I was thinking to myself “as much as 10x!” not “as much as 10,000x!”
Yes, there are lots of AI algorithms that people don’t put on GPUs. I just suspect that if people were spending many millions of dollars running those particular AI algorithms, for many consecutive years, they would probably eventually find it worth their while to make an ASIC for that algorithm. (And that ASIC might or might not look anything like a GPU.)
If crypto people are specifically designing algorithms to be un-ASIC-able, I’m not sure we should draw broader lessons from that. Like, of course off-the-shelf CPUs are going to be almost perfectly optimal for some algorithm out of the space of all possible algorithms.
Anyway, even if my previous comment (“any algorithm whatsoever”) is wrong (taken literally, it certainly is, see previous sentence, sorry for being sloppy), I’m somewhat more confident about the subset of algorithms that are AGI-relevant, since those will (I suspect) have quite a bit of parallelizability. For example, the Chen et al. algorithm described in OP sounds pretty parallelizable (IIUC), even if it can’t be parallelized by today’s GPUs.