I point this out now and then: the von Neumann (VN) bottleneck.
It's really just a simple consequence of scaling geometry. Compute scales with device surface area (for 2D chips) or volume (for 3D systems like the brain), while bandwidth/interconnect only scales with the boundary, one dimension lower.
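As a rough back-of-the-envelope sketch (my own toy numbers, not any real chip's specs), here's that surface-vs-volume scaling in a few lines of Python:

```python
# Toy illustration: a system of linear size L in d dimensions.
# Compute elements fill the bulk (~L^d), while off-device bandwidth
# crosses the boundary (~L^(d-1)), so the compute-to-bandwidth ratio
# grows linearly with L as you scale up.
for d, label in [(2, "2D chip"), (3, "3D system (brain-like)")]:
    for L in [10, 100, 1000]:
        compute = L ** d          # bulk: "compute elements"
        bandwidth = L ** (d - 1)  # boundary: "wires off the edge/surface"
        print(f"{label}: L={L:<5} compute={compute:<12} "
              f"bandwidth={bandwidth:<9} ratio={compute / bandwidth:.0f}")
```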
A few years back, VCs were fooled by a number of well-meaning startups based on the pitch: “We can just make a big matmul chip like a GPU but with far more on-chip SRAM and thereby avoid the VN bottleneck!” But Nvidia is in fact pretty smart and understands exactly why this approach doesn't actually work (at least not yet with SRAM), and much money was wasted.
I used to be pretty excited about neuromorphic computing around 2010 or so. I still am, but today it still seems to be about a decade away.
Including Cerebras?