Many of the points you make are technically correct but aren’t binding constraints. As an example, diffusion is slow over long distances, but biology tends to work at µm scales, where it is more than fast enough and delivers quite high power densities. Tiny fractal-like microstructure is nature’s secret weapon.
The points about delay (synapse delay and conduction velocity) are valid, though phrasing everything in terms of diffusion speed is not ideal. In the long run, 3D silicon (or successor) devices should beat the brain on processing latency and possibly on energy efficiency.
Still, pointing at diffusion as the underlying problem seems a little odd.
You’re ignoring things like:
ability to separate training and running of a model
spending much more on training to improve model efficiency is worthwhile since training costs are shared across all running instances
ability to train in parallel using a lot of compute
current models are fully trained in <0.5 years
ability to keep going past current human tradeoffs and do rapid iteration
Human brain development operates on evolutionary time scales
increasing human brain size by 10x won’t happen anytime soon but can be done for AI models.
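The train-once/run-everywhere point can be made quantitative with a toy amortization calculation. The sketch below uses entirely hypothetical dollar figures and instance counts, purely to illustrate why spending far more on training pays off once the cost is shared across all deployed copies:

```python
# Toy amortization: training cost is paid once, then shared across
# every inference run by every deployed copy of the model.
# All numbers are hypothetical, chosen only to illustrate scaling.

def training_cost_per_inference(training_cost, instances, inferences_per_instance):
    """Training cost amortized over all inferences ever run."""
    return training_cost / (instances * inferences_per_instance)

# A 10x larger training budget barely matters per inference once the
# model is copied across a million instances.
base = training_cost_per_inference(1e7, 1_000_000, 1e6)    # $10M training run
bigger = training_cost_per_inference(1e8, 1_000_000, 1e6)  # $100M training run

print(base, bigger)  # dollars per inference
```

Nothing analogous exists for a brain: the "training" cannot be copied out and shared.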
People like Hinton typically point to these as advantages, and they mostly come down to the nature of digital models as copy-able data, not anything related to diffusion.
Energy processing
Lungs are support equipment. Their size isn’t that interesting. Normal computers, once you get off chip, have large structures for heat dissipation. Data centers can spend quite a lot of energy/equipment-mass getting rid of heat.
The highest biological power density is bird flight muscle, which produces around 1 W/cm³ of mechanical power; the mitochondria in that tissue produce more than 3 W/cm³ of chemical (ATP) power. Brain power density is a lot lower: a typical human brain is about 20 W / 1200 cm³ ≈ 0.017 W/cm³.
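The arithmetic is easy to check directly. The sketch below uses the commonly cited ~20 W resting power budget and ~1200 cm³ volume for a human brain (order-of-magnitude figures, not precise measurements):

```python
# Rough power-density comparison (order of magnitude only).
# 20 W is the commonly cited resting power of a human brain;
# 1200 cm^3 is a typical brain volume.

brain_watts = 20.0
brain_volume_cm3 = 1200.0
brain_density = brain_watts / brain_volume_cm3  # W/cm^3

muscle_density = 1.0  # W/cm^3, bird flight muscle (mechanical output)
mito_density = 3.0    # W/cm^3, ATP output of its mitochondria

print(f"brain: {brain_density:.3f} W/cm^3")
print(f"flight muscle is ~{muscle_density / brain_density:.0f}x denser in power")
```

So even the metabolically hungry brain sits well over an order of magnitude below biology's best power density.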
synapse delay
This is a legitimate concern. Biology had to make some tradeoffs here. There are a lot of places where direct mechanical connections would be great but biology uses diffusing chemicals.
Electrical synapses exist and have negligible delay, though they are much less flexible (they can’t implement inhibitory connections, and signals pass both ways through the connection).
conduction velocity
The slow diffusion/drift speed of charge carriers is a valid point and is related to the roughly 10^8 factor difference in electrical conductivity between neuronal saltwater and copper. Conduction speed is an electrical problem: there’s about a 100x difference in conduction speed between myelinated (~100 m/s) and unmyelinated (~1 m/s) axons.
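To put those velocities in perspective, here is the one-way signalling delay over a ~10 cm path (roughly the span of a brain). The speeds are illustrative assumptions: ~1 m/s for an unmyelinated axon, ~100 m/s for a fast myelinated axon, and ~2e8 m/s (a typical fraction of light speed) for a signal on a copper line:

```python
# One-way signal delay over a 10 cm path at different conduction speeds.
# Speeds are ballpark assumptions, not precise physiological values.

distance_m = 0.10  # ~brain-scale distance

speeds_m_per_s = {
    "unmyelinated axon (~1 m/s)": 1.0,
    "myelinated axon (~100 m/s)": 100.0,
    "signal on copper (~2e8 m/s)": 2e8,
}

for name, v in speeds_m_per_s.items():
    delay_ms = distance_m / v * 1e3
    print(f"{name}: {delay_ms:.7f} ms")
```

The spread is roughly eleven orders of magnitude between an unmyelinated axon and a wire, which is why conduction velocity is a real disadvantage for biology.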
compensating disadvantages of current digital logic
The brain runs at 100-1000 Hz vs 1GHz for computers (10^6 − 10^7 x slower). It would seem at first glance that digital logic is much better.
The brain has the advantage of being 3D, compared to 2D chips, which means less need to move data long distances. Modern deep learning systems need to move all their synapse-weight-like data from memory onto the chip during each inference step. You can do better by splitting a model across many chips, but this is expensive and can be inefficient.
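The cost of that data movement is easy to estimate. As a hypothetical example, take a 70B-parameter model at 2 bytes per weight served from memory with ~3 TB/s of bandwidth (roughly HBM-class numbers; none of these figures describe a specific product):

```python
# Lower bound on per-token latency when every weight must be streamed
# from memory once per inference step (batch size 1, single device).
# All numbers are illustrative assumptions.

params = 70e9        # model parameters
bytes_per_param = 2  # fp16/bf16 weights
bandwidth = 3e12     # bytes/s of memory bandwidth

weight_bytes = params * bytes_per_param           # 140 GB of weights
min_seconds_per_token = weight_bytes / bandwidth  # bandwidth-bound floor

print(f"{min_seconds_per_token * 1e3:.1f} ms per token, minimum")
```

That bandwidth-bound floor of tens of milliseconds per token is exactly the kind of long-distance data movement a 3D, compute-in-place substrate like the brain largely avoids.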
In the long run, silicon (or something else) will beat brains in speed and perhaps a little in energy efficiency. If this fellow is right about lower-loss interconnects, that’s another ~3 OOM of energy efficiency.
But again, that’s not what’s making current models work. It’s their nature as copy-able digital data that matters much more.