I mean, the easiest solution is just “make it smaller and use active cooling”. The relevant loopholes in Jacob’s argument are in the Density and Temperature section of his Brain Efficiency post.
Jacob is using a temperature formula for blackbody radiators, which is basically just irrelevant to the temperature of realistic compute substrates—brains, chips, and probably future compute substrates are all cooled by conduction through direct contact with something cooler (blood for the brain, heatsink/air for a chip). The obvious law to use instead would just be the standard thermal conduction law: heat flow per unit area proportional to temperature gradient.
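(For reference, that conduction law is Fourier's law; in the steady-state slab approximation, heat flow per unit area across a contact layer of conductivity $k$ and thickness $\ell$ is

$$\frac{q}{A} \;=\; k\,\frac{\Delta T}{\ell},$$

with $\Delta T$ the temperature drop across the layer.)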
Jacob’s analysis in that section also fails to adjust for how, by his own model in the previous section, power consumption scales linearly with system size (and also scales linearly with temperature).
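Written out (paraphrasing that model, not Jacob's exact notation): near the interconnect Landauer limit, energy per bit transmitted scales with both temperature and wire length, so total power at fixed architecture and speed goes as

$$q \;=\; C_1\, T_S\, R,$$

with wire length supplying the factor of $R$ and the Landauer bound supplying the factor of $T_S$; this is the numerator in the formula below.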
Put all that together, and we get:
$$\frac{q}{A} \;=\; \frac{C_1 T_S R}{R^2} \;=\; \frac{C_2\,(T_S - T_E)}{R}$$
… where:
- $R$ is the radius of the system
- $A$ is the surface area of thermal contact
- $q$ is the heat flow out of the system
- $T_S$ is the system temperature
- $T_E$ is the environment temperature (e.g. blood or heat sink temperature)
- $C_1, C_2$ are constants with respect to system size and temperature
(Of course a spherical approximation is not great, but we’re mostly interested in change as all the dimensions scale linearly, so the geometry shouldn’t matter for our purposes.)
First key observation: all the $R$'s cancel out. If we scale down by a factor of 2, the power consumption is halved (since every wire is half as long), the area is quartered (so power density over the surface is doubled), and the temperature gradient at a given temperature delta is doubled (since the heat is conducted across half the distance). So, overall, the equilibrium temperature stays the same as the system scales down.
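To see the cancellation explicitly, multiply both sides of the formula above by $R$:

$$\frac{C_1 T_S R}{R^2} = \frac{C_2\,(T_S - T_E)}{R} \quad\Longrightarrow\quad C_1 T_S = C_2\,(T_S - T_E),$$

which contains no $R$ at all.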
So in fact scaling down is plausibly free, for purposes of heat management. (Though I’m not highly confident that would work in practice. In particular, I’m least confident about the temperature gradient scaling with system size. If that failed, then the temperature delta relative to the environment would scale at-worst ~linearly with inverse size, i.e. halving the size would double the temperature delta.)
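To spell out that worst case (a sketch, under the assumption that the conduction distance is some fixed $\ell$ rather than scaling with $R$; $\ell$ and $C_2'$ are new symbols here):

$$\frac{C_1 T_S R}{R^2} = \frac{C_2'\,(T_S - T_E)}{\ell} \quad\Longrightarrow\quad T_S - T_E = \frac{C_1 \ell}{C_2'} \cdot \frac{T_S}{R},$$

so at roughly constant $T_S$, the delta scales as $1/R$.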
On top of that, we could of course just use a colder environment, i.e. pump liquid nitrogen or even liquid helium over the thing. According to this meta-analysis, the average temperature delta between e.g. brain and blood is at most ~2.5 °C, so even liquid nitrogen would be enough to achieve a ~100x larger temperature delta if the system were at the same temperature as the brain; we don’t even need to go to liquid helium for that.
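Rough numbers behind that ~100x: the brain runs at ~310 K, and liquid nitrogen boils at ~77 K, so

$$\frac{310\,\mathrm{K} - 77\,\mathrm{K}}{2.5\,\mathrm{K}} \;=\; \frac{233\,\mathrm{K}}{2.5\,\mathrm{K}} \;\approx\; 93.$$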
In terms of scaling, our above formula says that $T_S$ will scale proportionally to $T_E$. Halve the environment temperature, halve the system temperature. And that result I do expect to be pretty robust (for systems near Jacob’s interconnect Landauer limit), since it just relies on the temperature scaling of the Landauer limit plus heat flow being proportional to the temperature delta.
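Explicitly, solving the $R$-free equilibrium condition for $T_S$:

$$C_1 T_S = C_2\,(T_S - T_E) \quad\Longrightarrow\quad T_S = \frac{C_2}{C_2 - C_1}\, T_E,$$

a fixed multiple of $T_E$ (assuming $C_2 > C_1$, i.e. the cooling can keep up at all).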
> I mean, the easiest solution is just “make it smaller and use active cooling”.
The brain already uses active liquid cooling of course, so this is just “make it smaller and cool it harder”.
I have not had time to investigate your claimed physics on how cooling scales, but I’m skeptical—pumping a working coolant through the compute volume can only extract a limited, constant amount of heat from the volume per unit of coolant flowing per unit time (this should be obvious?), and thus the amount of heat that can be removed must scale strictly with the surface area (assuming you’ve already maxed out the cooling effect per unit coolant).
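A minimal sketch of the constraint being invoked here, assuming single-phase sensible-heat cooling (the function name and all numbers are illustrative, not from the article):

```python
# Sensible-heat limit on single-phase liquid cooling: a coolant stream
# can carry away at most
#     q_max = mdot * c_p * (T_out - T_in)
# watts, where mdot is the mass flow rate [kg/s], c_p the coolant's
# specific heat [J/(kg*K)], and T_out - T_in how far the coolant is
# allowed to warm up. Illustrative numbers only.

def max_heat_removal(mdot: float, c_p: float, delta_T: float) -> float:
    """Upper bound (W) on heat removable by a single-phase coolant stream."""
    return mdot * c_p * delta_T

# Water coolant (c_p ~ 4186 J/(kg*K)), 0.01 kg/s, allowed to warm by 10 K:
print(max_heat_removal(0.01, 4186.0, 10.0))  # -> 418.6 W
```

Once the allowed warm-up and per-channel flow are maxed out, removing more heat means more coolant throughput, and the channels carrying it have to cross the system's boundary, hence the surface-area scaling.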
So reduce radius by 2x and you reduce surface area and thus heat pumped out by 4x, but only reduce heat production via reducing wire length by at most 2x as I described in the article.
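In symbols, under that surface-area-limited assumption (my restatement of the claim):

$$q_{\text{removable}} \propto R^2, \qquad q_{\text{produced}} \propto R \quad\Longrightarrow\quad \frac{q_{\text{produced}}}{q_{\text{removable}}} \propto \frac{1}{R},$$

so every 2x reduction in radius doubles the gap between heat produced and heat you can pump out.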
Active cooling ends up using more energy, as you are probably aware. Moving to a colder environment is of course feasible (and used to some extent by some datacenters), but that hardly gets OOM gains on Earth.