The Second Law of Thermodynamics, and Engines of Cognition

The first law of thermodynamics, better known as Conservation of Energy, says that you can’t create energy from nothing: it prohibits perpetual motion machines of the first type, which run and run indefinitely without consuming fuel or any other resource. According to our modern view of physics, energy is conserved in each individual interaction of particles. By mathematical induction, we see that no matter how large an assemblage of particles may be, it cannot produce energy from nothing—not without violating what we presently believe to be the laws of physics.

This is why the US Patent Office will summarily reject your amazingly clever proposal for an assemblage of wheels and gears that cause one spring to wind up another as the first runs down, and so continue to do work forever, according to your calculations. There’s a fully general proof that at least one wheel must violate (our standard model of) the laws of physics for this to happen. So unless you can explain how one wheel violates the laws of physics, the assembly of wheels can’t do it either.

A similar argument applies to a “reactionless drive”, a propulsion system that violates Conservation of Momentum. In standard physics, momentum is conserved for all individual particles and their interactions; by mathematical induction, momentum is conserved for physical systems whatever their size. If you can visualize two particles knocking into each other and always coming out with the same total momentum that they started with, then you can see how scaling it up from particles to a gigantic complicated collection of gears won’t change anything. Even if there’s a trillion quadrillion atoms involved, 0 + 0 + … + 0 = 0.
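
To see the induction concretely, here is a minimal Python sketch (an illustration with made-up numbers, not a physics engine): equal-mass 1-D elastic collisions simply swap velocities, so no amount of pairwise interaction ever budges the total momentum.

```python
import random

# Equal-mass 1-D elastic collision: the particles swap velocities, so each
# interaction changes the total momentum by exactly zero -- and therefore
# so does any number of interactions.

def elastic_collision(v1, v2):
    """Equal-mass 1-D elastic collision: velocities are exchanged."""
    return v2, v1

velocities = [random.uniform(-1.0, 1.0) for _ in range(100_000)]
total_before = sum(velocities)

for _ in range(100_000):  # a big complicated assemblage of interactions
    i, j = random.randrange(len(velocities)), random.randrange(len(velocities))
    velocities[i], velocities[j] = elastic_collision(velocities[i], velocities[j])

total_after = sum(velocities)
print(total_after - total_before)  # 0 + 0 + ... + 0 = 0 (up to float rounding)
```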

But Conservation of Energy, as such, cannot prohibit converting heat into work. You can, in fact, build a sealed box that converts ice cubes and stored electricity into warm water. It isn’t even difficult. Energy cannot be created or destroyed: The net change in energy, from transforming (ice cubes + electricity) to (warm water), must be 0. So it couldn’t violate Conservation of Energy, as such, if you did it the other way around...
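
For concreteness, here is the energy bookkeeping for such a box, with illustrative figures assumed (a tenth of a kilogram of ice, warmed to 30 °C):

```python
# Energy balance for the sealed box, with numbers assumed for illustration:
# 0.1 kg of ice at 0 degrees C becomes liquid water at 30 degrees C.
m = 0.1        # kg of ice
L_f = 334e3    # latent heat of fusion of water, J/kg
c = 4186.0     # specific heat of liquid water, J/(kg*K)
dT = 30.0      # temperature rise of the melted water, K

electricity_in = m * L_f + m * c * dT
print(electricity_in)  # ~45,958 J drawn from storage; net energy change is 0
```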

Perpetual motion machines of the second type, which convert warm water into electrical current and ice cubes, are prohibited by the Second Law of Thermodynamics.

The Second Law is a bit harder to understand, as it is essentially Bayesian in nature.

Yes, really.

The essential physical law underlying the Second Law of Thermodynamics is a theorem which can be proven within the standard model of physics: In the development over time of any closed system, phase space volume is conserved.

Let’s say you’re holding a ball high above the ground. We can describe this state of affairs as a point in a multidimensional space, at least one of whose dimensions is “height of ball above the ground”. Then, when you drop the ball, it moves, and so does the dimensionless point in phase space that describes the entire system that includes you and the ball. “Phase space”, in physics-speak, means that there are dimensions for the momentum of the particles, not just their position—i.e., a system of 2 particles would have 12 dimensions, 3 dimensions for each particle’s position, and 3 dimensions for each particle’s momentum.
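
As a toy illustration of that bookkeeping:

```python
# Illustrative only: a 2-particle system as one point in 12-dimensional phase
# space -- 3 position coordinates plus 3 momentum coordinates per particle.
ball = {"position": (0.0, 0.0, 2.0), "momentum": (0.0, 0.0, -0.1)}
hand = {"position": (0.0, 0.1, 2.0), "momentum": (0.0, 0.0, 0.0)}

phase_point = (ball["position"] + ball["momentum"]
               + hand["position"] + hand["momentum"])

print(len(phase_point))  # 12 -- the whole system is this single point
```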

If you had a multidimensional space, each of whose dimensions described the position of a gear in a huge assemblage of gears, then as you turned the gears a single point would swoop and dart around in a rather high-dimensional phase space. Which is to say, just as you can view a great big complex machine as a single point in a very-high-dimensional space, so too, you can view the laws of physics describing the behavior of this machine over time, as describing the trajectory of its point through the phase space.

The Second Law of Thermodynamics is a consequence of a theorem which can be proven in the standard model of physics: If you take a volume of phase space, and develop it forward in time using standard physics, the total volume of the phase space is conserved.

For example:

Let there be two systems, X and Y: where X has 8 possible states, Y has 4 possible states, and the joint system (X,Y) has 32 possible states.

The development of the joint system over time can be described as a rule that maps initial points onto future points. For example, the system could start out in X7Y2, then develop (under some set of physical laws) into the state X3Y3 a minute later. Which is to say: if X started in 7, and Y started in 2, and we watched it for 1 minute, we would see X go to 3 and Y go to 3. Such are the laws of physics.

Next, let’s carve out a subspace S of the joint system state. S will be the subspace bounded by X being in state 1 and Y being in states 1-4. So the total volume of S is 4 states.

And let’s suppose that, under the laws of physics governing (X,Y), the states initially in S behave as follows:

X1Y1 → X2Y1
X1Y2 → X4Y1
X1Y3 → X6Y1
X1Y4 → X8Y1

That, in a nutshell, is how a refrigerator works.

The X subsystem began in a narrow region of state space—the single state 1, in fact—and Y began distributed over a wider region of space, states 1-4. By interacting with each other, Y went into a narrow region, and X ended up in a wide region; but the total phase space volume was conserved. 4 initial states mapped to 4 end states.
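
Written out as a few lines of Python, purely to make the volume check explicit:

```python
# The toy dynamics above, as a lookup table. "Volume" here is just a count of
# discrete states; the check shows 4 initial states map onto 4 distinct end
# states, so the volume of S is conserved.
law = {
    ("X1", "Y1"): ("X2", "Y1"),
    ("X1", "Y2"): ("X4", "Y1"),
    ("X1", "Y3"): ("X6", "Y1"),
    ("X1", "Y4"): ("X8", "Y1"),
}

assert len(set(law)) == len(set(law.values())) == 4  # one-to-one: volume conserved
# Y was squeezed into the single state Y1; in exchange, X spread out from
# {X1} to {X2, X4, X6, X8}.
```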

Clearly, so long as total phase space volume is conserved by physics over time, you can’t squeeze Y harder than X expands, or vice versa—for every subsystem you squeeze into a narrower region of state space, some other subsystem has to expand into a wider region of state space.

Now let’s say that we’re uncertain about the joint system (X,Y), and our uncertainty is described by an equiprobable distribution over S. That is, we’re pretty sure X is in state 1, but Y is equally likely to be in any of states 1-4. If we shut our eyes for a minute and then open them again, we will expect to see Y in state 1, but X might be in any of states 2-8. Actually, X can only be in some of states 2-8, but it would be too costly to think out exactly which states these might be, so we’ll just say 2-8.

If you consider the Shannon entropy of our uncertainty about X and Y as individual systems, X began with 0 bits of entropy because it had a single definite state, and Y began with 2 bits of entropy because it was equally likely to be in any of 4 possible states. (There’s no mutual information between X and Y.) A bit of physics occurred, and lo, the entropy of Y went to 0, but the entropy of X went to log2(7) ≈ 2.8 bits. So entropy was transferred from one system to another, and decreased within the Y subsystem; but due to the cost of bookkeeping, we didn’t bother to track some information, and hence (from our perspective) the overall entropy increased.
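
The same bookkeeping, as a quick calculation:

```python
import math

# The entropy bookkeeping from the paragraph above, in bits.
H_X_before = math.log2(1)  # X known to be in state 1: 0 bits
H_Y_before = math.log2(4)  # Y uniform over 4 states: 2 bits

H_Y_after = math.log2(1)   # Y now known to be in state 1: 0 bits
H_X_after = math.log2(7)   # lazily, "X is somewhere in 2-8": ~2.81 bits

print(H_X_before + H_Y_before)  # 2.0 bits before
print(H_X_after + H_Y_after)    # ~2.81 bits after -- entropy increased
# With exact tracking, X would be uniform over the 4 states {2, 4, 6, 8}:
# math.log2(4) = 2 bits, and total entropy would not have increased at all.
```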

If there was a physical process that mapped past states onto future states like this:

X2,Y1 → X2,Y1
X2,Y2 → X2,Y1
X2,Y3 → X2,Y1
X2,Y4 → X2,Y1

Then you could have a physical process that would actually decrease entropy, because no matter where you started out, you would end up at the same place. The laws of physics, developing over time, would compress the phase space.
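
Running the same volume check on this map shows the compression directly:

```python
# The forbidden dynamics: four distinct pasts merging into one future.
forbidden_law = {
    ("X2", "Y1"): ("X2", "Y1"),
    ("X2", "Y2"): ("X2", "Y1"),
    ("X2", "Y3"): ("X2", "Y1"),
    ("X2", "Y4"): ("X2", "Y1"),
}

print(len(set(forbidden_law)), "->", len(set(forbidden_law.values())))  # 4 -> 1
# Phase space volume shrank from 4 states to 1. Liouville's Theorem says real
# physical dynamics are one-to-one, so this map cannot occur.
```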

But there is a theorem, Liouville’s Theorem, provable within our laws of physics, which says that this never happens: phase space volume is conserved.

The Second Law of Thermodynamics is a corollary of Liouville’s Theorem: no matter how clever your configuration of wheels and gears, you’ll never be able to decrease entropy in one subsystem without increasing it somewhere else. When the phase space of one subsystem narrows, the phase space of another subsystem must widen, and the joint space keeps the same volume.

Except that what was initially a compact phase space may develop squiggles and wiggles and convolutions; so that to draw a simple boundary around the whole mess, you must draw a much larger boundary than before—this is what gives the appearance of entropy increasing. (And in quantum systems, where different universes go different ways, entropy actually does increase in any local universe. But omit this complication for now.)

The Second Law of Thermodynamics is actually probabilistic in nature—if you ask about the probability of hot water spontaneously entering the “cold water and electricity” state, the probability does exist, it’s just very small. This doesn’t mean Liouville’s Theorem is violated with small probability; a theorem’s a theorem, after all. It means that if you’re in a great big phase space volume at the start, but you don’t know where, you may assess a tiny little probability of ending up in some particular phase space volume. So far as you know, with infinitesimal probability, this particular glass of hot water may be the kind that spontaneously transforms itself to electrical current and ice cubes. (Neglecting, as usual, quantum effects.)

So the Second Law really is inherently Bayesian. When it comes to any real thermodynamic system, it’s a strictly lawful statement of your beliefs about the system, but only a probabilistic statement about the system itself.

“Hold on,” you say. “That’s not what I learned in physics class. In the lectures I heard, thermodynamics is about, you know, temperatures. Uncertainty is a subjective state of mind! The temperature of a glass of water is an objective property of the water! What does heat have to do with probability?”

Oh ye of little trust.

In one direction, the connection between heat and probability is relatively straightforward: If the only fact you know about a glass of water is its temperature, then you are much more uncertain about a hot glass of water than a cold glass of water.

Heat is the zipping around of lots of tiny molecules; the hotter they are, the faster they can go. Not all the molecules in hot water are travelling at the same speed—the “temperature” isn’t a uniform speed of all the molecules, it’s an average speed of the molecules, which in turn corresponds to a predictable statistical distribution of speeds—anyway, the point is that the hotter the water, the faster the water molecules could be going, and hence the more uncertain you are about the velocity (not just speed) of any individual molecule. When you multiply together your uncertainties about all the individual molecules, you will be exponentially more uncertain about the whole glass of water.

We take the logarithm of this exponential volume of uncertainty, and call that the entropy. So it all works out, you see.
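
For the curious, here is a sketch of that calculation, using the Maxwell-Boltzmann distribution; the molecular mass and the temperatures are merely illustrative choices:

```python
import math

# By the Maxwell-Boltzmann distribution, each velocity component of a molecule
# of mass m at temperature T is Gaussian with variance k_B*T/m; the
# differential entropy of such a Gaussian is (1/2)*log2(2*pi*e*variance) bits.
k_B = 1.380649e-23  # Boltzmann constant, J/K
m = 2.99e-26        # approximate mass of one water molecule, kg

def bits_per_velocity_component(T):
    return 0.5 * math.log2(2 * math.pi * math.e * k_B * T / m)

for T in (280.0, 320.0, 360.0):  # kelvin, cold tap water to hot
    print(T, round(bits_per_velocity_component(T), 2))  # grows with log T

# A glass holding N molecules has 3*N such components, and independent
# uncertainties multiply -- so the entropies (their logarithms) just add:
# total velocity entropy is about 3 * N * bits_per_velocity_component(T).
```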

The connection in the other direction is less obvious. Suppose there was a glass of water, about which, initially, you knew only that its temperature was 72 degrees. Then, suddenly, Saint Laplace reveals to you the exact locations and velocities of all the atoms in the water. You now know perfectly the state of the water, so, by the information-theoretic definition of entropy, its entropy is zero. Does that make its thermodynamic entropy zero? Is the water colder, because we know more about it?

Ignoring quantumness for the moment, the answer is: Yes! Yes it is!

Maxwell once asked: Why can’t we take a uniformly hot gas, and partition it into two volumes A and B, and let only fast-moving molecules pass from B to A, while only slow-moving molecules are allowed to pass from A to B? If you could build a gate like this, soon you would have hot gas on the A side, and cold gas on the B side. That would be a cheap way to refrigerate food, right?
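
Here is a crude toy simulation of such a gate (arbitrary units; note that it deliberately leaves out the cost of inspecting each molecule, which is exactly the loophole taken up next):

```python
import random

# A toy of Maxwell's proposal: a gate between volumes A and B that passes
# fast molecules from B to A and slow molecules from A to B.
random.seed(0)
A = [random.gauss(0.0, 1.0) ** 2 for _ in range(1000)]  # molecular "energies"
B = [random.gauss(0.0, 1.0) ** 2 for _ in range(1000)]
threshold = 1.0

for _ in range(20_000):
    if B and (mol := random.choice(B)) > threshold:  # fast molecule: B -> A
        B.remove(mol)
        A.append(mol)
    if A and (mol := random.choice(A)) < threshold:  # slow molecule: A -> B
        A.remove(mol)
        B.append(mol)

print(sum(A) / len(A), sum(B) / len(B))  # A ends up hot, B ends up cold
```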

The agent who inspects each gas molecule, and decides whether to let it through, is known as “Maxwell’s Demon”. And the reason you can’t build an efficient refrigerator this way is that Maxwell’s Demon generates entropy in the process of inspecting the gas molecules and deciding which ones to let through.

But suppose you already knew where all the gas molecules were?

Then you actually could run Maxwell’s Demon and extract useful work.

So (again ignoring quantum effects for the moment), if you know the states of all the molecules in a glass of hot water, it is cold in a genuinely thermodynamic sense: you can take electricity out of it and leave behind an ice cube.

This doesn’t violate Liouville’s Theorem, because if Y is the water, and you are Maxwell’s Demon (denoted M), the physical process behaves as:

M1,Y1 → M1,Y1
M2,Y2 → M2,Y1
M3,Y3 → M3,Y1
M4,Y4 → M4,Y1

Because Maxwell’s Demon knows the exact state of Y, there is mutual information between M and Y. The mutual information decreases the joint entropy of (M,Y): H(M,Y) = H(M) + H(Y) − I(M;Y). M has 2 bits of entropy, Y has 2 bits of entropy, and their mutual information is 2 bits, so (M,Y) has a total of 2 + 2 − 2 = 2 bits of entropy. The physical process just transforms the “coldness” (negentropy) of the mutual information to make the actual water cold—afterward, M has 2 bits of entropy, Y has 0 bits of entropy, and the mutual information is 0. Nothing wrong with that!
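
Checking that arithmetic against an explicit joint distribution:

```python
import math

# The demon's memory M perfectly tracks Y, so only the four "diagonal"
# states of the joint system have any probability.
joint = {("M1", "Y1"): 0.25, ("M2", "Y2"): 0.25,
         ("M3", "Y3"): 0.25, ("M4", "Y4"): 0.25}

def H(dist):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def marginal(dist, index):
    out = {}
    for state, p in dist.items():
        out[state[index]] = out.get(state[index], 0.0) + p
    return out

H_M = H(marginal(joint, 0))   # 2 bits
H_Y = H(marginal(joint, 1))   # 2 bits
H_MY = H(joint)               # 2 bits, not 4: the correlation is doing work
print(H_M + H_Y - H_MY)       # I(M;Y) = 2 bits of mutual information
```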

And don’t tell me that knowledge is “subjective”. Knowledge has to be represented in a brain, and that makes it as physical as anything else. For M to physically represent an accurate picture of the state of Y, M’s physical state must correlate with the state of Y. You can take thermodynamic advantage of that—it’s called a Szilard engine.
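
For scale, the standard Szilard figure for how much work one bit of knowledge can buy, at room temperature:

```python
import math

# One bit of correlation with the system is worth at most k_B * T * ln(2)
# joules of extractable work.
k_B = 1.380649e-23  # Boltzmann constant, J/K
T = 300.0           # room temperature, K

print(k_B * T * math.log(2))  # ~2.87e-21 J per bit -- tiny, but not zero
```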

Or as E.T. Jaynes put it, “The old adage ‘knowledge is power’ is a very cogent truth, both in human relations and in thermodynamics.”

And conversely, one subsystem cannot increase in mutual information with another subsystem, without (a) interacting with it and (b) doing thermodynamic work.

Otherwise you could build a Maxwell’s Demon and violate the Second Law of Thermodynamics—which in turn would violate Liouville’s Theorem—which is prohibited in the standard model of physics.

Which is to say: To form accurate beliefs about something, you really do have to observe it. It’s a very physical, very real process: any rational mind does “work” in the thermodynamic sense, not just the sense of mental effort.

(It is sometimes said that it is erasing bits in order to prepare for the next observation that takes the thermodynamic work—but that distinction is just a matter of words and perspective; the math is unambiguous.)

(Discovering logical “truths” is a complication which I will not, for now, consider—at least in part because I am still thinking through the exact formalism myself. In thermodynamics, knowledge of logical truths does not count as negentropy; as would be expected, since a reversible computer can compute logical truths at arbitrarily low cost. All this that I have said is true of the logically omniscient: any lesser mind will necessarily be less efficient.)

“Forming accurate beliefs requires a corresponding amount of evidence” is a very cogent truth both in human relations and in thermodynamics: if blind faith actually worked as a method of investigation, you could turn warm water into electricity and ice cubes. Just build a Maxwell’s Demon that has blind faith in molecule velocities.

Engines of cognition are not so different from heat engines, though they manipulate entropy in a more subtle form than burning gasoline. For example, to the extent that an engine of cognition is not perfectly efficient, it must radiate waste heat, just like a car engine or refrigerator.

“Cold rationality” is true in a sense that Hollywood scriptwriters never dreamed (and false in the sense that they did dream).

So unless you can tell me which specific step in your argument violates the laws of physics by giving you true knowledge of the unseen, don’t expect me to believe that a big, elaborate clever argument can do it either.