# The Second Law of Thermodynamics, and Engines of Cognition

The first law of ther­mo­dy­nam­ics, bet­ter known as Con­ser­va­tion of En­ergy, says that you can’t cre­ate en­ergy from noth­ing: it pro­hibits per­pet­ual mo­tion ma­chines of the first type, which run and run in­definitely with­out con­sum­ing fuel or any other re­source. Ac­cord­ing to our mod­ern view of physics, en­ergy is con­served in each in­di­vi­d­ual in­ter­ac­tion of par­ti­cles. By math­e­mat­i­cal in­duc­tion, we see that no mat­ter how large an as­sem­blage of par­ti­cles may be, it can­not pro­duce en­ergy from noth­ing—not with­out vi­o­lat­ing what we presently be­lieve to be the laws of physics.

This is why the US Patent Office will summarily reject your amazingly clever proposal for an assemblage of wheels and gears that causes one spring to wind up another as the first runs down, and so continues to do work forever, according to your calculations. There’s a fully general proof that at least one wheel must violate (our standard model of) the laws of physics for this to happen. So unless you can explain how one wheel violates the laws of physics, the assembly of wheels can’t do it either.

A similar ar­gu­ment ap­plies to a “re­ac­tion­less drive”, a propul­sion sys­tem that vi­o­lates Con­ser­va­tion of Mo­men­tum. In stan­dard physics, mo­men­tum is con­served for all in­di­vi­d­ual par­ti­cles and their in­ter­ac­tions; by math­e­mat­i­cal in­duc­tion, mo­men­tum is con­served for phys­i­cal sys­tems what­ever their size. If you can vi­su­al­ize two par­ti­cles knock­ing into each other and always com­ing out with the same to­tal mo­men­tum that they started with, then you can see how scal­ing it up from par­ti­cles to a gi­gan­tic com­pli­cated col­lec­tion of gears won’t change any­thing. Even if there’s a trillion quadrillion atoms in­volved, 0 + 0 + … + 0 = 0.

But Con­ser­va­tion of En­ergy, as such, can­not pro­hibit con­vert­ing heat into work. You can, in fact, build a sealed box that con­verts ice cubes and stored elec­tric­ity into warm wa­ter. It isn’t even difficult. En­ergy can­not be cre­ated or de­stroyed: The net change in en­ergy, from trans­form­ing (ice cubes + elec­tric­ity) to (warm wa­ter), must be 0. So it couldn’t vi­o­late Con­ser­va­tion of En­ergy, as such, if you did it the other way around...

Per­pet­ual mo­tion ma­chines of the sec­ond type, which con­vert warm wa­ter into elec­tri­cal cur­rent and ice cubes, are pro­hibited by the Se­cond Law of Ther­mo­dy­nam­ics.

The Se­cond Law is a bit harder to un­der­stand, as it is es­sen­tially Bayesian in na­ture.

Yes, re­ally.

The es­sen­tial phys­i­cal law un­der­ly­ing the Se­cond Law of Ther­mo­dy­nam­ics is a the­o­rem which can be proven within the stan­dard model of physics: In the de­vel­op­ment over time of any closed sys­tem, phase space vol­ume is con­served.

Let’s say you’re hold­ing a ball high above the ground. We can de­scribe this state of af­fairs as a point in a mul­ti­di­men­sional space, at least one of whose di­men­sions is “height of ball above the ground”. Then, when you drop the ball, it moves, and so does the di­men­sion­less point in phase space that de­scribes the en­tire sys­tem that in­cludes you and the ball. “Phase space”, in physics-speak, means that there are di­men­sions for the mo­men­tum of the par­ti­cles, not just their po­si­tion—i.e., a sys­tem of 2 par­ti­cles would have 12 di­men­sions, 3 di­men­sions for each par­ti­cle’s po­si­tion, and 3 di­men­sions for each par­ti­cle’s mo­men­tum.
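The dimension count above generalizes directly: a classical system of N particles lives in a 6N-dimensional phase space. A two-line sketch of that arithmetic:

```python
# Each particle contributes 3 position dimensions and 3 momentum dimensions,
# so a classical N-particle system has a 6N-dimensional phase space.
def phase_space_dims(n_particles: int) -> int:
    return 6 * n_particles

print(phase_space_dims(2))  # 12, matching the two-particle example above
```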

If you had a mul­ti­di­men­sional space, each of whose di­men­sions de­scribed the po­si­tion of a gear in a huge as­sem­blage of gears, then as you turned the gears a sin­gle point would swoop and dart around in a rather high-di­men­sional phase space. Which is to say, just as you can view a great big com­plex ma­chine as a sin­gle point in a very-high-di­men­sional space, so too, you can view the laws of physics de­scribing the be­hav­ior of this ma­chine over time, as de­scribing the tra­jec­tory of its point through the phase space.

The Se­cond Law of Ther­mo­dy­nam­ics is a con­se­quence of a the­o­rem which can be proven in the stan­dard model of physics: If you take a vol­ume of phase space, and de­velop it for­ward in time us­ing stan­dard physics, the to­tal vol­ume of the phase space is con­served.

For ex­am­ple:

Let there be two sys­tems, X and Y: where X has 8 pos­si­ble states, Y has 4 pos­si­ble states, and the joint sys­tem (X,Y) has 32 pos­si­ble states.

The de­vel­op­ment of the joint sys­tem over time can be de­scribed as a rule that maps ini­tial points onto fu­ture points. For ex­am­ple, the sys­tem could start out in X7Y2, then de­velop (un­der some set of phys­i­cal laws) into the state X3Y3 a minute later. Which is to say: if X started in 7, and Y started in 2, and we watched it for 1 minute, we would see X go to 3 and Y go to 3. Such are the laws of physics.

Next, let’s carve out a sub­space S of the joint sys­tem state. S will be the sub­space bounded by X be­ing in state 1 and Y be­ing in states 1-4. So the to­tal vol­ume of S is 4 states.

And let’s sup­pose that, un­der the laws of physics gov­ern­ing (X,Y) the states ini­tially in S be­have as fol­lows:

X1Y1 → X2Y1
X1Y2 → X4Y1
X1Y3 → X6Y1
X1Y4 → X8Y1

That, in a nut­shell, is how a re­friger­a­tor works.

The X sub­sys­tem be­gan in a nar­row re­gion of state space—the sin­gle state 1, in fact—and Y be­gan dis­tributed over a wider re­gion of space, states 1-4. By in­ter­act­ing with each other, Y went into a nar­row re­gion, and X ended up in a wide re­gion; but the to­tal phase space vol­ume was con­served. 4 ini­tial states mapped to 4 end states.

Clearly, so long as to­tal phase space vol­ume is con­served by physics over time, you can’t squeeze Y harder than X ex­pands, or vice versa—for ev­ery sub­sys­tem you squeeze into a nar­rower re­gion of state space, some other sub­sys­tem has to ex­pand into a wider re­gion of state space.
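The toy law above can be checked mechanically. This sketch just encodes the four transitions as a lookup table and confirms that four distinct initial states map onto four distinct end states:

```python
# The example "law of physics" on the joint system (X, Y): the subspace S
# (X in state 1, Y in states 1-4) develops into four distinct future states.
law = {
    (1, 1): (2, 1),
    (1, 2): (4, 1),
    (1, 3): (6, 1),
    (1, 4): (8, 1),
}

S = set(law)                    # initial volume: 4 states
image = {law[s] for s in S}     # volume of the image after one step

# Phase space volume is conserved: 4 initial states -> 4 distinct end states.
print(len(S), len(image))  # 4 4
```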

Now let’s say that we’re un­cer­tain about the joint sys­tem (X,Y), and our un­cer­tainty is de­scribed by an equiprob­a­ble dis­tri­bu­tion over S. That is, we’re pretty sure X is in state 1, but Y is equally likely to be in any of states 1-4. If we shut our eyes for a minute and then open them again, we will ex­pect to see Y in state 1, but X might be in any of states 2-8. Ac­tu­ally, X can only be in some of states 2-8, but it would be too costly to think out ex­actly which states these might be, so we’ll just say 2-8.

If you con­sider the Shan­non en­tropy of our un­cer­tainty about X and Y as in­di­vi­d­ual sys­tems, X be­gan with 0 bits of en­tropy be­cause it had a sin­gle definite state, and Y be­gan with 2 bits of en­tropy be­cause it was equally likely to be in any of 4 pos­si­ble states. (There’s no mu­tual in­for­ma­tion be­tween X and Y.) A bit of physics oc­curred, and lo, the en­tropy of Y went to 0, but the en­tropy of X went to log2(7) = 2.8 bits. So en­tropy was trans­ferred from one sys­tem to an­other, and de­creased within the Y sub­sys­tem; but due to the cost of book­keep­ing, we didn’t bother to track some in­for­ma­tion, and hence (from our per­spec­tive) the over­all en­tropy in­creased.
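The entropy bookkeeping in this paragraph is simple enough to verify directly; here is a sketch using the Shannon entropy of equiprobable distributions:

```python
import math

def entropy_bits(n_states: int) -> float:
    """Shannon entropy, in bits, of an equiprobable distribution over n states."""
    return math.log2(n_states)

# Before: X is certainly in state 1 (0 bits), Y is uniform over 4 states (2 bits).
before = entropy_bits(1) + entropy_bits(4)
# After: Y is certainly in state 1, but we lazily track X as "somewhere in 2-8",
# i.e. 7 states, rather than working out the exact reachable states.
after = entropy_bits(1) + entropy_bits(7)

print(before, round(after, 3))  # 2.0 2.807 -- overall entropy increased
```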

If there was a phys­i­cal pro­cess that mapped past states onto fu­ture states like this:

X2,Y1 → X2,Y1
X2,Y2 → X2,Y1
X2,Y3 → X2,Y1
X2,Y4 → X2,Y1

Then you could have a phys­i­cal pro­cess that would ac­tu­ally de­crease en­tropy, be­cause no mat­ter where you started out, you would end up at the same place. The laws of physics, de­vel­op­ing over time, would com­press the phase space.
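Encoding this forbidden map the same way makes the compression visible: it is many-to-one, which is exactly the kind of map Liouville's Theorem rules out.

```python
# The hypothetical entropy-decreasing law: four distinct past states all
# land on the single future state (2, 1) -- a many-to-one map.
compressing_law = {
    (2, 1): (2, 1),
    (2, 2): (2, 1),
    (2, 3): (2, 1),
    (2, 4): (2, 1),
}

past_volume = len(compressing_law)                  # 4 states
future_volume = len(set(compressing_law.values()))  # only 1 state survives

print(past_volume, future_volume)  # 4 1 -- phase space volume was compressed
```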

But there is a theorem, Liouville’s Theorem, which can be proven true of our laws of physics, which says that this never happens: phase space volume is conserved.

The Se­cond Law of Ther­mo­dy­nam­ics is a corol­lary of Liou­ville’s The­o­rem: no mat­ter how clever your con­figu­ra­tion of wheels and gears, you’ll never be able to de­crease en­tropy in one sub­sys­tem with­out in­creas­ing it some­where else. When the phase space of one sub­sys­tem nar­rows, the phase space of an­other sub­sys­tem must widen, and the joint space keeps the same vol­ume.

Ex­cept that what was ini­tially a com­pact phase space, may de­velop squig­gles and wig­gles and con­volu­tions; so that to draw a sim­ple bound­ary around the whole mess, you must draw a much larger bound­ary than be­fore—this is what gives the ap­pear­ance of en­tropy in­creas­ing. (And in quan­tum sys­tems, where differ­ent uni­verses go differ­ent ways, en­tropy ac­tu­ally does in­crease in any lo­cal uni­verse. But omit this com­pli­ca­tion for now.)

The Se­cond Law of Ther­mo­dy­nam­ics is ac­tu­ally prob­a­bil­is­tic in na­ture—if you ask about the prob­a­bil­ity of hot wa­ter spon­ta­neously en­ter­ing the “cold wa­ter and elec­tric­ity” state, the prob­a­bil­ity does ex­ist, it’s just very small. This doesn’t mean Liou­ville’s The­o­rem is vi­o­lated with small prob­a­bil­ity; a the­o­rem’s a the­o­rem, af­ter all. It means that if you’re in a great big phase space vol­ume at the start, but you don’t know where, you may as­sess a tiny lit­tle prob­a­bil­ity of end­ing up in some par­tic­u­lar phase space vol­ume. So far as you know, with in­finites­i­mal prob­a­bil­ity, this par­tic­u­lar glass of hot wa­ter may be the kind that spon­ta­neously trans­forms it­self to elec­tri­cal cur­rent and ice cubes. (Ne­glect­ing, as usual, quan­tum effects.)

So the Se­cond Law re­ally is in­her­ently Bayesian. When it comes to any real ther­mo­dy­namic sys­tem, it’s a strictly lawful state­ment of your be­liefs about the sys­tem, but only a prob­a­bil­is­tic state­ment about the sys­tem it­self.

“Hold on,” you say. “That’s not what I learned in physics class,” you say. “In the lec­tures I heard, ther­mo­dy­nam­ics is about, you know, tem­per­a­tures. Uncer­tainty is a sub­jec­tive state of mind! The tem­per­a­ture of a glass of wa­ter is an ob­jec­tive prop­erty of the wa­ter! What does heat have to do with prob­a­bil­ity?”

Oh ye of lit­tle trust.

In one di­rec­tion, the con­nec­tion be­tween heat and prob­a­bil­ity is rel­a­tively straight­for­ward: If the only fact you know about a glass of wa­ter is its tem­per­a­ture, then you are much more un­cer­tain about a hot glass of wa­ter than a cold glass of wa­ter.

Heat is the zip­ping around of lots of tiny molecules; the hot­ter they are, the faster they can go. Not all the molecules in hot wa­ter are trav­el­ling at the same speed—the “tem­per­a­ture” isn’t a uniform speed of all the molecules, it’s an av­er­age speed of the molecules, which in turn cor­re­sponds to a pre­dictable statis­ti­cal dis­tri­bu­tion of speeds—any­way, the point is that, the hot­ter the wa­ter, the faster the wa­ter molecules could be go­ing, and hence, the more un­cer­tain you are about the ve­loc­ity (not just speed) of any in­di­vi­d­ual molecule. When you mul­ti­ply to­gether your un­cer­tain­ties about all the in­di­vi­d­ual molecules, you will be ex­po­nen­tially more un­cer­tain about the whole glass of wa­ter.

We take the log­a­r­ithm of this ex­po­nen­tial vol­ume of un­cer­tainty, and call that the en­tropy. So it all works out, you see.
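The "multiply uncertainties, then take the log" step can be made concrete. In this sketch the number of distinguishable velocity states per molecule is an illustrative assumption, not a physical value:

```python
import math

states_per_molecule = 16   # illustrative: distinguishable velocity states per molecule
n_molecules = 10

# Independent uncertainties multiply, so the joint uncertainty is exponential
# in the number of molecules...
joint_states = states_per_molecule ** n_molecules

# ...and taking the logarithm turns that exponential volume into an entropy
# that simply adds across molecules.
entropy = math.log2(joint_states)
assert entropy == n_molecules * math.log2(states_per_molecule)

print(entropy)  # 40.0 bits for the whole (toy) glass
```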

The con­nec­tion in the other di­rec­tion is less ob­vi­ous. Sup­pose there was a glass of wa­ter, about which, ini­tially, you knew only that its tem­per­a­ture was 72 de­grees. Then, sud­denly, Saint Laplace re­veals to you the ex­act lo­ca­tions and ve­loc­i­ties of all the atoms in the wa­ter. You now know perfectly the state of the wa­ter, so, by the in­for­ma­tion-the­o­retic defi­ni­tion of en­tropy, its en­tropy is zero. Does that make its ther­mo­dy­namic en­tropy zero? Is the wa­ter colder, be­cause we know more about it?

Ig­nor­ing quan­tum­ness for the mo­ment, the an­swer is: Yes! Yes it is!

Maxwell once asked: Why can’t we take a uniformly hot gas, and par­ti­tion it into two vol­umes A and B, and let only fast-mov­ing molecules pass from B to A, while only slow-mov­ing molecules are al­lowed to pass from A to B? If you could build a gate like this, soon you would have hot gas on the A side, and cold gas on the B side. That would be a cheap way to re­friger­ate food, right?

The agent who in­spects each gas molecule, and de­cides whether to let it through, is known as “Maxwell’s De­mon”. And the rea­son you can’t build an effi­cient re­friger­a­tor this way, is that Maxwell’s De­mon gen­er­ates en­tropy in the pro­cess of in­spect­ing the gas molecules and de­cid­ing which ones to let through.

But sup­pose you already knew where all the gas molecules were?

Then you ac­tu­ally could run Maxwell’s De­mon and ex­tract use­ful work.

So (again ig­nor­ing quan­tum effects for the mo­ment), if you know the states of all the molecules in a glass of hot wa­ter, it is cold in a gen­uinely ther­mo­dy­namic sense: you can take elec­tric­ity out of it and leave be­hind an ice cube.

This doesn’t vi­o­late Liou­ville’s The­o­rem, be­cause if Y is the wa­ter, and you are Maxwell’s De­mon (de­noted M), the phys­i­cal pro­cess be­haves as:

M1,Y1 → M1,Y1
M2,Y2 → M2,Y1
M3,Y3 → M3,Y1
M4,Y4 → M4,Y1

Be­cause Maxwell’s de­mon knows the ex­act state of Y, this is mu­tual in­for­ma­tion be­tween M and Y. The mu­tual in­for­ma­tion de­creases the joint en­tropy of (M,Y): H(M,Y) = H(M) + H(Y) - I(M;Y). M has 2 bits of en­tropy, Y has two bits of en­tropy, and their mu­tual in­for­ma­tion is 2 bits, so (M,Y) has a to­tal of 2 + 2 − 2 = 2 bits of en­tropy. The phys­i­cal pro­cess just trans­forms the “cold­ness” (ne­gen­tropy) of the mu­tual in­for­ma­tion to make the ac­tual wa­ter cold—af­ter­ward, M has 2 bits of en­tropy, Y has 0 bits of en­tropy, and the mu­tual in­for­ma­tion is 0. Noth­ing wrong with that!
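The arithmetic H(M,Y) = H(M) + H(Y) - I(M;Y) can be checked from the joint distribution. Here the demon's record is modeled as perfectly correlated with the water's four possible states:

```python
import math

def H(dist):
    """Shannon entropy in bits of a probability distribution (dict of probs)."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# Joint distribution over (M, Y): the demon's record M always matches Y,
# so only 4 of the 16 conceivable joint states carry probability.
joint = {(i, i): 0.25 for i in range(1, 5)}

pM, pY = {}, {}                      # marginal distributions
for (m, y), p in joint.items():
    pM[m] = pM.get(m, 0.0) + p
    pY[y] = pY.get(y, 0.0) + p

mutual = H(pM) + H(pY) - H(joint)    # I(M;Y)
print(H(pM), H(pY), H(joint), mutual)  # 2.0 2.0 2.0 2.0
```

So the joint system carries only 2 bits of entropy even though each subsystem carries 2 bits on its own, just as the paragraph above computes.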

And don’t tell me that knowl­edge is “sub­jec­tive”. Knowl­edge has to be rep­re­sented in a brain, and that makes it as phys­i­cal as any­thing else. For M to phys­i­cally rep­re­sent an ac­cu­rate pic­ture of the state of Y, M’s phys­i­cal state must cor­re­late with the state of Y. You can take ther­mo­dy­namic ad­van­tage of that—it’s called a Szilard en­g­ine.

Or as E.T. Jaynes put it, “The old adage ‘knowl­edge is power’ is a very co­gent truth, both in hu­man re­la­tions and in ther­mo­dy­nam­ics.”

And con­versely, one sub­sys­tem can­not in­crease in mu­tual in­for­ma­tion with an­other sub­sys­tem, with­out (a) in­ter­act­ing with it and (b) do­ing ther­mo­dy­namic work.

Other­wise you could build a Maxwell’s De­mon and vi­o­late the Se­cond Law of Ther­mo­dy­nam­ics—which in turn would vi­o­late Liou­ville’s The­o­rem—which is pro­hibited in the stan­dard model of physics.

Which is to say: To form ac­cu­rate be­liefs about some­thing, you re­ally do have to ob­serve it. It’s a very phys­i­cal, very real pro­cess: any ra­tio­nal mind does “work” in the ther­mo­dy­namic sense, not just the sense of men­tal effort.

(It is some­times said that it is eras­ing bits in or­der to pre­pare for the next ob­ser­va­tion that takes the ther­mo­dy­namic work—but that dis­tinc­tion is just a mat­ter of words and per­spec­tive; the math is un­am­bigu­ous.)

(Dis­cov­er­ing log­i­cal “truths” is a com­pli­ca­tion which I will not, for now, con­sider—at least in part be­cause I am still think­ing through the ex­act for­mal­ism my­self. In ther­mo­dy­nam­ics, knowl­edge of log­i­cal truths does not count as ne­gen­tropy; as would be ex­pected, since a re­versible com­puter can com­pute log­i­cal truths at ar­bi­trar­ily low cost. All this that I have said is true of the log­i­cally om­ni­scient: any lesser mind will nec­es­sar­ily be less effi­cient.)

“Form­ing ac­cu­rate be­liefs re­quires a cor­re­spond­ing amount of ev­i­dence” is a very co­gent truth both in hu­man re­la­tions and in ther­mo­dy­nam­ics: if blind faith ac­tu­ally worked as a method of in­ves­ti­ga­tion, you could turn warm wa­ter into elec­tric­ity and ice cubes. Just build a Maxwell’s De­mon that has blind faith in molecule ve­loc­i­ties.

Eng­ines of cog­ni­tion are not so differ­ent from heat en­g­ines, though they ma­nipu­late en­tropy in a more sub­tle form than burn­ing gasoline. For ex­am­ple, to the ex­tent that an en­g­ine of cog­ni­tion is not perfectly effi­cient, it must ra­di­ate waste heat, just like a car en­g­ine or re­friger­a­tor.

“Cold ra­tio­nal­ity” is true in a sense that Hol­ly­wood scriptwrit­ers never dreamed (and false in the sense that they did dream).

So un­less you can tell me which spe­cific step in your ar­gu­ment vi­o­lates the laws of physics by giv­ing you true knowl­edge of the un­seen, don’t ex­pect me to be­lieve that a big, elab­o­rate clever ar­gu­ment can do it ei­ther.

• Another solid es­say.

To form ac­cu­rate be­liefs about some­thing, you re­ally do have to ob­serve it.

How do we model the fact that I know the Uni­verse was in a spe­cific low-en­tropy state (space­time was flat) shortly af­ter the Big Bang? It’s a small re­gion in the phase space, but I don’t have enough bits of ob­ser­va­tions to di­rectly pick that re­gion out of all the points in phase space.

• How do the laws of thermodynamics apply to the solar thermodynamic panels now used in the renewable industry?

It is seen as a revolutionary thermodynamic system that has changed the game for the rest of the solar panel technology.

I found these guys on the internet when doing some research for my thesis on domestic implementations of thermodynamics into heating water. http://www.sks-thermo.co.uk

• Without dis­count­ing the pre­dic­tive power of the sec­ond law, my con­fi­dence in our un­der­stand­ing of its phys­i­cal ba­sis has been se­ri­ously re­duced. This af­ter view­ing a re­cent se­ries of talks held at MIT, Meet­ing the En­tropy Challenge.

Of par­tic­u­lar in­ter­est were the dis­cus­sions about the lack of an es­tab­lished defi­ni­tion of en­tropy.

• the more un­cer­tain you are about the ve­loc­ity (not just speed)

Isn’t speed the same as ve­loc­ity?

What is phase space? Is it the same as state space? You didn’t define it.

Does that make its ther­mo­dy­namic en­tropy zero? Is the wa­ter colder, be­cause we know more

Ig­nor­ing quan­tum­ness for the mo­ment, the an­swer is: Yes! Yes it is!

I guess that if you stick your finger into the wa­ter it will still get burned, am I wrong?

And con­versely, one sub­sys­tem can­not in­crease in mu­tual in­for­ma­tion with an­other sub­sys­tem,

with­out (a) in­ter­act­ing with it and (b) do­ing ther­mo­dy­namic work.

It is not en­tirely clear to me how you ar­rived at this con­clu­sion.

• Your finger will not get burned; it will suffer the cu­mu­la­tive dam­age re­sult­ing from an un­usu­ally high quan­tity of un­re­lated high-speed molecule at­tacks.

• You could imag­ine a par­ti­cle gun that shoots wa­ter molecules with the ex­act same speed dis­tri­bu­tion as hot wa­ter (care­fully al­igned so they don’t col­lide mid-beam), but all with the same di­rec­tion—straight to­wards you.

The re­sult of stick­ing your hand in such a beam would be roughly the same as putting it in hot wa­ter, ig­nor­ing the asym­met­ric mo­men­tum trans­fer. How­ever, it is easy to see that you can ex­tract use­ful en­ergy from the beam.

• Another per­ti­nent ex­am­ple might be this: a metal shaft could spin so fast that its atoms’ ve­loc­ity dis­tri­bu­tion could be the same as that of the (hot­ter!) gaseous form of the same metal. Yet the spin­ning of the shaft does not evap­o­rate the metal.

Why? Be­cause, to a typ­i­cal ob­server of the shaft, its de­grees of free­dom are sig­nifi­cantly more con­strained. So, since the ob­server knows more about the shaft (that its atoms are in a solid lat­tice that moves in rel­a­tive uni­son), that makes the shaft colder—and it al­lows you to ex­tract more me­chan­i­cal work from the shaft than if it were a hot gas with the same av­er­age par­ti­cle ve­loc­i­ties!

• That is not in general possible. The speed at radius r is v = w r. Taking an arbitrary axisymmetric mass distribution rho(r), we have a distribution of mass at speed v = w r that is U(r) = 2 pi h r^2 rho(r), and U(v) = 2 pi h (v^2 / w^2) rho(v/w). A monatomic gas at temperature T has a kinetic energy distribution of 2 sqrt(E / (pi (kT)^3)) exp(-E/kT) (dE), and a speed distribution of sqrt(2 (m/kT)^3 / pi) v^2 exp(-m v^2 / (2 kT)) (dv). By carefully tailoring rho(r) to an exponential, you can match this distribution (up to some finite cut-off, of course), at one specific match of angular speed w and temperature T.

This, of course, only matches speeds, not velocities, which will be a three-dimensional distribution. For this spinning shaft, of course, v_z = 0.
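Taking the relation U(v) ∝ v^2 rho(v/w) from the comment above at face value, the tailoring can be checked numerically. Everything here is in natural units (m = k = T = 1), with a Gaussian radial density as the assumed tailoring:

```python
import math

w = 2.0                                  # angular speed of the shaft (arbitrary)
vs = [0.01 * i for i in range(1, 500)]   # speed grid

def rho(r):
    # Assumed tailored radial mass density: a Gaussian in (w * r)
    return math.exp(-((w * r) ** 2) / 2)

# Shaft speed-distribution shape, U(v) ~ v^2 * rho(v / w), per the comment above
shaft = [v ** 2 * rho(v / w) for v in vs]
# Maxwell-Boltzmann speed-distribution shape, ~ v^2 * exp(-v^2 / 2)
maxwell = [v ** 2 * math.exp(-(v ** 2) / 2) for v in vs]

Zs, Zm = sum(shaft), sum(maxwell)        # normalize both on the grid
max_err = max(abs(s / Zs - mb / Zm) for s, mb in zip(shaft, maxwell))
print(max_err < 1e-9)  # True: the two normalized shapes coincide
```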

• For a gas in a mass-sealed con­tainer, av­er­age vx, vy, vz are also zero, just as in the shaft, so this triv­ially matches.

You’re cor­rect that I should have said speeds, and I should have men­tioned that the shaft re­quires a spe­cial shape/​den­sity dis­tri­bu­tion, but the point stands: The molec­u­lar prop­erty dis­tri­bu­tion doesn’t by it­self tell you how cold some­thing is, or how much en­ergy can be ex­tracted—it’s also rele­vant how its de­grees of free­dom are con­strained, which shows how your knowl­edge of (mu­tual in­for­ma­tion with) the sys­tem mat­ters.

• It’s a re­ally great point; thank you.

• For a gas in a mass-sealed con­tainer, av­er­age vx, vy, vz are also zero, just as in the shaft, so this triv­ially matches.

Yes, but the average tells you little that’s meaningful: it’s equivalent to the overall velocity of the gas as a whole.

I agree with your over­all point. Tem­per­a­ture is not di­rectly a prop­erty of the sys­tem, but of how we can char­ac­ter­ize it, in­clud­ing what con­straints there are on it. I just think that this wasn’t a great ex­am­ple for that point of view, pre­cisely be­cause you claimed agree­ment of dis­tri­bu­tions that doesn’t ex­ist.

A better way to explain this uses the shaft. A shaft has the very strong constraint that position and velocity are perfectly correlated. This constraint lets us extract virtually all the kinetic energy out. A gas that had the same distribution of velocities, but lacked this constraint, would be very hard to extract useful energy from. No simple arrangement could do much better than treating it as a thermalized gas with the same average kinetic energy (and it would quickly evolve so that the velocity distribution would match this).

• I just think that this wasn’t a great ex­am­ple for that point of view, pre­cisely be­cause you claimed agree­ment of dis­tri­bu­tions that doesn’t ex­ist.

Sure it does—it just re­quires a spe­cial shaft shap­ing/​ma­te­rial. Hence the “a metal shaft could …”

Other­wise, we’re in agree­ment.

• I would call that ‘burned’. If I call ‘stand­ing out­side get­ting hit by lots of UV light’ sun­burned then it seems fair to call get­ting hit by lots of high speed wa­ter molecules burned too.

• “Isn’t speed the same as ve­loc­ity?”

Nope, speed is a scalar, while ve­loc­ity is a vec­tor.

• Eliezer, I was go­ing to point out that you never defined “phase space”, but Roland beat me to it. It’s a small hole in an oth­er­wise ex­cel­lent post.

Roland,

Eliezer is us­ing the physics jar­gon mean­ings of speed and ve­loc­ity: speed is a mag­ni­tude, a raw num­ber; ve­loc­ity is mag­ni­tude and di­rec­tion to­gether. A car might be trav­el­ling at a speed of 65 mph; if you in­clude a di­rec­tion, e.g., 65 mph east, then you’ve got its ve­loc­ity.
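The distinction amounts to a vector versus its magnitude; a short sketch of the 65 mph example:

```python
import math

v_east = (65.0, 0.0)     # velocity: 65 mph heading east
v_north = (0.0, 65.0)    # velocity: 65 mph heading north

# Same speed (the vector's magnitude), different velocities (direction differs).
speed = math.hypot(*v_east)
print(speed, math.hypot(*v_north) == speed, v_east == v_north)  # 65.0 True False
```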

• Phase space from Wikipe­dia.

• Added defi­ni­tion of phase space.

I guess that if you stick your finger into the wa­ter it will still get burned, am I wrong?

Only if you’re silly enough to stick in your finger at the wrong mo­ment. Stick in your finger at ex­actly the right mo­ment, and your finger will get colder while the wa­ter gets hot­ter—be­cause you’ve timed it so that all the molecules next to your finger hap­pen to be mov­ing very slowly.

Of course you will usu­ally have to wait so long for the right mo­ment that all the pro­tons evap­o­rate be­fore you see a chance to stick your finger in. And of course the trick re­lies on your know­ing the ex­act be­hav­ior of the wa­ter, so that it has no en­tropy re­gard­less of its tem­per­a­ture.

• I guess that if you stick your finger into the wa­ter it will still get burned, am I wrong?

Only if you’re silly enough to stick in your finger at the wrong mo­ment.

Ok, I think we have a problem of definition here. You said the water got colder by a thermodynamic definition. But you agree that if I take a thermometer and insert it into the water and leave it there for a while it will still indicate ‘hot water’. Right?

What I don’t understand is your thermodynamic definition of colder. And I’m no physicist. Btw, I understand that with the information about the velocities you have the power to make the water colder, but that doesn’t mean that it actually will get colder (at least not right now).

• Velocity is the same as speed, where speed or velocity is a scalar which has magnitude (size) only. A rate of change of speed or velocity however is acceleration, and is a vector, which has both magnitude (size) and direction, but don’t confuse angular momentum with direction; look into angular momentum. You can also have constant acceleration, which is not to be confused with constant velocity, which is in no way acceleration. There is also a rate of change of acceleration, and a constant for the rate of change of acceleration...

• Good post, lays it down very nicely. Quick ques­tion:

Why is it that you can’t turn warm wa­ter into ice cubes and elec­tric­ity, but re­versible com­put­ing can use an ar­bi­trar­ily small amount of en­ergy? My guess is that com­put­ing (logic?) must be fun­da­men­tally differ­ent from work in this sense. Logic is, in a sense, ‘already there,’ whereas work re­quires en­ergy.

• “But sup­pose you already knew where all the gas molecules were?”

I assume by this you mean I have exact knowledge of position and momentum. Why should I suppose a scenario that is contrary to what I know of the uncertainty principle? Can’t I reject it for the same reasons you reject the possibility of waking up with a purple tentacle?

• I sug­gest a lot of cau­tion in think­ing about how en­tropy ap­pears in ther­mo­dy­nam­ics and in­for­ma­tion the­ory. All of statis­ti­cal me­chan­ics is based on the con­cept of en­ergy, which has no analogue in in­for­ma­tion the­ory. Some peo­ple would sug­gest that for this rea­son the two quan­tities should not be called by the same term.

the “tem­per­a­ture” isn’t a uniform speed of all the molecules, it’s an av­er­age speed of the molecules, which in turn cor­re­sponds to a pre­dictable statis­ti­cal dis­tri­bu­tion of speeds

I as­sume you know this, but some read­ers may not: tem­per­a­ture is not ac­tu­ally equiv­a­lent to en­ergy/​speed, but rather to the deriva­tive of en­tropy with re­spect to en­ergy:

1/T = dS/dE

This is why we observe temperature equilibration: two systems in thermal contact trade energy to maximize the net entropy of the ensemble. Thus in equilibrium a small shift in energy from one system to the other must not change the ensemble entropy ==> the temperatures of the systems must be equal.

In al­most all real sys­tems, tem­per­a­ture and en­ergy are mono­ton­i­cally re­lated, so you won’t go too far astray by think­ing of tem­per­a­ture as en­ergy. How­ever, in the­ory one can imag­ine sys­tems that are forced into a smaller num­ber of states as their en­er­gies in­crease (dS/​dE < 0) and so in fact have nega­tive tem­per­a­ture:

http://en.wikipedia.org/wiki/Negative_temperature
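The equilibration argument above can be sketched numerically. The entropy functions S_i(E) = N_i ln(E) and all the numbers here are illustrative assumptions, chosen only so that 1/T = dS/dE is easy to compute:

```python
import math

N1, N2, E_total = 3.0, 5.0, 100.0   # illustrative system "sizes" and total energy

def total_entropy(E1):
    # Toy entropies S_i(E) = N_i * ln(E); system 2 gets whatever energy is left.
    return N1 * math.log(E1) + N2 * math.log(E_total - E1)

# Brute-force scan for the energy split that maximizes total entropy
E1_best = max((0.01 * k for k in range(1, 10000)), key=total_entropy)

# For S = N ln E, the inverse temperature 1/T = dS/dE = N / E.
invT1 = N1 / E1_best
invT2 = N2 / (E_total - E1_best)

# At the entropy maximum the two inverse temperatures agree (both 0.08 here).
print(round(E1_best, 2), round(invT1, 4), round(invT2, 4))
```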

• Eliezer Yud­kowsky (I’ll re­move the un­der­scores from now on): I en­joyed read­ing this, but I don’t quite un­der­stand all the refer­ences to the im­pos­si­bil­ity of turn­ing warm wa­ter into elec­tric­ity and ice cubes: You don’t need ex­tra in­for­ma­tion or vi­o­la­tion of the laws of physics to do this. You could run a heat en­g­ine off the tem­per­a­ture differ­ence be­tween the warm wa­ter and the en­vi­ron­ment, and have that work drive a re­friger­a­tor. It’s just that you couldn’t ex­ploit this phe­nomenon to make a per­pet­ual mo­tion ma­chine.

I prob­a­bly missed an im­plicit (or ex­plicit!) qual­ifi­ca­tion of all that, and if so, reader new to ther­mo­dy­nam­ics can just take this as a clar­ifi­ca­tion.

• Will Pear­son, you can in­deed sum­mar­ily re­ject the pos­si­bil­ity, and that’s why I kept say­ing, “ig­nor­ing quan­tum”. For quan­tum­ness, you would need a to­tal de­scrip­tion of the quan­tum state of the wa­ter, and this you can never ob­tain by any phys­i­cal means. Though this is true even un­der clas­si­cal me­chan­ics: The third law of ther­mo­dy­nam­ics, on the im­pos­si­bil­ity of ob­tain­ing ab­solute zero, im­plies the im­pos­si­bil­ity of ob­tain­ing in­finite-pre­ci­sion knowl­edge.

Silas, ’twas ex­plicit: I said “sealed box”.

• I don’t mind peo­ple ig­nor­ing el­e­ments of sci­ence when they are not im­por­tant. e.g. ig­nor­ing gen­eral rel­a­tivity when calcu­lat­ing ball tra­jec­to­ries.

But molecules and atoms are very much in the quan­tum realm. So it seemed to me to be like say­ing, ig­nor­ing spe­cial rel­a­tivity, when things are trav­el­ling faster than the speed of light then this anal­ogy holds, from this we can con­clude blah. To me it seems un­likely to hold any in­sights.

I don’t see why I should ac­cept any con­clu­sion drawn from the premises if I do not hold with the premises. But this brings up an in­ter­est­ing point, when is it valid to ig­nore data? Is it ever?

To me your first point seems el­e­men­tary as I don’t have good ev­i­dence for pyschic pow­ers, and can be de­rived from the un­cer­tainty prin­ci­ple or prob­a­bly prefer­ably from the no clon­ing the­o­rem. Your sec­ond would be bet­ter de­rived from quan­tum physics as well by show­ing a min­i­mum en­ergy re­quired for a bit flip, if it can.

• Maxwell’s de­mon is ruled out by in­for­ma­tion the­ory. That’s not quite the same thing as say­ing that it’s Bayesian.

• “Is the wa­ter colder, be­cause we know more about it? …Yes! Yes it is!”

You’re kid­ding, right? Know­ing some­thing about a sys­tem doesn’t change the sys­tem (ne­glect­ing quan­tum, of course). The statis­ti­cal way to define en­tropy (as you men­tioned) is the log of the num­ber of microstates. The fact that you know all the tra­jec­to­ries/​po­si­tions couldn’t mat­ter less to the glass of wa­ter, the only thing that mat­ters is (us­ing your jar­gon) the phase space vol­ume it oc­cu­pies.

Reshape the space for a second. Call it 6-D, with each particle a point, instead of 6N-D. Now the entropy would correspond to the volume actually occupied in 6-D space, rather than the possible volume among which your single point can choose.

With the sin­gle point, you get sucked into the fal­lacy that be­cause you know where the point is at one time, that’s the only pos­si­ble lo­ca­tion it can have, and you’re tricked into be­liev­ing the en­tropy is much smaller than it is.

Statis­ti­cal physics as­sumes ex­act par­ti­cle tra­jec­to­ries are ran­dom and un­know­able, al­though this was never be­lieved to be fun­da­men­tal. It was just a con­ve­nient way to ig­nore things no­body cared about, and take av­er­ages. Restrict­ing your­self to that one point in phase space, you vi­o­late that as­sump­tion.

• The fact that you know all the tra­jec­to­ries/​po­si­tions couldn’t mat­ter less to the glass of wa­ter, the only thing that mat­ters is (us­ing your jar­gon) the phase space vol­ume it oc­cu­pies.

What, pre­cisely, do you think it means for a statis­ti­cally viewed sys­tem to “oc­cupy a vol­ume of phase space”? When you talk about the “num­ber of microstates”, what ex­actly do you think you are count­ing?

• I think we may have to Ta­boo the word “cold” in this post­ing. As I un­der­stand it, an ob­ject is colder than an­other if it has a lower tem­per­a­ture than an­other (in other words, the av­er­age ki­netic en­ergy of its molecules is lower than those of an­other ob­ject). There­fore, know­ing the ex­act po­si­tion and ve­loc­ity of all the (clas­si­cal) molecules in an ideal gas doesn’t make the gas “colder” un­til you ac­tu­ally DO use Maxwell’s De­mon on it. Say­ing that an ob­ject that you know enough about to use Maxwell’s De­mon on is “colder” than an­other con­flicts with my un­der­stand­ing of the word. It’s not ac­tu­ally colder, it’s only “po­ten­tially colder” (by anal­ogy to po­ten­tial en­ergy), if that makes any sense.

Yes, we’re ar­gu­ing about words, but that’s be­cause we’re get­ting con­fused. :(

• two sys­tems in ther­mal con­tact trade en­ergy to max­i­mize the net en­tropy of the en­sem­ble.

Ac­tu­ally the as­sump­tion is that two sys­tems in ther­mal con­tact come to some equil­ibrium state.

Let this equil­ibrium state max­i­mize some­thing, call it S, and use calcu­lus.

En­ergy is con­served.

Therefore the energy change in one system equals minus the energy change in the other, and the change in S wrt the energy change in each system has to be equal in both systems at the maximum of total S.

Call that change wrt en­ergy the (in­verse) tem­per­a­ture. Two sys­tems in ther­mal con­tact come to the same tem­per­a­ture, is then what the as­sump­tion of some equil­ibrium of some­thing comes to, af­ter you re­name the deriva­tives.

Only the as­sump­tion of an equil­ibrium has been in­tro­duced to get this.

That’s where
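The derivation in the comment above, written out as a quick sketch:

```latex
S_{\mathrm{tot}} = S_1(E_1) + S_2(E_2), \qquad E_1 + E_2 = E \ (\text{constant})
% Energy conservation gives dE_2 = -dE_1, so at the maximum of S_tot:
\frac{dS_{\mathrm{tot}}}{dE_1} = \frac{dS_1}{dE_1} - \frac{dS_2}{dE_2} = 0
\quad\Longrightarrow\quad
\frac{dS_1}{dE_1} = \frac{dS_2}{dE_2} \equiv \frac{1}{T}
```

Renaming the derivative dS/dE as the inverse temperature turns "the equilibrium maximizes something" into "two systems in thermal contact come to the same temperature," as the comment says.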

• You lost me there.

1) If Alice and Bob observe the system in your first example, and Alice decides to keep track precisely of X's possible states while Bob just says "2-8", the entropy of X+Y is 2 bits for Alice and 2.8 for Bob. Isn't entropy a property of the system, not the observer? (This is the problem with "subjectivity": of course knowledge is physical, it's just that it depends on the observer and the observed system instead of just the system.)

2) If Alice knows all the molecules' positions and velocities, a thermometer will still display the same number; if she calculates the average speed of the molecules, she will find this same number; if she sticks her finger in the water at a random moment, she should expect to feel the same thing Bob, who just knows the water's temperature, does. How is the water colder? Admittedly, Alice could make it colder (and extract electricity), but she doesn't have to.
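Where the 2 and 2.8 bits come from, as a minimal sketch (assuming both observers treat the remaining possibilities as equally likely): the entropy of a uniform distribution over n microstates is log2(n) bits.

```python
from math import log2

def entropy_bits(n_states):
    """Entropy, in bits, of a uniform distribution over n_states microstates."""
    return log2(n_states)

# Alice tracks X's possible states exactly: 4 remaining joint
# configurations of X+Y, so log2(4) = 2 bits.
alice = entropy_bits(4)

# Bob just lumps X into "somewhere in 2-8": 7 possible states,
# log2(7) = 2.807... bits.
bob = entropy_bits(7)

print(alice, bob)
```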

• Isn’t en­tropy a prop­erty of the sys­tem, not the ob­server?

Nope. It’s a prop­erty of the ob­server, but one that be­haves in such a lawful and in­escapable way that it seems to you like a prop­erty of the sys­tem.

Your ig­no­rance of next week’s win­ning lot­tery num­bers is a prop­erty of you, not just a prop­erty of the lot­tery balls, but good luck on ig­nor­ing your ig­no­rance.

Some­one el­se­where said: Al­most all the time, I stick with this idea: Tem­per­a­ture of a gas is the mean ki­netic en­ergy of its molecules.

Aren’t there vibra­tional de­grees of free­dom that also con­tribute to ki­netic en­ergy, and isn’t that why differ­ent ma­te­ri­als have differ­ent spe­cific heats? I.e., what mat­ters is ki­netic en­ergy per de­gree of free­dom, not ki­netic en­ergy per molecule? So you ac­tu­ally do have to think about a molecule (not just mea­sure its ki­netic en­ergy per se) to de­ter­mine what its tem­per­a­ture is (which di­rec­tion heat will flow in, com­pared to an­other ma­te­rial), even if you know the to­tal amount of heat—putting the same amount of heat into a kilo of wa­ter or a kilo of iron will yield differ­ent “tem­per­a­tures”.
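The water-vs-iron point can be made numeric with handbook specific heats (the values and the 10 kJ figure below are just illustrative choices, not from the thread):

```python
# Same heat into a kilo of water vs a kilo of iron yields different
# temperature rises, because the energy is shared among different
# numbers of degrees of freedom per kilogram. Specific heats are the
# usual handbook values in J/(kg*K).
C_WATER = 4186.0
C_IRON = 449.0

def delta_T(heat_joules, mass_kg, specific_heat):
    """Temperature rise from adding heat_joules to the given mass."""
    return heat_joules / (mass_kg * specific_heat)

Q = 10_000.0  # 10 kJ into each sample
print(delta_T(Q, 1.0, C_WATER))  # ~2.4 K for water
print(delta_T(Q, 1.0, C_IRON))   # ~22.3 K for iron
```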

But the more im­por­tant point: Sup­pose you’ve got an iron fly­wheel that’s spin­ning very rapidly. That’s definitely ki­netic en­ergy, so the av­er­age ki­netic en­ergy per molecule is high. Is it heat? That par­tic­u­lar ki­netic en­ergy, of a spin­ning fly­wheel, doesn’t look to you like heat, be­cause you know how to ex­tract most of it as use­ful work, and leave be­hind some­thing colder (that is, with less mean ki­netic en­ergy per de­gree of free­dom).

If you know the po­si­tions and speeds of all the el­e­ments in a sys­tem, their mo­tion stops look­ing like heat, and starts look­ing like a spin­ning fly­wheel—us­able ki­netic en­ergy that can be ex­tracted right out.

• But the more im­por­tant point: Sup­pose you’ve got an iron fly­wheel that’s spin­ning very rapidly. That’s definitely ki­netic en­ergy, so the av­er­age ki­netic en­ergy per molecule is high. Is it heat? That par­tic­u­lar ki­netic en­ergy, of a spin­ning fly­wheel, doesn’t look to you like heat, be­cause you know how to ex­tract most of it as use­ful work, and leave be­hind some­thing colder (that is, with less mean ki­netic en­ergy per de­gree of free­dom).

Systems in thermal contact (by radiation if nothing else) come to the same temperature. That makes it pretty objective whether it's heat or not, if one of the systems is a thermometer.

• To form ac­cu­rate be­liefs about some­thing, you re­ally do have to ob­serve it.

Does this not con­fuse ac­cu­rate be­lief with knowl­edge? Leav­ing aside doubts about whether jus­tified ac­cu­rate be­lief is suffi­cient for knowl­edge (e.g., the Get­tier prob­lem), there is cer­tainly more to knowl­edge than just ac­cu­rate be­lief, and while I ac­cept your state­ment for knowl­edge, it does not seem true for mere ac­cu­rate be­lief.

I sup­pose the is­sue hinges on—and per­haps this is your point—whether ac­cu­rate means prob­a­bil­ity of be­ing cor­rect or whether it turns out to have been cor­rect. On the sec­ond ac­count—which is the com­mon mean­ing of ac­cu­rate—the lotto player who be­lieves she will win and ac­tu­ally does win has an ac­cu­rate be­lief be­fore she wins, though she is of course not jus­tified in hav­ing that be­lief. In terms of the first sense of ac­cu­rate, she is not ac­cu­rate at all, but I think you’ll have a much harder time try­ing to con­vince peo­ple of that than you would if you used knowl­edge in­stead of ac­cu­rate be­lief. The man on the street will not ac­cept that if he be­lieves the Gi­ants will win on Sun­day and they ac­tu­ally win that his be­lief was nev­er­the­less not ac­cu­rate, while he’ll eas­ily ac­knowl­edge that he didn’t re­ally know they would win.

• Joseph Knecht:

The prob­lem with your ar­gu­ment is that jus­tifi­ca­tion is cheap, while ac­cu­racy is ex­pen­sive. The canon­i­cal ex­am­ples of “un­jus­tified” be­liefs in­volve mis-cal­ibra­tion, but cal­ibra­tion is easy to cor­rect just by mak­ing one’s be­liefs va­guer and less pre­cise. Taken to the ex­treme, a max­i­mum-en­tropy prob­a­bil­ity dis­tri­bu­tion is perfectly cal­ibrated, but it adds zero bits of mu­tual in­for­ma­tion with the en­vi­ron­ment.
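The calibration-vs-information distinction can be sketched with the standard discrete mutual-information formula; the toy forecaster setup below is an illustrative assumption, not something from the thread.

```python
from math import log2

def mutual_information(joint):
    """I(X;Y) in bits, from a joint distribution {(x, y): p}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(p * log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

# A maximum-entropy forecaster: the answer A is 50/50 regardless of the
# environment E, so A and E are independent and I(A;E) = 0, even though
# the 50/50 forecast is perfectly calibrated.
uniform_joint = {(e, a): 0.25 for e in (0, 1) for a in (0, 1)}
print(mutual_information(uniform_joint))   # 0.0

# By contrast, a forecaster whose answer tracks the environment exactly
# extracts a full bit.
correlated_joint = {(0, 0): 0.5, (1, 1): 0.5}
print(mutual_information(correlated_joint))  # 1.0
```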

• anony­mous:

I don’t see how your re­sponse ad­dresses my con­cern that say­ing ac­cu­rate be­lief re­quires ob­ser­va­tion im­plies un­ac­cept­able con­se­quences for the man on the street, such as that his cor­rect be­lief that the Gi­ants would win on Sun­day is nev­er­the­less not an ac­cu­rate be­lief.

• Eliezer, you seem awfully close to Shal­izi’s para­dox. Could you ad­dress it?

• Un­less I’m miss­ing some­thing, Shal­izi usu­ally makes more sense than this.

1) Mea­sure­ments use work (or at least era­sure in prepa­ra­tion for the next mea­sure­ment uses work). They do not sim­ply mag­i­cally re­duce our un­cer­tainty with­out ther­mo­dy­namic cost. Even if you mea­sure and never erase, the mea­sur­ing sys­tem must be in a pre­pared state, can­not be used again, and still pro­duces en­tropy if you are not op­er­at­ing at ab­solute zero /​ in­finite pre­ci­sion, which you can’t do (third law of ther­mo­dy­nam­ics).

2) Be­cause we are not log­i­cally om­ni­scient, we lose in­for­ma­tion we already have as the re­sult of not be­ing will­ing to ex­pend the com­pu­ta­tional cost of fol­low­ing ev­ery atom. Liou­ville’s The­o­rem pre­serves a vol­ume of prob­a­bil­ity but it can get awfully squig­gly, so if you pre­serve a sim­ple bound­ary around your un­cer­tainty, it gets larger.

3) Quan­tum uni­verses branch both ways and cre­ate new un­cer­tainty in their branched agents.

Done.
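Point 2 can be watched happening in a toy system: Arnold's cat map on the unit torus is area-preserving (Jacobian determinant 1), so the true phase-space volume of a blob is conserved, yet a simple boundary drawn around the blob balloons as it gets squiggly. A minimal sketch:

```python
import random

# Arnold's cat map on the unit torus: (x, y) -> (2x + y, x + y) mod 1.
# Determinant of the Jacobian is 1, so true area is preserved, but the
# blob is stretched and folded each step.
def cat_map(x, y):
    return (2 * x + y) % 1.0, (x + y) % 1.0

def bbox_area(points):
    """Area of a simple axis-aligned boundary around the cloud."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))

random.seed(0)
# A tiny square of uncertainty: true area 0.01 * 0.01 = 1e-4.
cloud = [(random.uniform(0.4, 0.41), random.uniform(0.4, 0.41))
         for _ in range(2000)]

area_before = bbox_area(cloud)   # ~1e-4
for _ in range(6):
    cloud = [cat_map(x, y) for x, y in cloud]
area_after = bbox_area(cloud)    # the simple boundary has ballooned

print(area_before, area_after)
```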

• If you have the time I would be in­ter­ested in see­ing a math­e­mat­i­cal de­scrip­tion of a sys­tem that in­creases its mu­tual in­for­ma­tion with the en­vi­ron­ment, with the to­tal en­tropy of the sys­tem+en­vi­ron­ment in­creas­ing.

• I enjoyed this article. As several commenters have suggested, it seems not just counterintuitive but actually non-physical to say that the warm water has become colder just because I know more about it. The subjective nature of entropy that this seems to imply is absurd. As has been pointed out, a stationary thermometer in the water will show the same reading after my visit from Saint Laplace as it did before.

I think the problem is resolved if we consider our system boundary to include not just the water, but the observer as well. The entropy of the water has not changed because I have more information, but the total entropy of the water-observer system can be considered to have decreased because of the mutual information that has been magically added (by Saint Laplace).

To continue the analogy, it is as though my brain now contains negentropy that has been arranged to exactly cancel the entropy of the water, making our combined net entropy smaller. The entropy of the water itself has not changed and the effect is not subjective. It arises only if we consider me (the observer) with my magical cargo of negentropy as part of the system. Should I choose to put my knowledge to use as an avatar of Maxwell's Demon, then I can actually lower the entropy of the water (by taking it into myself). If, however, I walk away and do nothing to the water based on my knowledge, then the entropy of the water itself remains just as it was. (I, however, have been lastingly changed by my encounter with Laplace.)

• Shal­izi usu­ally makes more sense than this

That's a sign to give it more consideration.

Your re­sponse seems to be that Shal­izi as­sumes an ideal ob­server, while you as­sume an ob­server-in-the-sys­tem. That’s fine, as far as it goes, but of­ten you as­sume an ideal ob­server, and statis­ti­cal me­chan­ics is able to func­tion with some kind of ideal ob­server. If you can build a model with an ideal ob­server, you should!

In par­tic­u­lar, when you say that knowl­edge of par­ti­cles makes some­thing colder, makes it pos­si­ble to ex­tract work, you’ve gone back to the ideal ob­server.

More tangentially: I guess the point of statistical mechanics is that there may (ergodicity) be only a few possible robust measurements, like temperature, and a real observer can draw the same conclusions from such measurements as an ideal observer. I'm annoyed that no one ever spelled that out to me, and Shalizi sounds like he's annoyed by Bayesians who don't spell out their models. At the very least, a straw man gives you a chance to say "here's how my model differs."

• If you don’t have an ob­server in the sys­tem, you in­stead have an ob­server out­side the sys­tem, and in or­der to ac­tu­ally be ob­serv­ing must be in­ter­act­ing with the sys­tem—in which case the sys­tem is no longer closed, and there­fore, sim­plis­tic statis­ti­cal me­chan­ics is no longer suffi­cient, and you have to bring in all the open-sys­tem math.

• In par­tic­u­lar, when you say that knowl­edge of par­ti­cles makes some­thing colder, makes it pos­si­ble to ex­tract work, you’ve gone back to the ideal ob­server.

I think em­phat­i­cally not! To ex­tract work, you’ve got to be in­side the sys­tem, ex­tract­ing it.

If you take the per­spec­tive of a log­i­cally om­ni­scient perfect ob­server out­side the sys­tem, the no­tion of “en­tropy” is pretty much mean­ingless, as is “prob­a­bil­ity”—you never have to use statis­ti­cal ther­mo­dy­nam­ics to model any­thing, you just use the de­ter­minis­tic pre­cise wave equa­tion.

• I love this blog. Best blog ever.

• Dou­glas Knight: “In par­tic­u­lar, when you say that knowl­edge of par­ti­cles makes some­thing colder, makes it pos­si­ble to ex­tract work, you’ve gone back to the ideal ob­server.”

Eliezer: “I think em­phat­i­cally not! To ex­tract work, you’ve got to be in­side the sys­tem, ex­tract­ing it.”

I think what Douglas may be implying is that unless you are perfectly insulated from what you are getting knowledge of (e.g., an ideal observer), the act of getting knowledge of something will heat it up, as you are doing work and increasing the entropy in the surroundings.

It raises some interesting and quirky ideas in me. I'm picturing a future with an intelligence explosion, or at least expansion, where a statistical machine is the dominant producer of entropy on the planet, and it decides not to perform too much statistics so it doesn't go over the hypsithermal limit of its environment and make its environment less rather than more predictable.

• Ridicu­lously good post, ridicu­lously good com­ments, in my opinion.

• Indeed, this is one hell of a post. I am from a computer science background and had to read the post five or six times, and most of the comments at least twice; it's worth it.

If someone is still following the post, I would like to know: can the randomness of the particles be measured, or is it calculated according to probability? I remember vaguely from my college reading that entropy is random energy. So, for a perfect transfer X → Y, how is the final state determined (because of the randomness)? Aren't accurate beliefs functions of randomness?

• Part of the point of this post is that par­ti­cles aren’t ever ran­dom—ran­dom is not a prop­erty of the par­ti­cles, but of our de­scrip­tion of the par­ti­cles.

• So af­ter do­ing the Maxwell’s De­mon thing, you say that mu­tual in­for­ma­tion de­creases, the en­tropy of Y de­creases, so we are left with the same amount of to­tal en­tropy:

M1,Y1 → M1,Y1

M2,Y2 → M2,Y1

M3,Y3 → M3,Y1

M4,Y4 → M4,Y1

However, I don't see why the mutual information would be lost; wouldn't the Demon know where he "put" the molecule, thus making the transition look more like:

M1,Y1 → M1,Y1

M2,Y2 → M1,Y1

M3,Y3 → M1,Y1

M4,Y4 → M1,Y1

This would of course shrink the phase space, vi­o­late the sec­ond law, etc. I just do not see how M would stay the same when Y changed (i.e. lose the mu­tual in­for­ma­tion).

• That was a sim­plified ac­count of what is go­ing on. To in­clude the full sys­tem, you would have to in­clude the means by which the De­mon recorded the knowl­edge. How­ever it’s recorded, it over­writes the in­for­ma­tion that was oth­er­wise con­tained in that record­ing mechanism (i.e., mu­tual in­for­ma­tion with some en­vi­ron­ment), and this dele­tion of mu­tual in­for­ma­tion is an in­crease in en­tropy.

But in such an ac­count­ing, you would have three sys­tems, which com­pli­cates the sce­nario. In the ex­am­ple given, the De­mon is im­plic­itly taken to in­clude the De­mon’s record­ing de­vices (even if that’s his brain). The fact that it has de­stroyed some re­la­tion­ship be­tween some sys­tem (the record­ing de­vice) and an­other is rep­re­sented as higher De­mon en­tropy that re­tains in­de­pen­dence from the Y sys­tem. (There are ex­tra states the De­mon can have that have noth­ing to do with Y.)

Did that make any sense?

• I guess it would seem to me that what gets “over­writ­ten” is the (now in­valid) knowl­edge of where Y is, and what it is over­writ­ten with is the new, valid po­si­tion of it. I’ll have to chew on it for a while.

By the way, sort of un­re­lated, but I’ve always won­dered why grav­ity act­ing on things is not con­sid­ered a loss of en­tropy. For ex­am­ple I can drop a bowl­ing ball from mul­ti­ple dis­tances, but it will always end up 0 feet from the ground:

B4 → B0

B3 → B0

B2 → B0

etc.

The only thing I can think of is that, when the ball hits the ground the col­li­sion cre­ates enough heat (i.e. en­tropy) to bal­ance ev­ery­thing out. Is that cor­rect?

• Yes, that’s ba­si­cally cor­rect: the ball ends up at the same place, but differs in an­other state—ve­loc­ity—which gives a differ­ent re­sult for how much mo­men­tum it im­parts to the earth, or heat en­ergy it gen­er­ates through fric­tion, or elas­tic en­ergy in com­press­ing its foun­da­tion.
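A back-of-envelope version of this answer, with illustrative numbers (the mass, height, and temperature are assumptions, not from the thread): the ball's potential energy becomes heat on impact, and that heat carries an enormous thermodynamic entropy compared to the handful of bits describing which height it was dropped from.

```python
import math

g = 9.81            # m/s^2
m = 7.0             # kg (a bowling ball)
h = 2.0             # m (drop height)
T = 293.0           # K (room temperature)
k_B = 1.380649e-23  # Boltzmann constant, J/K

Q = m * g * h                    # potential energy -> heat on impact, ~137 J
dS = Q / T                       # thermodynamic entropy increase, ~0.47 J/K
bits = dS / (k_B * math.log(2))  # the same entropy measured in bits, ~5e22

print(Q, dS, bits)
```

So losing a few bits of "which height" information is balanced, with some twenty orders of magnitude to spare, by the entropy of the generated heat.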

Btw, note that there is a con­nec­tion be­tween the en­ergy of a sys­tem and the in­for­ma­tion it stores. Higher en­ergy states are less likely and there­fore store more in­for­ma­tion. (See Aca­demi­cian’s re­cent post on in­for­ma­tive­ness in in­for­ma­tion the­ory.) Be­cause en­ergy of a state is rel­a­tive to an­other, this sug­gests a re­search pro­gram that breaks down the laws of physics into rules about changes in in­for­ma­tional con­tent. I’m still in the pro­cess of find­ing out how much work has been done on this and what’s left to do.

• While the no­tion of “en­tropy” seems to make a lot more sense when con­sid­ered as ob­server-de­pen­dent, what con­tinues to con­fuse me about this is what hap­pens when you have time-re­versed ob­servers. If phase space vol­ume is sim­ply con­served, then the same prin­ci­ples ap­ply to time-re­versed ob­servers, i.e., they also see en­tropy in­creas­ing. But this would im­ply that any time-re­versed ob­server would have to draw bound­aries very differ­ently from us, and it’s not at all clear how sim­ply negat­ing the ‘t’ co­or­di­nate causes you to draw your bound­aries in such a way that you know more about two gases when mixed rather than when un­mixed. I feel I must be mak­ing some fun­da­men­tal mis­take here but I can’t tell what.

• That ques­tion does strike at the very ba­sic as­sump­tions that try to make sense of these phe­nom­ena. I read some more about this is­sue in Drescher’s Good and Real, chap­ter 3.

The key is­sue with your ques­tion is, what ex­actly is a time-re­versed ob­server? If you mean to flip the uni­ver­sal “time counter” and ask what the ob­servers ob­serve, well, there are some prob­lems off the bat (no uni­ver­sal space of si­mul­tane­ity, time not be­ing an in­de­pen­dent vari­able but a kind of mea­sure of the other vari­ables). But let’s as­sume those away.

With time re­versed, if you look at each time-slice, the ob­server per­ceives the ex­act same his­tory that they would if time had been go­ing the other way. This is be­cause their makeup con­tains the same ev­i­dence tel­ling them that they had the same past ex­pe­rience. In other words, their mem­o­ries are the same. So they wouldn’t have to draw bound­aries any differ­ently from us.

More gen­er­ally, you shouldn’t look at time as go­ing pos­i­tive or nega­tive along some timeline; you should think of it as go­ing fu­ture­ward (to­ward higher ob­server-per­ceived en­tropy) or past­ward (to­ward lower). As an anal­ogy, think of the shift from mod­el­ing the earth as flat, to mod­el­ing it as a sphere with grav­ity: you re­al­ize that your “up” and “down” are not two unique, con­stant vec­tors, but rather, re­fer to whether you are go­ing to­wards or away from the cen­ter of the earth. Just the same, fu­ture­ward and past­ward are de­ter­mined, not by an in­creas­ing or de­creas­ing time vari­able, but whether en­tropy in­creases or de­creases from that point, so there can be nega­tive time di­rec­tions that go fu­ture­ward.

To sim­plify a bit, it’s not that “Hey, time in­creases, en­tropy in­creases—gosh, they always seem to hap­pen to­gether!” Rather, it’s that it’s not pos­si­ble for us to per­ceive (have a set of mem­o­ries con­sis­tent with) en­tropy de­creas­ing.

(I had been writ­ing up a sum­mary of Drescher’s chap­ter 3 as an ar­ti­cle but never got to finish­ing it. This com­ment draws from both Drescher, and from Bar­bour’s “time­less uni­verse” ideas.)

• I don't mean reversing time on the whole universe; that is not really meaningful, for the reasons you specify. What I mean is, since the laws of physics are (nearly) time-symmetric, it seems that it should be possible to have, in our own universe alongside us, some sort of creature that has a brain that really does remember what we would consider the future, and attempts to anticipate what we would consider the past. How would such a thing arise? Well, presumably by a time-reversed evolution, with mutation and natural selection occurring on a "replicator" that propagates itself backwards in time; that is, after all, what it would have to optimize for (well, given the right environment).

Yes, if you take us out of the picture, you can just negate the t-coordinate and say I'm not proposing anything weird. But the (near) time-reversibility of the laws of physics means that it should be possible for us both to occur in the same universe.

Ad­mit­tedly, if we saw such a thing, we would prob­a­bly never rec­og­nize its “genes” as repli­ca­tors of any sort—if we saw the pat­tern at all, they would ap­pear as some sort of anti-repli­ca­tors. And could we even rec­og­nize such crea­tures as con­tain­ing de­ci­sion en­g­ines at all?

...OK, having written that out I now can't help but suspect the problem is in posing the existence of these things in the first place. After all, the anti-replicators that formed their genes would have to have some causal origin from our point of view, and that seems highly improbable. Anti-replicators become less common, not more common, meaning there shouldn't be any at the start of the universe; in other words, they should all be extinct by then. What does a time-reversed extinction event look like? Probably not just one thing. But of course, if, for example, we were to hypothetically nuke the whole planet and destroy all life, they'd see the sudden appearance of a whole bunch of anti-replicators, which then slowly annihilate each other over 4 billion years, and have a good causal explanation for the whole thing! If they recognized the pattern at all, that is.

This forces me to won­der if, yes, it re­ally is cor­rect that a time-re­versed ob­server nec­es­sar­ily would have to have such a differ­ent point of view that it would be nat­u­ral to draw the bound­aries in a way such that they still saw en­tropy as in­creas­ing.

I don’t ac­tu­ally un­der­stand this very well, so I don’t think I’m close to an ac­tual an­swer, but I think here’s my best at­tempt: While such things are the­o­ret­i­cally pos­si­ble, if they ex­isted, we could never rec­og­nize them (or vice versa), as that would vi­o­late the sec­ond law of ther­mo­dy­nam­ics from our (their) point of view? I think? Though that still is not an­swer­ing the origi­nal ques­tion.

• Ar­ti­cle linked from Red­dit, which I haven’t read: De­monic de­vice con­verts in­for­ma­tion to en­ergy (Scien­tific Amer­i­can).

• So in the fol­low­ing trans­for­ma­tion:

X1Y1 → X2Y1
X1Y2 → X4Y1
X1Y3 → X6Y1
X1Y4 → X8Y1

You say that while true en­tropy has not in­creased (it stays at 2 bits), ap­par­ent en­tropy has, due to the ob­server not keep­ing track of X and just lump­ing its pos­si­ble states into X2-X8. If this is the case, why doesn’t ob­served en­tropy de­crease as well, since phase space is pre­served with the fol­low­ing?

X2Y1 → X1Y1
X4Y1 → X1Y2
X6Y1 → X1Y3
X8Y1 → X1Y4

• Why doesn’t ob­served en­tropy de­crease as well, since phase space is pre­served with the fol­low­ing?

X2Y1 → X1Y1
X4Y1 → X1Y2
X6Y1 → X1Y3
X8Y1 → X1Y4

(I guess DaveInNYC won’t read this but I guess some­one else might.)

If you lump to­gether X’s start­ing state into X2-X8 then you can’t be sure that it isn’t ac­tu­ally X3, X5 or X7. So you have to look at where those pos­si­bil­ities go as well. Then the en­tropy can’t go down (since by Liou­ville’s The­o­rem they have to go some­where differ­ent from X2, X4, X6 and X8).
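This reply can be sketched in a toy model: under any one-to-one ("Liouville-like") dynamics, a lumped set of possible states can never map to a smaller set, so the entropy (log2 of the count) can't go down. The particular mapping below is an arbitrary bijection on eight states, chosen only for illustration:

```python
# An arbitrary bijection on the states 1..8, standing in for one step
# of reversible dynamics.
step = {1: 2, 2: 1, 3: 5, 4: 3, 5: 7, 6: 4, 7: 8, 8: 6}

def evolve(possible_states):
    """Push the whole set of possibilities forward one step."""
    return {step[s] for s in possible_states}

lumped = {2, 3, 4, 5, 6, 7, 8}   # "X is somewhere in 2-8": 7 possibilities
after = evolve(lumped)
print(len(lumped), len(after))   # 7 7 -- same count, entropy unchanged
```

Because the map is one-to-one, distinct possibilities (including X3, X5, X7) go to distinct places, so the count of possibilities, and hence the observer's entropy, cannot shrink.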

• Liouville's theorem alone does not suffice to obtain the Second Law. You might want to look up the objections to Boltzmann's derivation of the H-theorem made by Zermelo (wait long enough and the system will return to a state arbitrarily close to the original state, due to Poincaré's recurrence theorem) and Loschmidt (reverse the velocities of all particles and the entropy will decrease to its original value). Boltzmann killed himself in a bout of depression because he could not find a satisfactory answer to these objections. More than a century later, we still don't have satisfactory answers.

• There’s noth­ing mag­i­cal about re­vers­ing par­ti­cle speeds. For en­tropy to de­crease to the origi­nal value you would have to know and be able to change the speeds with perfect pre­ci­sion, which is of course mean­ingless in physics. If you get it even the tiniest bit off you might ex­pect _some_ en­tropy de­crease for a while but in­evitably the sys­tem will go “off track” (in clas­si­cal chaos the time it’s go­ing to take is only log­a­r­ith­mic in your pre­ci­sion) and onto a differ­ent in­creas­ing-en­tropy tra­jec­tory.

Jaynes’ 1957 pa­per has a nice for­mal ex­pla­na­tion of en­tropy vs. ve­loc­ity re­ver­sal.
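The "only logarithmic in your precision" claim can be illustrated with the logistic map at r = 4, a standard chaotic toy model (the starting point and divergence threshold below are arbitrary illustrative choices): shrinking the initial error by a factor of 10^8 buys only a few dozen extra steps of predictability.

```python
def steps_to_diverge(eps, threshold=0.1):
    """Steps until two trajectories, initially eps apart, differ by threshold."""
    x, y = 0.3, 0.3 + eps
    for n in range(1, 10_000):
        x = 4.0 * x * (1.0 - x)  # logistic map at r = 4 (chaotic)
        y = 4.0 * y * (1.0 - y)
        if abs(x - y) > threshold:
            return n
    return None

# Each 10^-4 improvement in precision adds only a roughly constant
# number of steps, because the separation grows exponentially.
for eps in (1e-4, 1e-8, 1e-12):
    print(eps, steps_to_diverge(eps))
```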

• Could the sec­ond law of ther­mo­dy­nam­ics also be un­der­stood as “the func­tion be­tween suc­ces­sive states as de­scribed by the laws of physics is bi­jec­tive”?

• Does this mean, then, that it is not merely difficult, but math­e­mat­i­cally im­pos­si­ble for any mat­ter to ever reach 0 Kelvin? This would seem to vi­o­late Liou­ville’s The­o­rem as stated here.

• When I worldbuild with magic, this is somehow automatically intuitive, so I always end up assuming (if not necessarily specifying explicitly) a "magic field" or something similar that does the thermodynamic work and that the bits of entropy are shuffled over to. Kind of like how looking something up on the internet is "magic" from an outside observer's POV if people only have access nodes inside their heads and cannot actually show them to observers, or like how extracting power from the electricity grid into devices is "magic" under the same conditions.

Only peo­ple didn’t ex­plic­itly in­vent and build the in­volved in­ter­net and the elec­tric­ity grid first. So more like how speech is ba­si­cally telepa­thy, as Eliezer speci­fied el­se­where~

• I’m con­fused—what does “cold” and “hot” mean in this con­text? What pre­dic­tions that I make on the wa­ter be­fore know­ing the tra­jec­to­ries of all the molecules should change, once that in­for­ma­tion is re­vealed to me, to re­sem­ble the pre­dic­tions I would make if I be­lieved the wa­ter was cold in the tra­di­tional mean­ing of the word?

• [ ]
[deleted]
• Do we have mod­er­a­tors who kill ob­vi­ous spam like this?

• So this is how warm­ing and cool­ing spells work in the Ra­tional Pot­ter­verse?

• “it pro­hibits per­pet­ual mo­tion ma­chines of the first type, which run and run in­definitely with­out con­sum­ing fuel or any other re­source”

That’s only right if you’re able to ex­tract work from it and it still runs undiminished.

Other­wise it’s only a per­pet­ual mo­tion ma­chine of the sec­ond type.