Perceptual Entropy and Frozen Estimates

A Preface

During the 1990s, a significant stream of research existed around how people process information, combining very different strands of psychology and related fields with explicit predictive models of how actual cognitive processes differ from the theoretical ideal. This is not only the literature by Kahneman and Tversky on cognitive biases, but includes research on memory, perception, scope insensitivity, and other areas. The rationalist community is very familiar with some of this literature, but fewer are familiar with a masterful synthesis produced by Richards Heuer for the intelligence community in 1999[1], which was intended to start combating these problems, a goal we share. I’m hoping to put together a stream of posts based on that work, potentially expanding on it or giving my own spin – but I encourage reading the book itself (PDF) as well[2]. (This essay is based on Chapter 3.)

This will hopefully be the first in a series of posts, so feedback is especially welcome, both to help me refine the ideas and to refine my presentation.

Entropy, Pressure, and Metaphorical States of Matter

Eliezer recommends updating incrementally, but has noted that it’s hard. The central point, that it is hard to do so, is one that some in our community have experienced and explicated, but there is deep theory, which I’ll attempt to outline via an analogy, that I think explains how and why it occurs. The problem is that we are quick to form opinions and build models, because humans are good at pattern finding. We are less quick to discard them, due to limited mental energy. This is especially true when the pressure of evidence doesn’t shift overwhelmingly and suddenly.

I’ll attempt to answer the question of how this happens by stretching a metaphor, creating an intuition pump for thinking about how our minds might perform some of their thinking under uncertainty.

Frozen Perception

Heuer notes a stream of research about perception, observing that “once an observer has formed an image – that is, once he or she has developed a mind set or expectation concerning the phenomenon being observed – this conditions future perceptions of that phenomenon.” This seems to follow standard Bayesian practice, but in fact, as Eliezer noted, people fail to update. The following set of images, which Heuer reproduced from a 1976 book by Robert Jervis, shows exactly this point:

[Figure: Impressions Resist Change – a series of line drawings transitioning between a face and a crouching woman.]

Looking at each picture, starting on the left and moving to the right, you see a face slowly change. At what point does the face no longer seem to appear? (Try it!) For me, it’s at about the seventh image that it has clearly morphed into a sitting, bowed figure. But what if you start at the other end? The woman is still clearly there long past the point where, moving in the original direction, we had seen a face. What’s going on?

We seem to attach too strongly to our first approach, decision, or idea. Specifically, our decision seems to “freeze” once it gets to one place, and needs much more evidence to start moving again. This has an analogue in physics, in the notion of freezing, which I think is more important than it first appears.

Entropy

To analyze this, I’ll drop into some basic probability theory, and physics, before (hopefully) we come out on the other side with a conceptually clearer picture. First, I will note that our cognitive architecture has some way of representing theories, and implicitly assigns probabilities to various working theories. This is some sort of probability distribution over sample theories. Any probability distribution has a quantity called entropy[3], which is simply the probability of each state, multiplied by the logarithm of that probability, summed over all the states. (Each probability is less than 1, so each logarithm is negative, but we traditionally flip the sign so entropy is a positive quantity.)
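In symbols, this is the standard Shannon entropy of a distribution p over states x, measured in bits when the logarithm is base 2 (a textbook definition, not anything specific to Heuer):

$$H = -\sum_{x} p(x) \log_2 p(x)$$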

Need an example? Sure! I have two dice, and they can each land on any number, 1-6. I’m assuming they are fair, so each face has probability 1/6, and the logarithm (base 2) of 1/6 is about −2.585. There are 6 states, so the total is 6 × (1/6) × 2.585 = 2.585. (With two dice, I have 36 possible combinations, each with probability 1/36; log(1/36) is −5.17, so the entropy is 5.17. You may have noticed that I doubled the number of dice involved, and the entropy doubled – because there is exactly twice as much that can happen, but the average entropy per die is unchanged.) If I only have 2 possible states, such as a fair coin, each has probability 1/2, and log(1/2) = −1, so for two states, −(0.5 × −1) − (0.5 × −1) = 1. An unfair coin, with a 1/4 probability of tails and a 3/4 probability of heads, has an entropy of about 0.81. Of course, this isn’t the lowest possible entropy – a trick coin with heads on both sides has only 1 state, with entropy 0. So unfair coins have lower entropy – because we know more about what will happen.
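If you want to check these numbers yourself, here is a minimal sketch in Python (my own illustration, not from Heuer; the function is just the definition above):

```python
import math

def entropy(probs):
    # Shannon entropy in bits: -sum of p * log2(p), skipping zero-probability states.
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([1/6] * 6))      # one fair die: ~2.585 bits
print(entropy([1/36] * 36))    # two fair dice: ~5.170 bits, exactly double
print(entropy([0.5, 0.5]))     # fair coin: 1.0 bit
print(entropy([0.25, 0.75]))   # unfair coin: ~0.811 bits
print(entropy([1.0]))          # trick coin, heads on both sides: 0.0 bits
```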

Freezing, Melting, and Ideal Gases under Pressure

In physics, there is a deeply related concept, also called entropy, which in the form we see it on a macroscopic scale is closely tied to temperature. If you remember your high school science classes, temperature is a description of how much molecules move around. I’m not a physicist, and this is a bit simplified[4], but the entropy of an object is how uncertain we are about its state – gases expand to fill their container, and the molecules could be anywhere, so they have higher entropy than a liquid, which stays in its container, which in turn has higher entropy than a solid, where the molecules don’t move much, which still has higher entropy than a crystal, where the molecules are essentially locked into place.

This partially lends intuition to the third law of thermodynamics: “the entropy of a perfect crystal at absolute zero is exactly equal to zero.” In our terms above, it’s like that trick coin – we know exactly where everything is in the crystal, and it doesn’t move. Interestingly, a perfect crystal at 0 Kelvin cannot exist in nature; no finite process can reduce entropy to that point; like infinite certainty, infinitely exact crystals are impossible to arrive at, unless you started there. So far, we could build a clever analogy between temperature and certainty, telling us that “you’re getting warmer” means exactly the opposite of what it does in common usage – but I think this is misleading[5].

In fact, I think that information in our analogy doesn’t change the temperature; instead, it reduces the volume! In the analogy, gases can become liquids or solids either by lowering temperature or by increasing pressure – which is what evidence does. Specifically, evidence constrains the set of possibilities, squeezing our hypothesis space. The phrase “weight of evidence” is now metaphorically correct; it will actually constrain the space by applying pressure.
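As a toy illustration of this squeezing (my own sketch, with invented numbers, not anything from Heuer), here is a Bayesian update over eight equally likely theories; conditioning on a piece of evidence concentrates the distribution and lowers its entropy:

```python
import math

def entropy(probs):
    # Shannon entropy in bits.
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A "gas" filling the hypothesis space: eight equally likely theories.
prior = [1/8] * 8
print(entropy(prior))  # 3.0 bits

# Invented likelihoods: the evidence is far more probable under the
# first two theories than under the other six.
likelihood = [0.9, 0.8, 0.1, 0.1, 0.05, 0.05, 0.05, 0.05]

# Bayes' rule: posterior is proportional to prior times likelihood.
unnormalized = [p * l for p, l in zip(prior, likelihood)]
total = sum(unnormalized)
posterior = [u / total for u in unnormalized]

print(entropy(posterior))  # ~1.99 bits: the space of live hypotheses has shrunk
```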

I think that, by analogy, this explains the phenomenon we see with perception. While we are uncertain, information increases pressure, and our conceptual estimate can condense from an uncertain gas to a relatively contained liquid state – not because we have less probability to distribute, but because the evidence has constrained the space over which we can distribute it. Alternatively, we can settle into a lower energy state on our own, unassisted by evidence. If our minds too quickly settle on a theory or idea, the gas settles into a corner of the available space, and if we fail to apply enough energy to the problem, our unchallenged opinion can even freeze into place.

Our mental models can be liquid, gaseous, or frozen in place – whether by our prior certainty, our lack of the energy required to update, or an immense amount of evidential pressure. When we look at those faces, our minds settle into a model quickly, and once there, fail to apply enough energy to re-evaporate our decision until the pressure of the new pictures is relatively immense. If we had started at picture 3 or 6, we could much more easily update away from our estimates; our minds are less willing to let the cloud settle into a puddle of probable answers, much less freeze into place. We can easily see the face, or the woman, moving between just these two images.

When we begin to search for a mental model to describe some phenomenon, whether it be patterns of black and white on a page or the way in which our actions will affect a friend, I am suggesting that we settle into a puddle of likely options, and when not actively investing energy into the question, we are likely to freeze into a specific model.

What does this approach retrodict, or better, forbid?

Because our minds have limited energy, the process of maintaining an uncertain stance should be difficult. This seems to be borne out by personal and anecdotal experience, but I have not yet searched the academic literature to find more specific validation.

We should have more trouble updating away from a current model than we would have arriving at that new model from the beginning. As Heuer puts it, “Initial exposure to… ambiguous stimuli interferes with accurate perception even after more and better information becomes available.” He notes that this was shown in Bruner and Potter, 1964, “Interference in Visual Recognition,” and that “the early but incorrect impression tends to persist because the amount of information necessary to invalidate a hypothesis is considerably greater than the amount of information required to make an initial interpretation.”

Potential avenues of further thought

The pressure of evidence should reduce the mental effort needed to switch models, but “leaky” hypothesis sets, where a class of model is not initially considered, should allow the pressure to metaphorically escape into the larger hypothesis space.

There is potential for making this analogy more exact by discussing entropy in graphical models (Bayesian networks), especially in sets of graphical models with explicit uncertainty attached. I don’t have the math needed for this, but would be interested in hearing from those who do.



[1] I would like to thank both Abram Demski (interviewed here) for providing a link to this material, and my dissertation chair, Paul Davis, who was able to point me towards how this has been used and extended in the intelligence community.

[2] There is a follow-up book and training course which is also available, but I’ve not read it nor seen it online. A shorter version of the main points of that book is here (PDF), which I have only glanced through.

[3] Eliezer discusses this idea in Entropy and short codes, but I’m heading in a slightly different direction.

[4] We have a LW post, Entropy and Temperature, that explains this a bit. For a different, simplified explanation, try this: http://www.nmsea.org/Curriculum/Primer/what_is_entropy.htm. For a slightly more complete version, try Wikipedia: https://en.wikipedia.org/wiki/Introduction_to_entropy. For a much more complete version, learn the math, talk to a PhD in thermodynamics, then read some textbooks yourself.

[5] I think this, of course, because I was initially heading in that direction. Instead, I realized there was a better analogy – but if we wanted to develop it in this direction instead, I’d point to the phase-change energy (latent heat) required to change phases of matter as a reason that our minds have trouble moving from their initial estimate. On reflection, I think this should be a small part of the story, if not entirely negligible.