# Stuff That Makes Stuff Happen

Followup to: Causality: The Fabric of Real Things

Previous meditation:

“You say that a universe is a connected fabric of causes and effects. Well, that’s a very Western viewpoint—that it’s all about mechanistic, deterministic stuff. I agree that anything else is outside the realm of science, but it can still be real, you know. My cousin is psychic—if you draw a card from his deck of cards, he can tell you the name of your card before he looks at it. There’s no mechanism for it—it’s not a causal thing that scientists could study—he just does it. Same thing when I commune on a deep level with the entire universe in order to realize that my partner truly loves me. I agree that purely spiritual phenomena are outside the realm of causal processes that can be studied by experiments, but I don’t agree that they can’t be real.”

Reply:

Fundamentally, a causal model is a way of factorizing our uncertainty about the universe. One way of viewing a causal model is as a structure of deterministic functions plus uncorrelated sources of background uncertainty.

Let’s use the Obesity-Exercise-Internet model (reminder: which is totally made up) as an example again:

$p(x_1, x_2, x_3) = p(x_1)p(x_2)p(x_3|x_1, x_2)$

We can also view this as a set of deterministic functions $F_i$, plus uncorrelated background sources of uncertainty $U_i$:

$x_1 = F_1(u_1)$

$x_2 = F_2(u_2)$

$x_3 = F_3(x_1, x_2, u_3)$

This says that the value $x_3$—how much someone exercises—is a function of how obese they are ($x_1$), how much time they spend on the Internet ($x_2$), plus some other background factors $U_3$ which don’t correlate to anything else in the diagram, all of which collectively determine, when combined by the mechanism $F_3$, how much time someone spends exercising.

There might be any number of different real factors involved in the possible states of $U_3$—like whether someone has a personal taste for jogging, whether they’ve ever been to a trampoline park and liked it, whether they have some gene that affects exercise endorphins. These are all different unknown background facts about a person, which might affect whether or not they exercise, above and beyond obesity and Internet use.

But from the perspective of somebody building a causal model, so long as we don’t have anything else in our causal graph that correlates with these factors, we can sum them up into a single factor of subjective uncertainty, our uncertainty $U_3$ about all the other things that might add up to a force for or against exercising. Once we know that someone isn’t overweight and that they spend a lot of time on the Internet, all our uncertainty about those other background factors gets summed up with those two known factors and turned into a 38% conditional probability that the person exercises frequently.
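The paragraph above can be sketched in code. This is a minimal simulation, with all numbers and the threshold mechanism invented for illustration (the text only fixes the 38% figure): each $X_i$ is a deterministic function of its parents plus its own independently sampled $U_i$, and the conditional probability falls out of the mechanism.

```python
import random

# Hypothetical functional causal model for the (made-up) Obesity-Exercise-
# Internet example: independent noise U1, U2, U3 plus deterministic F_i.
rng = random.Random(0)

P_OBESE = 0.2                 # assumed p(x1)
P_INTERNET = 0.6              # assumed p(x2)
P_EXERCISE = {                # assumed p(x3=True | x1, x2)
    (False, False): 0.55, (False, True): 0.38,
    (True, False): 0.35,  (True, True): 0.20,
}

def F3(x1, x2, u3):
    # Deterministic mechanism: u3 (uniform on [0,1]) sums up every unmodeled
    # background factor; F3 thresholds it according to the parents' values.
    return u3 < P_EXERCISE[(x1, x2)]

def sample():
    u1, u2, u3 = rng.random(), rng.random(), rng.random()  # independent U_i
    x1 = u1 < P_OBESE          # x1 = F1(u1)
    x2 = u2 < P_INTERNET       # x2 = F2(u2)
    x3 = F3(x1, x2, u3)        # x3 = F3(x1, x2, u3)
    return x1, x2, x3

draws = [sample() for _ in range(200_000)]
hits = [x3 for x1, x2, x3 in draws if not x1 and x2]
print(round(sum(hits) / len(hits), 2))  # ~0.38: p(exercise | not obese, internet)
```

The point of the sketch is that nothing stochastic happens "inside" the arrows: all randomness enters through the independently drawn $U_i$, and the observed conditional probability is recovered from deterministic mechanisms alone.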

And the key condition on a causal graph is that if you’ve properly described your beliefs about the connective mechanisms $F_i$, all your remaining uncertainty $U_i$ should be conditionally independent:

$p(u_1, u_2, u_3) = p(u_1)p(u_2)p(u_3)$

or more generally:

$p(\mathbf U) = \prod p(U_i)$

And then plugging those probable $U_i$ into the strictly deterministic $F_i$ should give us back our whole causal model—the same joint probability table over the observable $X_i$.

Hence the idea that a causal model factorizes uncertainty. It factorizes out all the mechanisms that we believe connect variables, and all remaining uncertainty should be uncorrelated so far as we know.

To put it another way, if we ourselves knew about a correlation between two $U_i$ that wasn’t in the causal model, our own expectations for the joint probability table couldn’t match the model’s product

$p(\mathbf x) = \prod p(x_i|\mathbf{pa_i})$

and all the theorems about causal inference would go out the window. Technically, the idea that the $U_i$ are uncorrelated is known as the causal Markov condition.

What if you realize that two variables actually are correlated more than you thought? What if, to make the diagram correspond to reality, you’d have to hack it to make some $U_a$ and $U_b$ correlated?

Then you draw another arrow from $X_a$ to $X_b$, or from $X_b$ to $X_a$; or you make a new node $X_c$ representing the correlated part of $U_a$ and $U_b$, and draw arrows from $X_c$ to both $X_a$ and $X_b$.


(Or you might have to draw some extra causal arrows somewhere else; but those three changes are the ones that would solve the problem most directly.)

There was apparently at one point—I’m not sure if it’s still going on or not—a big debate about the true meaning of randomization in experiments, and what counts as ‘truly random’. Is your randomized experiment invalidated if you use a merely pseudo-random algorithm instead of a thermal noise generator? Is it okay to use pseudo-random algorithms? Is it okay to use shoddy pseudo-randomness that a professional cryptographer would sneer at? Clearly, using 1-0-1-0-1-0 on a list of patients in alphabetical order isn’t random enough… or is it? What if you pair off patients in alphabetical order, and flip a coin to assign one member of each pair to the experimental group and the other to the control? How random is random?

Understanding that causal models factorize uncertainty leads to the realization that “randomizing” an experimental variable means using randomness—a $U_x$ for the assignment—which doesn’t correlate with your uncertainty about any other $U_i$. Our uncertainty about a thermal noise generator seems strongly guaranteed to be uncorrelated with our uncertainty about a subject’s economic status, their upbringing, or anything else in the universe that might affect how they react to Drug A…

…unless somebody wrote down the output of the thermal noise generator, and then used it in another experiment on the same group of subjects to test Drug B. It doesn’t matter how “intrinsically random” that output was—whether it was the XOR of a thermal noise source, a quantum noise source, a human being’s so-called free will, and the world’s strongest cryptographic algorithm—once it ends up correlated to any other uncertain background factor, any other $U_i$, you’ve invalidated the randomization. That’s the implicit problem in the XKCD cartoon above.

But picking a strong randomness source, and using the output only once, is a pretty solid guarantee this won’t happen.
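The reuse failure described above can be made concrete with a tiny sketch (subject counts and seed are arbitrary): however "intrinsically random" the recorded flips were, reusing them makes the Drug B assignment perfectly correlated with the Drug A assignment.

```python
import random

# Assign 10 subjects to treatment with coin flips, then compare reusing the
# written-down output for a second experiment vs. drawing fresh flips.
rng = random.Random(2012)
flips = [rng.random() < 0.5 for _ in range(10)]

assign_a = flips             # experiment on Drug A
assign_b_reused = flips      # reusing the recorded output: invalid randomization
assign_b_fresh = [rng.random() < 0.5 for _ in range(10)]  # fresh draws: fine

print(assign_a == assign_b_reused)  # True: the two assignments coincide exactly
```

Once the output is reused, the two treatment groups contain the same people, so any background factor of the Drug A subjects is now correlated with the Drug B "randomization".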

Unless, ya know, you start out with a list of subjects sorted by income, and the randomness source randomly happens to put out 111111000000. Whereupon, as soon as you look at the output and are no longer uncertain about it, you might expect correlation and trouble. But that’s a different and much thornier issue in Bayesianism vs. frequentism.

If we take frequentist ideas about randomization at face value, then the key requirement for theorems about experimental randomization to be applicable is that your uncertainty about patient randomization not correlate with any other background facts about the patients. A double-blinded study (where the doctors don’t know patient status) ensures that patient status doesn’t correlate with the doctor’s beliefs about a patient leading them to treat patients differently. Even plugging in the fixed string “1010101010” would be sufficiently random if that pattern weren’t correlated to anything important; the trouble is that such a simple pattern could very easily correlate with some background effect, and we can believe in this possible correlation even if we’re not sure what the exact correlation would be.

(It’s worth noting that the Center for Applied Rationality ran the June minicamp experiment using a standard but unusual statistical method of sorting applicants into pairs that seemed of roughly matched prior ability / prior expected outcome, and then flipping a coin to pick one member of each pair to be admitted or not admitted that year. This procedure means you never improbably get an experimental group that would, once you actually looked at the random numbers, seem much more promising or much worse than the control group in advance—where the frequentist guarantee that you used an experimental procedure where this usually doesn’t happen ‘in the long run’ might be cold comfort if it obviously had happened this time, once you looked at the random numbers. Roughly, this choice reflects a difference between frequentist ideas about procedures that make it hard for scientists to obtain results unless their theories are true—not caring about the actual random numbers so long as it’s still hard to get fake results on average—versus a Bayesian goal of trying to get the maximum evidence out of the update we’ll actually have to perform after looking at the results, including how the random numbers turned out on this particular occasion. Note that frequentist ethics are still being obeyed—you can’t game the expected statistical significance of experimental vs. control results by picking bad pairs, so long as the coinflips themselves are fair!)
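The pairing procedure just described can be sketched in a few lines. All data here is invented (applicant names, a Gaussian "prior expected outcome" score, and the seed are placeholders): rank by score, pair adjacent applicants, and flip a fair coin within each pair.

```python
import random

# Sketch of matched-pair randomization: rank by prior expected outcome,
# pair adjacent applicants, coin-flip within each pair.
rng = random.Random(6)
applicants = [(f"applicant{i}", rng.gauss(0, 1)) for i in range(20)]  # (name, score)
ranked = sorted(applicants, key=lambda a: a[1])

admitted, control = [], []
for first, second in zip(ranked[::2], ranked[1::2]):  # adjacent pairs
    if rng.random() < 0.5:
        admitted.append(first); control.append(second)
    else:
        admitted.append(second); control.append(first)

# Balance is guaranteed by construction: equal group sizes, and the two
# groups' prior scores can never drift far apart, whatever the coin does.
print(len(admitted), len(control))  # 10 10
```

Note the frequentist property the text mentions is preserved: the coinflips are still fair within each pair, so significance can't be gamed by the pairing.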

Okay, let’s look at that meditation again:

“You say that a universe is a connected fabric of causes and effects. Well, that’s a very Western viewpoint—that it’s all about mechanistic, deterministic stuff. I agree that anything else is outside the realm of science, but it can still be real, you know. My cousin is psychic—if you draw a card from his deck of cards, he can tell you the name of your card before he looks at it. There’s no mechanism for it—it’s not a causal thing that scientists could study—he just does it. Same thing when I commune on a deep level with the entire universe in order to realize that my partner truly loves me. I agree that purely spiritual phenomena are outside the realm of causal processes that can be studied by experiments, but I don’t agree that they can’t be real.”

Well, you know, you can stand there all day, shouting all you like about how something is outside the realm of science, but if a picture of the world has this…

…then we’re either going to draw an arrow from the top card to the prediction; an arrow from the prediction to the top card (the prediction makes it happen!); or arrows from a third source to both of them (aliens are picking the top card and using telepathy on your cousin… or something; there’s no rule you have to label your nodes).

More generally, for me to expect your beliefs to correlate with reality, I have to either think that reality is the cause of your beliefs, expect your beliefs to alter reality, or believe that some third factor is influencing both of them.

This is the more general argument that “To draw an accurate map of a city, you have to open the blinds and look out the window and draw lines on paper corresponding to what you see; sitting in your living-room with the blinds closed, making stuff up, isn’t going to work.”

Correlation requires causal interaction; and expecting beliefs to be true means expecting the map to correlate with the territory. To open your eyes and look at your shoelaces is to let those shoelaces have a causal effect on your brain—in general, looking at something, gaining information about it, requires letting it causally affect you. Learning about X means letting your brain’s state be causally determined by X’s state. The first thing that happens is that your shoelace is untied; the next thing that happens is that the shoelace interacts with your brain, via light and eyes and the visual cortex, in a way that makes your brain believe your shoelace is untied.

| Shoelace | Belief | p |
|---|---|---|
| tied | “tied” | 0.931 |
| tied | “untied” | 0.003 |
| untied | “untied” | 0.053 |
| untied | “tied” | 0.012 |
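Treating that joint table as data makes the map-territory correlation explicit: conditioning on the territory, the belief almost always matches.

```python
# The shoelace/belief joint table above, as a dict keyed by (shoelace, belief).
joint = {
    ("tied", "tied"): 0.931,
    ("tied", "untied"): 0.003,
    ("untied", "untied"): 0.053,
    ("untied", "tied"): 0.012,
}

def cond_belief(belief, shoelace):
    # p(Belief = belief | Shoelace = shoelace)
    p_shoelace = sum(p for (s, b), p in joint.items() if s == shoelace)
    return joint[(shoelace, belief)] / p_shoelace

print(round(cond_belief("tied", "tied"), 3))      # 0.997: belief tracks territory
print(round(cond_belief("untied", "untied"), 3))  # 0.815
```

The strong conditional probabilities are exactly what the causal arrow from shoelace to brain buys you; with no causal path, the conditional would equal the marginal.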

This is related in spirit to the idea, seen earlier on LW, that having knowledge materialize from nowhere directly violates the second law of thermodynamics, because mutual information counts as thermodynamic negentropy. But the causal form of the proof is much deeper and more general. It applies even in universes like Conway’s Game of Life where there’s no equivalent of the second law of thermodynamics. It applies even if we’re in the Matrix and the aliens can violate physics at will. Even when entropy can go down, you still can’t learn about things without being causally connected to them.

The fundamental question of rationality, “What do you think you know and how do you think you know it?”, is on its strictest level a request for a causal model of how you think your brain ended up mirroring reality—the causal process which accounts for this supposed correlation.

You might not think that this would be a useful question to ask—that when your brain has an irrational belief, it would automatically have equally irrational beliefs about the process behind it.

But “the human brain is not illogically omniscient”, we might say. When our brain undergoes motivated cognition or other fallacies, it often ends up strongly believing in X, without the unconscious rationalization process having been sophisticated enough to also invent a causal story explaining how we know X. “How could you possibly know that, even if it were true?” is a more skeptical form of the same question. If you can successfully stop your brain from rationalizing-on-the-spot, there actually is this useful thing you can sometimes catch yourself in, wherein you go, “Oh, wait, even if I’m in a world where AI does get developed on March 4th, 2029, there’s no lawful story which could account for me knowing that in advance—there must’ve been some other pressure on my brain to produce that belief.”

Since it illustrates an important general point, I shall now take a moment to remark on the idea that science is merely one magisterium, and there are other magisteria which can’t be subjected to standards of mere evidence, because they are special. That seeing a ghost, or knowing something because God spoke to you in your heart, is an exception to the ordinary laws of epistemology.

That exception would be convenient for the speaker, perhaps. But causality is more general than that; it is not excepted by such hypotheses. “I saw a ghost”, “I mysteriously sensed a ghost”, “God spoke to me in my heart”—there’s no difficulty drawing those causal diagrams.

The methods of science—even sophisticated methods like the conditions for randomizing a trial—aren’t just about atoms, or quantum fields.

They’re about stuff that makes stuff happen, and happens because of other stuff.

In this world there are well-paid professional marketers, including philosophical and theological marketers, who have thousands of hours of practice convincing customers that their beliefs are beyond the reach of science. But those marketers don’t know about causal models. They may know about—know how to lie persuasively relative to—the epistemology used by a Traditional Rationalist, but that’s crude by the standards of today’s rationality-with-math. Highly Advanced Epistemology hasn’t diffused far enough for there to be explicit anti-epistemology against it.

And so we shouldn’t expect to find anyone with a background story which would justify evading science’s skeptical gaze. As a matter of cognitive science, it seems extremely likely that the human brain natively represents something like causal structure—that this native representation is how your own brain knows that “If the radio says there was an earthquake, it’s less likely that your burglar alarm going off implies a burglar.” People who want to evade the gaze of science haven’t read Judea Pearl’s book; they don’t know enough about formal causality to not automatically reason this way about things they claim are in separate magisteria. They can say words like “It’s not mechanistic”, but they don’t have the mathematical fluency it would take to deliberately design a system outside Judea Pearl’s box.

So in all probability, when somebody says, “I communed holistically and in a purely spiritual fashion with the entire universe—that’s how I know my partner loves me, not because of any mechanism”, their brain is just representing something like this:

| Partner loves | Universe knows | I hear universe | % |
|---|---|---|---|
| p | u | h | 0.44 |
| p | u | ¬h | 0.023 |
| p | ¬u | h | 0.01 |
| p | ¬u | ¬h | 0.025 |
| ¬p | u | h | 0.43 |
| ¬p | u | ¬h | 0.023 |
| ¬p | ¬u | h | 0.015 |
| ¬p | ¬u | ¬h | 0.035 |

True, false, or meaningless, this belief isn’t beyond investigation by standard rationality.

Because causality isn’t a word for a special, restricted domain that scientists study. ‘Causal process’ sounds like an impressive formal word that would be used by people in lab coats with doctorates, but that’s not what it means.

‘Cause and effect’ just means “stuff that makes stuff happen and happens because of other stuff”. Any time there’s a noun, a verb, and a subject, there’s causality. If the universe spoke to you in your heart—then the universe would be making stuff happen inside your heart! All the standard theorems would still apply.

Whatever people try to imagine that science supposedly can’t analyze, it just ends up as more “stuff that makes stuff happen and happens because of other stuff”.
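One concrete way to investigate the joint table above by standard rationality is to ask how much evidence "hearing the universe" (h) actually gives about the partner's love (p), using the numbers as stated:

```python
# The partner/universe joint table above, keyed by (p, u, h) truth values.
joint = {
    (True, True, True): 0.44,   (True, True, False): 0.023,
    (True, False, True): 0.01,  (True, False, False): 0.025,
    (False, True, True): 0.43,  (False, True, False): 0.023,
    (False, False, True): 0.015, (False, False, False): 0.035,
}

def marginal(**fixed):
    # Sum the joint probability over all rows matching the fixed assignments.
    names = ("p", "u", "h")
    return sum(prob for key, prob in joint.items()
               if all(key[names.index(n)] == v for n, v in fixed.items()))

p_love = marginal(p=True)                                  # prior: ~0.498
p_love_given_h = marginal(p=True, h=True) / marginal(h=True)
print(round(p_love_given_h, 3))  # 0.503: hearing the universe is almost no evidence
```

On these numbers the posterior barely moves from the prior, which is the quantitative version of the point made in the comments below: the "hearing" happens almost equally whether or not the partner loves.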

Mainstream status.

Part of the sequence Highly Advanced Epistemology 101 for Beginners

Next post: “Causal Reference”

Previous post: “Causal Diagrams and Causal Models”

• p(a,b,c) = p(a)p(b)p(c) isn’t a statement of uncorrelatedness but of independence. Using the term “uncorrelated” with that meaning might be defensible, but it probably merits mention as something non-mainstream.

• It’s helpful to go a bit further for these corrections. What’s the reason not to use “uncorrelated” here?

In ordinary English, “uncorrelated” is indeed used for this (and a host of other things, because ordinary English is very vague). The problem is that it means something else in probability theory, namely the much weaker statement E(a-E(a)) E(b-E(b)) = E((a-E(a)(b-E(b)), which is implied by independence (p(a,b) = p(a)p(b)) but does not imply independence. If we want to speak to those who know some probability theory, this clash of meaning is a problem. If we want to educate those who don’t know probability theory to understand the literature and be able to talk with those who do know probability theory, this is also a problem.

(Note too that uncorrelatedness is only invariant under affine remappings: X and Y chosen as the coordinates of a random point on the unit circle are uncorrelated, while X² and Y² are perfectly correlated. Nor does “correlated” directly make any sense for non-numerical variables—though you could probably lift to the simplex and use homogeneous coordinates to get a reasonable meaning.)
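The unit-circle example in the parenthetical can be checked numerically (sample size and seed are arbitrary): X and Y have covariance zero, yet X² and Y² are deterministically linked by X² + Y² = 1, so uncorrelated does not mean independent.

```python
import math
import random

# Numeric check: (X, Y) uniform on the unit circle are uncorrelated,
# but X^2 and Y^2 are perfectly (negatively) correlated, since X^2 + Y^2 = 1.
rng = random.Random(1)
thetas = [rng.uniform(0, 2 * math.pi) for _ in range(200_000)]
xs = [math.cos(t) for t in thetas]
ys = [math.sin(t) for t in thetas]

def cov(a, b):
    # Sample covariance (biased normalization is fine for this check).
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)

print(abs(cov(xs, ys)) < 0.01)                                        # True
print(abs(cov([x * x for x in xs], [y * y for y in ys]) + 0.125) < 0.01)  # True
```

Analytically Cov(X², Y²) = Cov(X², 1 − X²) = −Var(X²) = −1/8, which is what the second check confirms.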

• I know that Eliezer knows quite a lot of mathematics. His article was clearly written for people who are at least a bit comfortable with mathematics. So it’s reasonable to suppose (1) that a substantial fraction of readers will have encountered something like the mathematical notion of “uncorrelated” and might therefore be confused by having the word used to denote something else, and (2) that in notifying Eliezer of this it’s OK to be pretty terse about it.

For the avoidance of doubt, I’m not disagreeing with anything you said, just explaining why I made the brief statement I did rather than offering more explanation.

• The problem is that it means something else in probability theory, namely the much weaker statement E(a-E(a)) E(b-E(b)) = E((a-E(a)(b-E(b))

E(a-E(a)) and E(b-E(b)) are both identically zero, so this would be more simply put (and restoring some missing parentheses) as E((a-E(a))(b-E(b))) = 0. Or, after shifting the means of both variables to zero, E(ab) = 0.

• Don’t bother, he’s “write-only.”

edit: There is stuff in the original ‘causal diagrams’ post from nearly two weeks ago that is factually wrong (not a minor nitpick either), was pointed out as such, and is still uncorrected. “Write-only.”

• It’s worth noting that the Center for Applied Rationality ran the June minicamp experiment using a standard but unusual statistical method of sorting applicants into pairs that seemed of roughly matched prior ability / prior expected outcome, and then flipping a coin to pick one member of each pair to be admitted or not

As an aside, if you’re interested in looking up more about this nifty experimental design trick, the magic keyword is “blocking”. The idea of randomized block designs dates back to Fisher.

• I’ve found blocking to be really useful for my small-scale experiments, for two different reasons:

1. Often I’m worried about simple randomization leading to an imbalance between control and experimental groups; if I’m only getting 20 total datapoints on something, then randomization could easily lead to something like 14 control and 6 experimental datapoints—throwing out a lot of statistical power compared to 10 control and 10 experimental. If I pair days, then I know I will get 10/10, without worrying about breaking blinding.

2. Blocking is the natural way to handle multiple-day effects or trends: if I think lithium operates slowly, I will pair entire weeks or months, rather than pairing days and hoping enough experimental and control days form runs which will reveal any trend rather than wash it out in averaging.
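The first worry above can be quantified exactly: under simple coin-flip randomization of 20 datapoints, a split as lopsided as 14–6 or worse is not rare at all.

```python
from math import comb

# Probability that 20 fair coin flips produce a split at least as lopsided
# as 14-6 (i.e. one group gets <= 6 or >= 14 of the datapoints).
n = 20
p_lopsided = sum(comb(n, k) for k in range(n + 1) if abs(k - n // 2) >= 4) / 2**n
print(round(p_lopsided, 3))  # 0.115: roughly one experiment in nine
# Pairing days (blocking) makes the balanced 10/10 split certain instead.
```

So blocking doesn't just help on average; it removes this whole failure mode by construction.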

• Look at it as an exercise for the actively-disbelieving mini-skill. :)

• Mini-trick for the mini-skill: Pretend he’s talking about a fictional universe where anything explicitly mentioned is arbitrary.

• Mainstream status:

As previously stated, the take on causality’s math is meant to be academically standard; this includes the idea of decomposing the X(i) into deterministic F(i) and uncorrelated U(i).

I haven’t particularly seen anyone else observe that claiming you know about X—without X affecting you, you affecting X, or X and your belief having a common cause—violates the Markov condition on causal graphs.

I haven’t actually seen anyone cite the Markov condition as a reply to the old “What constitutes randomization?” debates I’ve glimpsed, but I would be genuinely surprised if Pearl & co. hadn’t pointed it out by now—my understanding is that he’s spending most of his time evangelizing causality to experimental statisticians these days. It seems pretty obvious once you have causal models as a background.

The concept of “separate magisteria” is as old as scientific critique of religion, but the actual phrase was coined by Stephen Jay Gould (speaking favorably of the separation, natch). So far as I know, the concept of anti-epistemology is an LW original; likewise the view that causality is more general than anyone trying to separate off their magisterium would have the mathematical competence to successfully escape, even as an attempted excuse. In general, when I write about the skeptical applications, I’m usually writing things I haven’t read before and that wouldn’t be expected to appear somewhere like Pearl’s Causality book—which doesn’t imply that nobody else has written about them, of course. If you know of similar theses, comment here.

• I haven’t actually seen anyone cite the Markov condition as a reply to the old “What constitutes randomization?” debates I’ve glimpsed

Isn’t this essentially implied by the well-known ideas of “natural experiments” and “instrumental variables”? Pearl does deal with these ideas in Causality.

• Whatever people try to imagine that science supposedly can’t analyze, it just ends up as more “stuff that makes stuff happen and happens because of other stuff”.

I think said people would object to this—e.g. “God certainly isn’t stuff, God is metaphysical!” This, of course, is not a problem for causal diagrams. The math allows you to have arrows from metaphysical stuff to physical stuff, which allows you to see Occam’s razor visually. But it’s interesting to think about how best to counter this argument when you’re trying to convince your opponent and not just yourself.

• Fun gamble: Make a huge causal diagram as part of the discussion, and once people bring up the metaphysical-God argument, point at the whole diagram and say “Okay, if God is metaphysical, then he’s the rules by which the diagrams operate. There—look at this diagram. You’re looking at God.”

I doubt it’d work, but the thought made me chuckle.

• Upvoted for:

The math allows you to have arrows from metaphysical stuff to physical stuff, which allows you to see Occam’s razor visually.

:)

• I suspect the best counter would have been to have seen more steps ahead and given them some abstract causal-diagram practice.

• Twice in this article there are tables of numbers. They’re clearly made up, not measured from experiment, but I don’t really understand exactly how made up they are—are they carefully or casually cooked?

Could people instead use letters (variables), with relations like ‘a > b’, ‘a >> b’, ‘a/b > c’ and so on in the future? Then I could understand better which properties of the table are intentional.

• In my experience, using variables instead of numbers when it’s not absolutely necessary makes things ridiculously harder to understand for someone not comfortable with abstract math.

• We are talking about the mathematics of causality. I would expect people to be familiar with free variables and algebra.

I for one would find explicit algebraic expressions much clearer than a bunch of meaningless numbers.

• Depends what you mean by “familiar”. I’d imagine anyone reading the essay can do algebra, but that they’re still likely to be more comfortable when presented with specific numbers. People are weird like that—we can learn general principles from examples more easily than from having the general principles explained to us explicitly.

Exceptions abound, obviously.

• There’s nothing about the tables that was not explained in the previous installment of this series; click the links if you’re still confused. I came to this knowing nothing about that type of notation, but the tables told me even more than the bubble diagrams—and here’s the secret: looking at the table by itself tells you next to nothing. It’s only when you think about the situations that the probabilities quantify that they make sense. As an additional step, he could have explained each of the situations in sentence form in a paragraph, but he probably felt the table spoke for itself.

The second table, for instance (if I am interpreting it correctly), can be paraphrased as:

I believe that my partner loves me, that the universe knows it, and that I can get this answer from the universe. I would also know if my partner didn’t love me, because the universe would know it and I would hear that. It’s probably one of those two. Of course, it could be that I don’t hear the universe, or the universe is lying to me, or the universe doesn’t magically pick up our thoughts (how unromantic!), but I really don’t believe that to be true; I only admit that it’s possible. I am rational, after all.

• I agree that if you don’t look at the numbers, but at the surrounding text, you get the sense that the numbers could be paraphrased that way.

So does h, labeled “I hear universe”, mean “I hear the universe tell me something at all”, or “I hear the universe tell me that they love me”, or “I hear the universe tell me what it knows, which (tacitly, according to the meaning of ‘knows’) is accurate”?

I thought it meant “I have a sensation as if the universe were telling me that they love me”, but the highest-probability scenarios are p&u&h and ¬p&u&h, which would suggest that regardless of their love, I’m likely to experience a sensation as if the universe were telling me that they love me. That seems reasonable from a skeptical viewpoint, but not from a believer’s viewpoint.

• Congratulations, you’ve cleared the hidden test of making sure that this isn’t all just a password in your head!

IMO, which one it was intended to be is irrelevant as long as you understand both cases. Understanding these things well enough to be able to untangle them like this really is the whole point of the article.

• I took h to mean “I accurately receive the information that the universe conveys”; in this case, regarding the state of my partner loving me or not, I would still accurately hear the universe—otherwise it would be not-h. Since I am considering possible states, partner-not-loving-me / universe-tells-me / me-hearing-that would be the second most likely possibility, because the other two variables are less in doubt (for the person in the example).

If this person existed in real life, they are probably frustrated, wondering why on earth it feels like their partner is trying to drive a wedge into the relationship, when obviously they are in love, because the universe can magically read their minds and the crystal auras never lie.

• Commenter HistoricalLing does have a point. Katsuki Sekida explains:

“Now, ‘Mu’ means ‘nothing’ and is the first koan in Zen. You might suppose that, as you sit saying ‘Mu’, you are investigating the meaning of nothingness. But that is quite incorrect. It is true that your teacher, who has instructed you to work on Mu, may repeatedly say to you, ‘What is Mu?’, ‘Show me Mu’, and so on, but he is not asking you to indulge in conceptual speculation. He wants you to experience Mu. And in order to do this, technically speaking, you have to take Mu simply as the sound of your own breath and entertain no other idea.”

In Zen practice, the purpose of a “koan” is to occupy the mind with a fruitless question (or in LW parlance, a wrong question). (Although “Mu” isn’t even a question!) This helps the meditator to maintain concentration, since by dwelling on a dead end like “What is the samadhi in particle after particle?” he isn’t distracted by the normal flux of flitting thoughts.

The student is still expected to provide an answer, eventually, but not one arrived at by rational thought—rather, it is supposed to strike him spontaneously. Of course, this isn’t a generally wise approach to answering questions; but if the Zen master were to tell his student that the koan can’t be answered, he might not take the exercise seriously. (I expect that Bayesians find it difficult to meditate using koans, since they are so keenly aware of wrong questions.)

A koan is a deliberately futile question, generally short and intended to obscure thought. To use this word also to refer to puzzles which are not skew to reality, and which are intended to be answered sensibly, is likely to cause bad inferences about the purpose of koans in Zen—and is jarring in this context!

• Suggest a better word? Keep in mind that words which are not better will be rejected (people often seem to forget this while making alternate suggestions).

• I think the division into problems and exercises usually seen in mathematics texts would be useful: a task is considered an exercise if it’s a routine application of previous material; it’s a problem if it requires some kind of insight or originality. So far most of the koans have seemed more like problems than like exercises, but depending on content both may be useful. I might be slightly biased towards this, as I greatly enjoy mathematics texts and am used to that style.

• “Problem” suggests something different in philosophy than in math. A philosophy “problem” is a seeming dilemma, e.g. Gettier, Newcomb’s, or Trolley. So I’d suggest “exercise” here.

“Exercise” dominates “kōan” in that both have the sense of something to stop and think about and try to solve, but ① “exercise” avoids the misconstrual of Zen practice (the purpose of a Zen kōan is not to come up with a solution, nor to set up for an explanation), ② the Orientalism (the dubiosity of saying something in Japanese to make it sound 20% cooler), and ③ the distraction of having to explain what a kōan is to those who don’t know the word.

EDIT: The claim that a purpose of a Zen kōan is not to come up with a solution appears to be a matter of disagreement, so discount ①. I think ② and ③ stand, though.

• the purpose of a Zen kōan is not to come up with a solution, nor to set up for an explanation

The account in the Wikipedia article says differently:

However, in Zen practice, a kōan is not meaningless, and not a riddle or a puzzle. Teachers do expect students to present an appropriate response when asked about a kōan.

According to the history of the word given there, it originally meant accounts of legal decisions (and literally, a magistrate’s bench). In Chinese Buddhism it came to refer to snippets of dialogue between masters. From there it mutated to the contemplation of mysterious sayings, and eventually to what looks very like an exercise in guessing the teacher’s password, with authorised answers that were specifically taught and had to be given to acquire promotion in the Japanese monastery system. (I have this book, which is subtitled “281 Zen Koans with Answers”.)

The modern meaning of “koan” dealt with in the section “Koan-practice” describes what looks very like Eliezer’s intention in using the word here: a problem that cannot be answered by merely applying known rules to new examples; a problem that begins by seeming impossible; a problem that cannot be solved without in the process learning something that one has not been taught.

Perhaps there is, somewhere, a better word, but I think “koan” will be hard to beat.

• So… the main thing I want to convey over and above “exercise” is that rather than there being a straightforward task to solve, you’re supposed to ponder the statement and ask, “What do I think of this?”

A word other than “koan” which conveys this intent-to-ponder would indeed be appreciated.

• What about “riddle” or “puzzle”?

• The only trouble I see is that “koan” makes it totally okay to think about it for a while without finding the answer, while “puzzle” might cause people to propose solutions.

• Given that most people seem likely to look at the koan, think “yeah, I could solve that if I thought about it for a while”, and then move on without actually thinking about it, anything that actually gets people to think about it seems like a good thing.

• The only trouble is if people then have to unthink things, which humans are notoriously bad at :P

• People have already been proposing solutions to the “koans”, and I don’t understand why that’s a bad thing.

• The goal is to apply those algorithms we call “rationality” towards solving the koan, one of which involves withholding solutions, even just mentally formulated ones, as much as possible, and instead just thinking properly about the elements and properties of the problem without subjecting oneself to hack heuristics.

The word puzzle is, for most people, loaded with a trained impulse to shoot out the first solution-sounding thing that pops to mind so that you can see whether you get a hedon / tribal status coin for a good answer or not.

• Alright. I see where you’re coming from, though I doubt that “puzzle” and “koan” have as many deep connotations as you claim.

Maybe the right thing to do is to actually write something to the effect of “Here is how you should be approaching these puzzles/koans”?

• What about “riddle” or “puzzle”?

“Puzzle” is good because it suggests that there is a solution, whereas some “problems” don’t have solutions, because they are simply confused.

• However, the trained behavior of most people when facing a puzzle is to look at it for a few seconds and then throw out the first good-sounding solution they can think of.

• Which isn’t necessarily a bad thing. Either they’ll get the right answer despite throwing the first possible solution at it, or they’ll widely miss the mark, in which case they might actually realize that they’ve learned something by the time the right answer is demonstrated.

• You have a point. My (subconscious) priors on that end are skewed towards “Never, ever throw out solutions before you’ve laid things out properly” because of lots and lots of little personal experiences with complete failure modes due to stopping with the first solution I found.

• I don’t think “noodle” is taken, yet.

• The word “pabulum” (from the Latin for “fodder”) was once used in English to mean “food for thought”. However, it (or “pablum”) is now more likely to denote insipid fare. We could reclaim the original meaning—in which case these statements-to-be-pondered are “pabula”.

• Consider adding “straightforward” exercises for the lesser mortals, and mark the harder ones (koans?) with stars, like the standard textbooks do.

• “Pondering exercise”, maybe?

Interesting that “pondering” is a cognitive skill that needs to be exercised. The term derives from a Latin word for “weight”. Perhaps this can be thought of as something analogous to barbells or dumbbells for epistemological strength-training.

• Perhaps this can be thought of as something analogous to barbells or dumbbells for epistemological strength-training.

I like the way you think. Care to elaborate?

• Pondering means thinking about something in a way that makes it “heavy” or difficult for the mind to process (just as heavy objects are difficult to lift). Like the metaphorical “burden of proof,” it references the mental difficulty of processing ideas to the physical difficulty of lifting objects. The way this happens involves increasing the complexity of your mental instantiation of an idea, thereby bringing more cognitive algorithms to bear on it.

The strength-training metaphor only works if it can be contrasted with endurance training. Otherwise it would just be a generic kind of training. Strength training involves short bursts of focused effort followed by a recovery period. These koans are short and intended for 5-15 minutes of focused thought, so they are probably more on that end of the spectrum than lengthier articles that describe complex concepts.

Epistemological endurance training (assuming there is such a thing) would be where you spend longer periods of time thinking about a problem that requires a fair degree of mental effort but is not overwhelming. That would analogize to running, biking, and so forth, where rather than doing the hardest thing you can do, you are doing something rather hard for a longer time.

• Oops, I miscommunicated. I think the surface analogy isn’t the most interesting part of this.

I was more interested in what ideas you had for training epistemological ability. The burst vs. endurance thing could be interesting if it could be detailed on its own terms (i.e. inside view instead of analogizing).

I’ve been thinking a lot about rationality training recently, so anything that looks like a possible exercise really catches my attention.

• So it must have been “pondering as a rationality skill” which got your attention. Sorry for misinterpreting. :)

For me it’s not hard to ponder. I do that naturally. But I don’t always ponder exactly what I’m told to ponder, even when I have every reason to think the person who told me to ponder something knows what they’re talking about, and that this is something that, if I ponder it, I will benefit from the resulting enlightenment. It’s like there is something in the nature of pondering that is perverse and rebellious (at least for the way my mind works, some of the time).

Perhaps a good exercise would be to deliberately ponder specific things that you aren’t (yet) naturally curious about. Maybe set a timer and commit to focus only on that particular topic until the timer goes off. I wonder what an optimal time length would be? Also, what kinds of topics could/should be used for the exercise?

• Whether it’s better or not for your purposes is of course your call, but as I said to chaosmosis above, I resolve this tension in my own mind by understanding “koan” as you use it to mean “exercise.”

Then again, I also replace all of your Japanese phrases in my head with their corresponding English.

I suspect this just reflects my not valuing a particular kind of myth-building very much in this context, so I just experience it as a mildly annoying distraction.
If you find it valuable, by all means continue with it.

• I resolve this tension in my own mind by understanding “koan” as you use it to mean “exercise.”

I do the same. I could find no deeper meaning in EY’s use of “koan”. Maybe I’m missing something.

I also replace all of your Japanese phrases in my head with their corresponding English.

Same here, except I have to look up this annoying pseudo-Japanese in-group slang almost every time. Is using it intended as some kind of status signaling?

• A koan is a deliberately futile question, generally short and intended to obscure thought. To use this word also to refer to puzzles which are not skew to reality and which are intended to be answered sensibly, is likely to cause bad inferences about the purpose of koans in Zen—and is jarring in this context!

I don’t think repurposing the word ‘koan’ is that terrible. We are not going to do Zen koans in this context, and I would not be surprised to find that many here are more familiar with things such as Ruby koans.

Also, there is some disagreement about the meaning and use of koans—Zen (and Chan, Seon) Buddhism has many flavors. Notably, historically koans (and the Chinese sayings they were based on) did not necessarily have the character you attribute to them above; they were originally just teachings passed down in the form of sayings.

• The origins of the word aren’t very relevant to its current meaning; almost no one on this site would have known those origins before now, and so those origins don’t have much influence on the way we think about the word now. The standard understanding of koans that dominates pretty much everywhere is in line with what Doriana quotes.

Using the word koan is inaccurate. I think Yudkowsky is either trying to associate feelings of mystic power with rationality, or to attack feelings of mystic power by setting up expectations and then destroying those; I don’t have any idea which. But it somewhat annoys me. It’s not a huge deal, but it’s annoying.

I’m all for repurposing words, but only if there’s a decent justification to do so, and I don’t see one here.

• Using the word koan is inaccurate. I think Yudkowsky is either trying to associate feelings of mystic power with rationality, or to attack feelings of mystic power by setting up expectations and then destroying those; I don’t have any idea which. But it somewhat annoys me. It’s not a huge deal, but it’s annoying.

The first of those two hypotheses, but yes, it’s annoying and jarring. I had kind of hoped Eliezer got the mystic Zen martial arts nonsense out of his system years ago and could start talking plain sense now.

• I like the mystic Zen martial arts nonsense. Looks like it’s time for a poll.

Eliezer’s mystic Zen martial arts nonsense is...

[pollid:182]

• I voted “Don’t care”, whereas in reality it’s more that I like things like the cult koans and Tsuyoku Naritai, but find the current use of “koan” so-so (I like the questions; the term “koan” is a bit jarring, but I can get used to it).

• I find it super obnoxious, in exactly the same way I felt when my martial arts teachers talked about using my dantian to focus my chi instead of breathing with my diaphragm or whatever is actually useful.

• In general the “mystic Zen martial arts nonsense” is a nice antidote to the Straw Vulcan stereotype.

That’s no excuse for misusing a word in this specific instance, though.

• The problem with regular theory exposition is that we don’t have a good theoretical framework for discussing how to put theory into practice, so the difficult-to-express parts about applying the theory just get omitted. I like the martial arts nonsense insofar as it connotes an intention that you are supposed to actually put the subject matter to use and win with it, in addition to just appreciating the theory. Since we don’t know how to express general instructions for putting theory into practice very well in plain speech, some evocative mysticism may be the best we can do.

• I don’t always dislike it. “I must become stronger” benefited from the approach. I dislike this specific instance because it’s jarring, doesn’t fit the context, and is a misuse of the word “koan”.

• The origins of the word aren’t very relevant to its current meaning

If you’ll allow me to take this a bit out of context, please think of typical Zen usage as the “origins of the word” and usage in this sequence of posts as “its current meaning.”

The difference is obvious, of course—you know what the word means, and anything else is wrong. Which is totally fine. I just wanted to point out that if you try to make your conclusions universal or absolute here, you will in fact create more relativism—the solution is to claim the non-universal knowledge of how words should be used if you’re the audience.

• The standard understanding of koans that dominates pretty much everywhere is in line with what Doriana quotes.

I disagree. I would predict that most people have no idea what “koan” means, that those who have seriously studied Buddhism are aware of the controversy, and that a significant mass of people (especially represented in this demographic) are more familiar with the use of “koan” in programming, as with Ruby koans.

The concern seems to be that those who haven’t actually studied varieties of Buddhism but are somehow aware of the word “koan” might be confused—but the word is clearly defined before its first use in this sequence:

(A ‘koan’ is a puzzle that the reader is meant to attempt to solve before continuing. It’s my somewhat awkward attempt to reflect the research which shows that you’re much more likely to remember a fact or solution if you try to solve the problem yourself before reading the solution; succeed or fail, the important thing is to have tried first. This also reflects a problem Michael Vassar thinks is occurring, which is that since LW posts often sound obvious in retrospect, it’s hard for people to visualize the diff between ‘before’ and ‘after’; and this diff is also useful to have for learning purposes. So please try to say your own answer to the koan—ideally whispering it to yourself, or moving your lips as you pretend to say it, so as to make sure it’s fully explicit and available for memory—before continuing; and try to consciously note the difference between your reply and the post’s reply, including any extra details present or missing, without trying to minimize or maximize the difference.)

• When I google “koan”, the first result is Wikipedia, which says a koan is “a story, dialogue, question, or statement, which is used in Zen practice to provoke the ‘great doubt’, and test the student’s progress in Zen practice”. Very Zen; that supports my side. The second result is Merriam-Webster’s dictionary, which says a koan is “a paradox to be meditated upon that is used to train Zen Buddhist monks to abandon ultimate dependence on reason”. My side. The third result is for a page titled “101 Zen Koans”, which again supports my belief.

Eliezer has a history of associating mysticism with rationality, as well.

My personal concern is that using words wrongly is annoying because I don’t like people mucking up my conceptual spaces. I can’t disassociate koans from mysticism and riddles, which makes it awkward and aesthetically unpleasing for me to approach problems of rationality from a “koan”.

That said, it’s probably too late to change the format of the problems in this current sequence. But I’d like it to never happen again after this gets done.

• I suspect it will continue to happen. Invoking the cultural trappings of a certain kind of mysticism while discussing traditionally “rational” topics is, as you note, a popular practice… and not only of Eliezer’s.

I recommend treating the word “koan” as used here as a fancy way of saying “exercise”.

• And then we realize that the use of the word ‘koan’ was not entirely serious, and get on with the sequence.

Also, note the side effect of that karma penalty—responding to things without organizing the post appropriately. Whee.

(Note to self: check when I loaded the page before commenting.)

• Clearly, using 1-0-1-0-1-0 on a list of patients in alphabetical order isn’t random enough… or is it?

It’s not, if only because the people implementing it can guess it: a textbook I read on medical trials mentioned that this procedure was once used, and it led to tampering where doctors would send the patients they liked better, or who were sicker, or whatever, to the ‘right’ trial arm.

• it led to tampering where doctors would send the patients they liked better, or who were sicker, or whatever, to the ‘right’ trial arm.

So they changed the person’s name, or what?

• Something like that. There are a lot of ways to tamper with this: participation is voluntary, of course, so if a patient would ‘benefit’ from being randomized to the ‘right’ arm, you’d encourage them to do it, while if they weren’t, you’d encourage them to drop out (and maybe get the tested treatment themselves!). You’d filter the list in the first place, or use alternative names (my legal first name starts with one letter, but I always go by a version of my middle name which starts with a different letter: which version does the doctor write down?). And so on.

One interesting example, from a retrospective:

It took only a few months to accumulate the required experience in the two hospitals (Reese et al. 1952). Allocation to ACTH or no ACTH was decided by drawing marbles from a jar containing an equal number of white and blue marbles: one morning, when a new infant became eligible for enrollment, I noticed that our head nurse shook the jar vigorously, turned her head away, pulled a marble out (just as she had been instructed); but because she didn’t like the ‘assignment’, she put the marble back, shook the jar again, and pulled out the color that agreed with her bias! The importance of Bradford Hill’s precaution in Britain’s famous streptomycin trial to conceal the order of assignment in sealed envelopes was immediately obvious!
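The predictability problem with 1-0-1-0 allocation can be made concrete with a small sketch (the function names, and the blocked-shuffle stand-in for sealed-envelope concealment, are my own illustration, not anything from the trial literature):

```python
import random

def alternating_allocation(n):
    # 1-0-1-0 assignment down an alphabetical list: anyone who knows
    # the scheme can predict every patient's arm from list position.
    return [i % 2 for i in range(n)]

def concealed_allocation(n, seed=0):
    # Pre-generated balanced randomization, loosely analogous to
    # Bradford Hill's sealed envelopes: equal arms, shuffled once,
    # with the order concealed from the people enrolling patients.
    arms = [0, 1] * (n // 2)
    random.Random(seed).shuffle(arms)
    return arms

# The alternating scheme is perfectly predictable from position alone:
assert all(arm == i % 2 for i, arm in enumerate(alternating_allocation(10)))
```

Anyone who can see a patient's position on the list can game the first scheme; the second can only be subverted by someone who learns the concealed assignments in advance.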

• thanks!

• You seem to be exaggerating the generality of the causal Markov condition (CMC) when you say it is deeper and more general than the second law of thermodynamics. In a big world, failures of the CMC abound. Let’s say the correlation between the psychic cousin’s predictions and the top card of the deck is explained by the person performing the test being a stooge, who is giving some non-verbal indication to the purported psychic about the top card. So here we have a causal explanation of the correlation, as the CMC would lead us to expect. But since we are in a big world, there are a massive number of Boltzmann brains out there, outside our light cone, whose brain states correlate with the top card in the same way that the cousin’s does. But there is no causal explanation for this correlation; it’s just the kind of thing one would expect to happen, even non-causally, in a sufficiently large world. So the CMC isn’t a universal truth.

Now, the CMC is a remarkably accurate rule if we restrict it to our local environment. But it’s pretty plausible that this is just because our local environment is monotonically entropy-increasing towards the future and entropy-decreasing towards the past. Because of this feature of our environment, local interventions produce correlations that propagate out spatially towards the future, but not towards the past. When you drop a rock into a pond, waves originate at the point the rock hit the water and travel outwards towards the future, eventually producing spatially distant correlations (like fish at either end of the pond being disturbed from their slumber).

Imagine that there is a patch somewhere in the trackless immensity of spacetime that looks exactly like our local environment, but time-reversed. Here we would have a pond with a rock initially lying at its bottom. Spontaneously, the edges of the pond fluctuate so as to produce a coherent inward-directed wave, which closes in on the rock, transferring to it sufficient energy to make it shoot out of the pond. If you don’t allow backward causation, then it seems that the initial correlated fluctuation that produced the coherent wave has no causal explanation, a violation of the CMC.

The second law is often read as a claim about the condition of the early universe (or some patch of the universe), specifically that there were no correlations between different degrees of freedom (such as the positions and velocities of particles) except for those imposed by the macroscopic state. There were no sneaky microscopic correlations that could later produce macroscopic consequences (see this paper). Entropy increase follows from that, the story goes, and, plausibly, the success of the CMC follows from that as well. There is a strong case to be made that the second law is prior to the CMC in the order of explanation.

• I have doubts about how meaningful it is to talk of correlating things that are outside each other’s light cones.

Besides that, suppose there really are an astronomical number of Boltzmann brains that you could say are non-causally correlated with the top card of a particular deck of cards. Calling this a failure of the Causal Markov Condition is begging the question, because the only thing identifying this set is selection based on the correlation itself. The set you should consider, of all Boltzmann brains that you could test for correspondence with the top card, will not be correlated with it at all.

Entropy increase follows from that, the story goes...

Follows from it causally, like? :)

• I have doubts about how meaningful it is to talk of correlating things that are outside each other’s light cones.

I don’t see why you would have these doubts. Whether or not two variables are correlated is a purely mathematical condition. Why do you think it matters where in space-time the physical properties those variables describe are instantiated?

Besides that, suppose there really are an astronomical number of Boltzmann brains that you could say are non-causally correlated with the top card of a particular deck of cards. Calling this a failure of the Causal Markov Condition is begging the question, because the only thing identifying this set is selection based on the correlation itself. The set you should consider, of all Boltzmann brains that you could test for correspondence with the top card, will not be correlated with it at all.

Wait, why is the relevant reference class the class of all and only Boltzmann brains? It seems more natural to pick a reference class that includes all brains (or brain-states). But in that case, the probabilities of the Boltzmann brain being in the states that it is in will be exactly the same as the probabilities of the psychic cousin being in the states that he is in (since the states are the same by hypothesis), so if the psychic’s brain states are correlated with the top card, the BB’s will be as well.

Follows from it causally, like? :)

Sure, if you want. I’m not denying here that causality is prior to the second law. I’m denying that the causal Markov condition is prior to the second law.

• OK. Wrt the light cones, I was posting without my brain switched on. Obviously two events can be outside each other’s light cones and yet a correlation between them can still be observed where their light cones overlap in the future. I was thinking fairly unclearly about whether you could be in an epistemic state to consider correlation between things outside your own light cone, but this is kind of irrelevant, so please disregard.

the probabilities of the Boltzmann brain being in the states that it is in will be exactly the same as the probabilities of the psychic cousin being in the states that he is in (since the states are the same by hypothesis)

Just because the states are the same doesn’t mean the probability of being in that state is the same. It’s only meaningful to discuss the probability of an outcome in terms of a probability distribution over possible outcomes. If you pick a set of conditions such as “Boltzmann brains in the same state as that of the psychic cousin”, you are creating the hypothetical correlation yourself by the way you define it. To my mind, that’s not a thought experiment that can tell you anything.

• Just because the states are the same doesn’t mean the probability of being in that state is the same. It’s only meaningful to discuss the probability of an outcome in terms of a probability distribution over possible outcomes.

In my example, I specified that the BB is in a reference class with all other brains, including the psychic cousin’s. Given that they are both in the reference class, the fact that the BB and the cousin share the same cognitive history implies that the probabilities of their cognitive histories relative to this reference class are the same. The reference class is what fixes the probability distribution over possible outcomes if you’re determining probabilities by relative frequencies, and if they are in the same reference class, they will have the same probability distribution.

I suspect Eliezer was thinking of a different probability distribution over brain states when he said the psychic’s brain state is correlated with the deck of cards. The probabilities he is referring to are something like the relative frequencies of brain states (or brain-state types) in a single observer’s cognitive history (ETA: or perhaps more accurately for a Bayesian, the probabilities you get when you conditionalize some reasonable prior on the sequence of instantiated brain states). Even using this distribution, the BB’s brain state will be correlated with the top card.

• Even if the BB and the psychic are in causally disconnected parts of your model, their having the same probability of being correlated with the card doesn’t imply that the Causal Markov Condition is broken. In order to show that, you would need to specify all of the parent nodes of the BB in your model, calculate the probability of it being correlated with the card, and then see whether having knowledge of the psychic would change your probability for the BB. Since all physics currently is local in nature, I can’t think of anything that would imply this is the case if the psychic is outside of the past light cone of the BB. Larger boundary conditions on the universe as a whole that may or may not make them correlate have no effect on whether the CMC holds.

• I’m having trouble parsing this comment. You seem to be granting that the BB’s state is correlated with the top card (I’m assuming this is what you mean by “having the same probability”), that there is no direct causal link between the BB and the psychic, and that there are no common causes, but saying that this still doesn’t necessarily violate the CMC. Am I interpreting you right? If I’m not, could you tell me which one of those premises does not hold in my example?

If I am interpreting you correctly, then you are wrong. The CMC entails that if X and Y are correlated, X is not a cause of Y, and Y is not a cause of X, then there are common causes of X and Y such that the variables are independent conditional on those common causes.
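That entailment can be illustrated with a small simulation (a sketch with made-up probabilities, not anything from the thread: hypothetical binary variables X and Y driven only by a common cause Z, with no arrow between them):

```python
import random

# Toy common-cause structure: Z -> X and Z -> Y, no arrow between X and Y.
rng = random.Random(42)
samples = []
for _ in range(20000):
    z = rng.random() < 0.5
    x = rng.random() < (0.9 if z else 0.1)
    y = rng.random() < (0.9 if z else 0.1)
    samples.append((z, x, y))

def prob(pred, pop):
    # Relative frequency of pred within the population pop.
    return sum(1 for s in pop if pred(s)) / len(pop)

# Marginally, X and Y are correlated: P(X,Y) exceeds P(X)P(Y).
p_x = prob(lambda s: s[1], samples)
p_y = prob(lambda s: s[2], samples)
p_xy = prob(lambda s: s[1] and s[2], samples)

# Conditional on the common cause Z, the correlation is screened off:
# P(X,Y|Z) is approximately P(X|Z)P(Y|Z).
z_true = [s for s in samples if s[0]]
p_x_z = prob(lambda s: s[1], z_true)
p_y_z = prob(lambda s: s[2], z_true)
p_xy_z = prob(lambda s: s[1] and s[2], z_true)
```

The marginal joint frequency comes out well above the product of the marginals, while the conditional joint frequency matches the product of the conditionals up to sampling noise, which is exactly the screening-off the CMC demands of a common cause.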

• The CMC is not strictly vi­o­lated in physics as far as we know. If you spec­ify the state of the uni­verse for the en­tire past light cone of some event, then you uniquely spec­ify the event. The ex­am­ple that you gave of the rock shoot­ing out of the pond in­deed does not vi­o­late the laws of physics- you sim­ply shoved the causal­ity un­der the rug by claiming that the edge of the pond fluc­tu­ated “spon­ta­neously”. This is not true. The edge of the pond fluc­tu­at­ing was com­pletely speci­fied by the past light cone of that event. This is the sense in which the CMC runs deeper than the 2nd law of ther­mo­dy­nam­ics- be­cause the 2nd “law” is prob­a­bil­is­tic, you can find coun­terex­am­ples to it in an in­finite uni­verse. If you ac­tu­ally found a coun­terex­am­ple to the CMC, it would make physics es­sen­tially im­pos­si­ble.

• I meant “spontaneous” in the ordinary thermodynamic sense of spontaneity (like when we say systems spontaneously equilibrate, or that spontaneous fluctuations occur in thermodynamic systems), so no violation of microphysical law was intended. Spontaneous here just means there is no discernible macroscopic cause of the event. Now it is true that everything that happened in the scenario I described was microscopically determined by physical law, but this is not enough to satisfy the CMC. What we need is some common cause account of the macroscopic correlation that leads to a coherent inward-directed wave, and simply specifying that the process is law-governed does not provide such an account. I guess you could just say that the common cause is the initial conditions of the universe, or something like that. If that kind of move is allowed, then the CMC is trivially satisfied for every correlation. But when people usually appeal to the CMC they intend something stronger than this. They're usually talking about a spatially localized cause, not an entire spatial hypersurface.

If you allow entire hypersurfaces as nodes in your graph, you run into trouble. In a deterministic world, any correlation between two properties isn't just screened off by the contents of past hypersurfaces, it's also screened off by the contents of future hypersurfaces. But a future hypersurface can't be a common cause of the correlated properties, so we have a correlation screened off by a node that doesn't d-separate the correlated variables. This doesn't violate the CMC per se, but it does violate the Faithfulness Condition, which says that the only conditional independencies in nature are the ones described by the CMC. If the Faithfulness Condition fails, then the CMC becomes pretty useless as a tool for discerning causation from correlation. The lessons of Eliezer's posts would no longer apply. So to rule out radical failure of the Faithfulness Condition in a deterministic setting, we have to disallow the contents of an entire hypersurface from being treated as a single node in a causal graph. Nodes should correspond to sufficiently locally instantiated properties. But then that re-opens the possibility that the correlation described in my example violates the CMC. There is no locally instantiated common cause.

If there is some past screener-off of the correlation in the time-reversed patch, its counterpart would also be a future screener-off of the correlation in our patch. If we want to say that the Faithfulness Condition holds in our patch (or at least in this example), we have to rule out future screeners-off, but that also implies that the CMC fails in the time-reversed patch.

• Indexically, though, you wouldn't expect to be talking to a mind that just happened to issue something it called predictions, which just happened to be correlated with some unobserved cards, would you? I think the CMC doesn't say that a mind can never be right without being causally entangled with the system it's trying to be right about; just that if it is right, it's down to pure chance.

• I think the CMC doesn't say that a mind can never be right without being causally entangled with the system it's trying to be right about; just that if it is right, it's down to pure chance.

No, the CMC says that if you conditionalize on all of the direct causes of some variable A in some set of variables, then A will be probabilistically independent of all other variables in that set except its effects. This rules out chance correlation. If there were some other variable in the set that just happened to be correlated with A without any causal explanation, then conditionalizing on A's direct causes would not in general eliminate this correlation.
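A minimal numerical sketch of that claim (a made-up three-variable model, not anything from the thread): here B is the only direct cause of both A and C, so A and C are correlated unconditionally, but conditioning on A's direct cause B eliminates the correlation entirely.

```python
from itertools import product

# Hypothetical model: B is a direct cause of both A and C.
# A = B XOR Na, C = B XOR Nc, with independent noise bits Na, Nc.
p_b, p_na, p_nc = 0.5, 0.2, 0.3

def joint():
    """Enumerate the exact joint distribution over (A, B, C)."""
    dist = {}
    for b, na, nc in product([0, 1], repeat=3):
        a, c = b ^ na, b ^ nc
        prob = ((p_b if b else 1 - p_b)
                * (p_na if na else 1 - p_na)
                * (p_nc if nc else 1 - p_nc))
        dist[(a, b, c)] = dist.get((a, b, c), 0.0) + prob
    return dist

dist = joint()

def p(pred):
    """Probability of the event picked out by pred(a, b, c)."""
    return sum(q for k, q in dist.items() if pred(*k))

# Unconditionally, A and C are correlated (common cause B):
pa = p(lambda a, b, c: a == 1)
pc = p(lambda a, b, c: c == 1)
pac = p(lambda a, b, c: a == 1 and c == 1)   # exceeds pa * pc

# Conditioned on A's direct cause B, A and C become independent:
pb = p(lambda a, b, c: b == 1)
pa_b = p(lambda a, b, c: a == 1 and b == 1) / pb
pc_b = p(lambda a, b, c: c == 1 and b == 1) / pb
pac_b = p(lambda a, b, c: a == 1 and c == 1 and b == 1) / pb
```

With these made-up numbers, P(A, C) = 0.31 against P(A)P(C) = 0.25, while given B the joint factorizes exactly, which is just the CMC's "independent of non-effects given direct causes" in miniature.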

• If coincidences were a violation of the CMC, it wouldn't be a truth at all, would it?

• Well, one could still say it was true in certain environments, or true like the Ideal Gas Law is true.

• I am really enjoying these causality posts. Thank you for them and for the skillful writing that makes them so readable.

• Any time there's a noun, a verb, and a subject[sic], there's causality.

Counterexamples: “I know this.” “Rational people with the same information cannot reasonably disagree about their conclusions.” “General and Special Relativity both require that observers in different reference frames measure the length of an artifact differently.”

I think what you might have meant is “Any time that a concrete subject takes an action with a direct object, there's causality”; there's probably a more general form.

I know that the top card is either the six of spades or some variant of ‘rules of poker/ranking of poker hands’. There is no ‘because’ in that sentence, because ‘because’ is a word that only has meaning in causal terms. Go ahead: test me with the nearest deck of cards.

• Um, let's see if I get this (thinking to myself but posting here if anyone happens to find this useful, or even intelligible)...

claiming you know about X without X affecting you, you affecting X, or X and your belief having a common cause, violates the Markov condition on causal graphs

The causal Markov condition is that a phenomenon is independent of its noneffects, given its direct causes. It is equivalent to the ordinary Markov condition for Bayesian nets (any node in a network is conditionally independent of its nondescendants, given its parents) when the structure of a Bayesian network accurately depicts causality.

So, this condition induces certain (conditional) independencies between nodes in a causal graph (which can be found using the d-separation trick), and when we find two such nodes, they must also be uncorrelated (this follows from probabilistic independence being a stronger property than uncorrelatedness).

If one therefore claims there's a persistent correlation between X and belief about X, this means there's got to be some active path in the Bayesian network for probabilistic influence to flow between them; otherwise, X and Belief(X) would be d-separated and thereby independent and uncorrelated. Insisting there's no such path (e.g. no chain of directed links) leads to a violation of the Markov condition, since it maintains there's probabilistic dependence between two nodes in a graph that cannot be accounted for by the causal links currently in the graph.
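A toy check of that last step (hypothetical two-node network, made-up numbers): if there is no path at all between X and Belief(X), the joint distribution factorizes into a product, and the covariance comes out exactly zero, so d-separation really does force uncorrelatedness.

```python
from itertools import product

# Hypothetical minimal network: X and Belief(X) are both root nodes
# with no connecting path, so the joint is p(x) * p(belief).
p_x = {0: 0.4, 1: 0.6}
p_belief = {0: 0.7, 1: 0.3}

joint = {(x, b): p_x[x] * p_belief[b] for x, b in product([0, 1], repeat=2)}

# Covariance of X and Belief(X) under this factorized joint:
ex = sum(x * p for (x, b), p in joint.items())
eb = sum(b * p for (x, b), p in joint.items())
exb = sum(x * b * p for (x, b), p in joint.items())
cov = exb - ex * eb  # zero: d-separated nodes cannot be correlated
```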

• More generally, for me to expect your beliefs to correlate with reality, I have to either think that reality is the cause of your beliefs, expect your beliefs to alter reality, or believe that some third factor is influencing both of them.

I can construct examples where for this to be true requires us to treat mathematical truths as causes. Of course, this causes problems for the Bayesian definition of “cause”.

• Yes. An argument similar to this should still be in the other-edited version of my unfinished TDT paper, involving a calculator on Venus and a calculator on Mars, the point being that if you're not logically omniscient then you need to factor out logical uncertainty for the Markov property to hold over your causal graphs, because physically speaking, all common causes should've been screened off by observing the calculators' initial physical states on Earth. Of course, it doesn't follow that we have to factor out logical uncertainty as a causal node that works like every other causal node, but we've got to factor it out somehow.

• My point is more general than this. Namely, that a calculator on Earth and a calculator made by aliens in the Andromeda galaxy would correspond despite humans and the Andromedeans never having had any contact.

• Of course, it doesn't follow that we have to factor out logical uncertainty as a causal node that works like every other causal node

Is there some reason not to treat logical stuff as normal causal nodes? Does that cause us actual trouble, or is it just a bit confusing sometimes?

• In causal models, we can have A → B, E → A, E → ~B. Logical uncertainty does not seem offhand to have the same structure as causal uncertainty.

• You seem to be confusing the causal arrow with the logical arrow. As endoself points out here, proofs logically imply their theorems, but a theorem causes its proof.

• Mathematical truths do behave like causes. Remember, Bayesian probabilities represent subjective uncertainty. Yes, my uncertainty about the Riemann hypothesis is correlated with my uncertainty about other mathematical facts in the same way that my uncertainty about some physical facts is correlated with my uncertainty about others, so I can represent them both as Bayesian networks (really, one big Bayesian network, as my uncertainty about math is also correlated with my uncertainty about the world).

• Can you provide an example? I would claim that for any model in which you have a mathematical truth as a node in a causal graph, you can replace that node by whatever series of physical events caused you to believe that mathematical truth.

• I add 387+875 to get 1262; from this I can conclude that anyone else doing the same computation will get the same answer despite never having interacted with them.

• You can't conclude that unless you are aware of the contingent fact that they are capable of getting the answer right.

• “The same computation” doesn't cover that?

• Why would you want a mathematical truth on a causal graph? Are the transition probabilities ever going to be less than 1.0?

• The transition probabilities from the mathematical truth to something non-mathematical will certainly be less than 1.0.

• And the transition probabilities to a truth will be 1.0. So why write it in? It would be like sprinkling a circuit diagram with zero-ohm resistors.

• Because otherwise the statement I quoted in the great-great-grandparent becomes false.

• Inasmuch as you have stipulated that “performing the same calculation” means “performing the same calculation correctly”, rather than something like “launching the same algorithm but possibly crashing”, your statement is tautologous. In fact, it is a special case of the general statement that anyone successfully performing a calculation will get the same result as everyone else. But why would you want to use a causal diagram to represent a tautology? The two have different properties. Causal diagrams have <1.0 transition probabilities, which tautologies don't. Tautologies have conceptually intelligible relationships between their parts, which causal diagrams don't.

• Observe that your two objections cancel each other out. If someone performs the same calculation, there is a significant (but <1.0) chance that it will be done correctly.

• What has that to do with mathematical truth? You might as well say that if someone follows the same recipe there is a significant chance that the same dish will be produced. Inasmuch as you are talking about something that can haphazardly fail, you are not talking about mathematical truth.

• I can predict what someone else will conclude, without any causal relationship, in the conventional sense, between us.

• Your prediction is a prediction of what someone else will conclude, given a set of initial conditions (the mathematical problem) and a set of rules to apply to these conditions. The conclusion that you arrive at is a causal descendant of the problem and the rules of mathematics; the conclusion that the other person arrives at is a causal descendant of the same initial problem and the same rules.

That's the causal link.

• That's my point. Specifically, that one should have nodes in one's causal diagram for mathematical truths, what you called “rules of mathematics”.

• Surely the node should be “person X was taught basic mathematics”, and not mathematics itself?

• The point of having the node is to have a common cause of person X's beliefs about mathematics and person Y's beliefs about mathematics that explains why these two beliefs are correlated even if both discovered said mathematics independently.

• What has that to do with any causal powers of mathematical truth?

• If you want your causal graph to have the property I quoted here, you need to add nodes for mathematical truths.

• Two people can arrive at the same solution to a crossword, but that does not mean there is a Cruciverbal Truth that has causal powers.

• Yes it does. In this case said truth even has a physical manifestation, i.e., the crossword-writer's solution as it exists in some combination of his head and his notes, which is causal to the form of the crossword the solver sees.

• It only has a physical manifestation. Cruciverbal Truth only summarises what could have been arrived at by a massively fine-grained examination of the crossword-solver's neurology. It doesn't have causal powers of its own. It's redundant in relation to physics.

• To answer your discussion about randomizing the control groups and experimental groups: you don't use randomness or noise to divide those groups. You divide the population for study into the number of groups you need, and make that division such that those groups are as close to identical as possible, using all of the data you have on all of them.

Thermal noise and pseudo-random numbers can be used to break ties, but only because if there were any known distinction between the two outcomes, the classification would be deterministic.
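A sketch of the kind of procedure being described (hypothetical subjects and a single made-up covariate, for illustration only): rank everyone on what you know about them, then split the ranking deterministically, with pseudo-random noise deciding nothing except the ordering of exact ties.

```python
import random

# Hypothetical data: (subject id, covariate such as baseline weight).
subjects = [("s1", 7.0), ("s2", 3.0), ("s3", 7.0),
            ("s4", 1.0), ("s5", 5.0), ("s6", 3.0)]

rng = random.Random(0)  # seeded: the noise only orders exact ties

# Sort by covariate; the random secondary key matters only when two
# subjects are indistinguishable on the recorded data.
ranked = sorted(subjects, key=lambda s: (s[1], rng.random()))

# Deterministically alternate down the ranking so each group samples
# the whole range of the covariate.
group_a = [name for i, (name, _) in enumerate(ranked) if i % 2 == 0]
group_b = [name for i, (name, _) in enumerate(ranked) if i % 2 == 1]
```

In the comment's terms, the classification is deterministic given the data; the seeded noise only matters when two subjects are literally identical on everything recorded.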

• “universe is a connected fabric of causes and effects.”

I do not think that the universe as a whole is one fabric of causes and effects. There are isolating layers of randomness and chaos upon which there are new layers of emergence. This is why we can model at all without having one unified model.

“Every causally separated group of events would essentially be its own reality.”

Places outside our solar system are their own realities in that sense. We have no effect there, unless perhaps someone is there to amplify our radio signals.

• Having spent a regrettably large amount of time on forums where the ‘magisteria’ type questions were had, I think that you're representing the ‘outside of science’ position slightly unfairly. Obviously, it often tries to have its cake and eat it. But you're substituting ‘standard rationality’, or perhaps ‘questions of cause and effect’, for ‘science’. Some magisteria-types would say that there are direct causal effects from God or ghosts, but that these do not manifest with the regularity of things that you're likely to be able to find through scientific experiment. They think that the world is better explained by including God or ghosts, but that you can't devise an experiment to prove/disprove them (for a variety of reasons, up to and including ‘the ghosts don't come out when you're trying to test if they exist’).

This is aside from the people who basically mean that their religion or whatever is just subjective.

• He discusses that distinction here.

• Previous koan:

“You say that a universe is a connected fabric of causes and effects. Well, that's a very Western viewpoint—that it's all about mechanistic, deterministic stuff. I agree that anything else is outside the realm of science, but it can still be real, you know. My cousin is psychic—if you draw a card from his deck of cards, he can tell you the name of your card before he looks at it. There's no mechanism for it—it's not a causal thing that scientists could study—he just does it. Same thing when I commune on a deep level with the entire universe in order to realize that my partner truly loves me. I agree that purely spiritual phenomena are outside the realm of causal processes that can be studied by experiments, but I don't agree that they can't be real.”

that's pretty much the worst koan I've ever heard

• You're writing this instead of Harry Potter fanfic? Sigh.

• This is actual day-job stuff.

• I was under the impression that the HPMoR story was to entice people to become “more rational”, that is, get them to read more of the “day-job” stuff. There was also supposed to be an actual book on rationality, but it looks like that's been put on hold as well. Which to me seemed like a wise decision, since more people were being led to simply read the sequences via HPMoR already, so why bother with a book?

• Which to me seemed like a wise decision, since more people were being led to simply read the sequences via HPMoR already, so why bother with a book?

What evidence do you have for this? I recall some stats from the last census which indicated that LWers referred here by HPMoR were less likely to have read the sequences and be active participants than the general population.

• Try reading through Mysterious Answers to Mysterious Questions. You might actually find yourself enjoying it!