# Beyond Statistics 101

## Is statis­tics be­yond in­tro­duc­tory statis­tics im­por­tant for gen­eral rea­son­ing?

Ideas such as re­gres­sion to the mean, that cor­re­la­tion does not im­ply cau­sa­tion and base rate fal­lacy are very im­por­tant for rea­son­ing about the world in gen­eral. One gets these from a deep un­der­stand­ing of statis­tics 101, and the ba­sics of the Bayesian statis­ti­cal paradigm. Up un­til one year ago, I was un­der the im­pres­sion that more ad­vanced statis­tics is tech­ni­cal elab­o­ra­tion that doesn’t offer ma­jor ad­di­tional in­sights into think­ing about the world in gen­eral.

Nothing could be further from the truth: ideas from advanced statistics are essential for reasoning about the world, even on a day-to-day level. In hindsight my prior belief seems very naive – as far as I can tell, my only reason for holding it is that I hadn’t heard anyone say otherwise. But I hadn’t actually looked at advanced statistics to see whether or not my impression was justified :D.

Since then, I’ve learned some advanced statistics and machine learning, and the ideas that I’ve learned have radically altered my worldview. The “official” prerequisites for this material are calculus, multivariable differential calculus, and linear algebra. But one doesn’t actually need to have detailed knowledge of these to understand ideas from advanced statistics well enough to benefit from them. The problem is pedagogical: I need to figure out how to communicate them in an accessible way.

## Ad­vanced statis­tics en­ables one to reach nonob­vi­ous conclusions

To give a bird’s eye view of the per­spec­tive that I’ve ar­rived at, in prac­tice, the ideas from “ba­sic” statis­tics are gen­er­ally use­ful pri­mar­ily for dis­prov­ing hy­pothe­ses. This pushes in the di­rec­tion of a state of rad­i­cal ag­nos­ti­cism: the idea that one can’t re­ally know any­thing for sure about lots of im­por­tant ques­tions. More ad­vanced statis­tics en­ables one to be­come jus­tifi­ably con­fi­dent in nonob­vi­ous con­clu­sions, of­ten even in the ab­sence of for­mal ev­i­dence com­ing from the stan­dard sci­en­tific prac­tice.

## IQ re­search and PCA as a case study

In the early 20th century, the psychologist and statistician Charles Spearman discovered the g-factor, which is what IQ tests are designed to measure. The g-factor is one of the most powerful constructs that’s come out of psychology research. There are many factors that played a role in enabling Bill Gates’s ability to save perhaps millions of lives, but one of the most salient factors is his IQ being in the top ~1% of his class at Harvard. IQ research helped the Gates Foundation to recognize iodine supplementation as a nutritional intervention that would improve socioeconomic prospects for children in the developing world.

The work of Spearman and his successors on IQ constitutes one of the pinnacles of achievement in the social sciences. But while Spearman’s discovery of IQ was a great discovery, it wasn’t his greatest discovery. His greatest discovery was a discovery about how to do social science research. He pioneered the use of factor analysis, a close relative of principal component analysis (PCA).

## The philos­o­phy of di­men­sion­al­ity reduction

PCA is a di­men­sion­al­ity re­duc­tion method. Real world data of­ten has the sur­pris­ing prop­erty of “di­men­sion­al­ity re­duc­tion”: a small num­ber of la­tent vari­ables ex­plain a large frac­tion of the var­i­ance in data.

This is related to the effectiveness of Occam’s razor: it turns out to be possible to describe a surprisingly large amount of what we see around us in terms of a small number of variables. Only, the variables that explain a lot usually aren’t the variables that are immediately visible; instead they’re hidden from us, and in order to model reality, we need to discover them, which is the function that PCA serves. The small number of variables that drive a large fraction of variance in data can be thought of as a sort of “backbone” of the data. That enables one to understand the data at a “macro / big picture / structural” level.
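To make the claim about latent variables concrete, here is a minimal sketch in Python with NumPy. Everything in it is an illustrative assumption rather than something from the IQ literature: the data is synthetic, generated from two hidden factors plus noise, and the function names `make_latent_data` and `explained_variance_ratio` are made up for this example. PCA is carried out via the singular value decomposition of the centered data matrix.

```python
import numpy as np


def make_latent_data(n_samples=500, n_observed=10, n_latent=2, noise=0.3, seed=0):
    """Synthetic data (an assumption for illustration): a few hidden factors
    drive many observed variables, with independent noise added on top."""
    rng = np.random.default_rng(seed)
    latent = rng.normal(size=(n_samples, n_latent))      # the hidden "backbone"
    loadings = rng.normal(size=(n_latent, n_observed))   # how each factor shows up
    return latent @ loadings + noise * rng.normal(size=(n_samples, n_observed))


def explained_variance_ratio(X):
    """Fraction of total variance captured by each principal component."""
    Xc = X - X.mean(axis=0)                    # center each observed variable
    s = np.linalg.svd(Xc, compute_uv=False)    # singular values, largest first
    var = s ** 2                               # proportional to component variances
    return var / var.sum()


X = make_latent_data()
ratios = explained_variance_ratio(X)
print(np.round(ratios, 3))
print("share explained by the top 2 components:", round(float(ratios[:2].sum()), 3))
```

Although ten variables are observed, the first two components should account for most of the variance, because only two factors generated the data. Real data is rarely this clean, so the sketch shows the mechanism and is not evidence about any particular dataset.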

This is a very long story that will take a long time to flesh out, and do­ing so is one of my main goals.

• “im­pres­sion that more ad­vanced statis­tics is tech­ni­cal elab­o­ra­tion that doesn’t offer ma­jor ad­di­tional in­sights”

Why did you have this im­pres­sion?

Sorry for the off-topic, but I see this a lot in LessWrong (as a ca­sual reader). Peo­ple seem to fo­cus on tex­tual, deep-sound­ing, wow-in­duc­ing ex­po­si­tions, but of­ten dis­like the tech­ni­cal­ities, get­ting hands dirty with ac­tu­ally un­der­stand­ing calcu­la­tions, equa­tions, for­mu­las, de­tails of al­gorithms etc (calcu­la­tions that don’t tickle those wow-re­cep­tors that we all have). As if these were merely some minor ad­di­tions over the re­ally im­por­tant big pic­ture view. As I see it this move­ment seems to try to build up a new back­bone of knowl­edge from scratch. But do­ing this they re­peat the mis­takes of the past philoso­phers. For ex­am­ple go­ing for the “deep”, out­look-trans­form­ing texts that of­ten give a delu­sional feel­ing of “oh now I un­der­stand the whole world”. It’s easy to have wow-mo­ments with­out ac­tu­ally hav­ing un­der­stood some­thing new.

So yes, PCA is use­ful and most statis­tics and maths and com­puter sci­ence is use­ful for un­der­stand­ing stuff. But then you swing to the other ex­treme and say “ideas from ad­vanced statis­tics are es­sen­tial for rea­son­ing about the world, even on a day-to-day level”. Tell me how ex­actly you’re plan­ning to use PCA day-to-day? I think you may mean you want to use some “in­sight” that you gained from it. But I’m not sure what that would be. It seems to be a car­toon­ish dis­tor­tion that makes it fit into an ide­ol­ogy.

Any­way, main­stream ma­chine learn­ing is very use­ful. And it’s usu­ally much more in­tri­cate and com­pli­cated than to be able to pro­duce a deep ev­ery­day in­sight out of it. I think the sooner you lose the need for ev­ery­thing to res­onate deeply or have a con­cise in­sight­ful sum­mary, the bet­ter.

• Why did you have this im­pres­sion?

Prob­a­bly be­cause of the hu­man ten­dency to over­es­ti­mate the im­por­tance of any knowl­edge one hap­pens to have and un­der­es­ti­mate the im­por­tance of any knowl­edge one doesn’t. (Is there a name for this bias?)

• Why did you have this im­pres­sion?

Group­think I guess: other peo­ple who I knew didn’t think that it’s so im­por­tant (de­spite be­ing peo­ple who are very well ed­u­cated by con­ven­tional stan­dards, top ~1% of elite col­leges).

Tell me how ex­actly you’re plan­ning to use PCA day-to-day?

Dis­claimer: I know that I’m not giv­ing enough ev­i­dence to con­vince you: I’ve thought about this for thou­sands of hours (in­clud­ing work­ing through many quan­ti­ta­tive ex­am­ples) and it’s tak­ing me a long time to figure out how to or­ga­nize what I’ve learned.

I already have been us­ing di­men­sion­al­ity re­duc­tion (qual­i­ta­tively) in my day to day life, and I’ve found that it’s greatly im­proved my in­ter­per­sonal re­la­tion­ships be­cause it’s made it much eas­ier to guess where peo­ple are com­ing from (be­fore peo­ple’s so­cial be­hav­ior had seemed like a com­pli­cated blur be­cause I saw so many vari­ables with­out hav­ing started to cor­rectly iden­tify the la­tent ones).

I think the sooner you lose the need for everything to resonate deeply or have a concise insightful summary, the better.

You seem to be mak­ing overly strong as­sump­tions with in­suffi­cient ev­i­dence: how would you know whether this was the case, never hav­ing met me? ;-)

• Qual­i­ta­tive day-to-day di­men­sion­al­ity re­duc­tion sounds like woo to me. Not a bit more con­vinc­ing than quan­tum woo (Deepak Cho­pra et al.). What­ever you’re do­ing, it’s surely not like do­ing SVD on a data ma­trix or eigen-de­com­po­si­tion on the co­var­i­ance ma­trix of your ob­ser­va­tions.

Of course, you can of­ten iden­tify mo­ti­va­tions be­hind peo­ple’s ac­tions. A lot of psy­chol­ogy is ba­si­cally try­ing to un­cover these mo­ti­va­tions. Ba­si­cally an in­ten­tional in­ter­pre­ta­tion and a the­ory of mind are ex­am­ples of di­men­sion­al­ity re­duc­tion in some sense. In­stead of ex­plain­ing be­hav­ior by rea­son­ing about re­cep­tors and neu­rons, you imag­ine a con­scious agent with be­liefs, de­sires and in­ten­tions. You could also link it to data com­pres­sion (di­men­sion­al­ity re­duc­tion is a sort of lossy data com­pres­sion). But I wouldn’t say I’m us­ing ad­vanced data com­pres­sion al­gorithms when play­ing with my dog. It just sounds pre­ten­tious and shows a des­per­ate need to sig­nal smart­ness.

So, what is the evidence that you are consciously doing something similar to PCA in social life? Do you write down variables and numbers, or how can I imagine qualitative dimensionality reduction? How is it different from somebody just getting an opinion intuitively and then justifying it afterwards?

• Your tone is con­de­scend­ing, far out­side of po­lite­ness norms. In the past I would have un­char­i­ta­bly writ­ten this off to you be­ing de­praved, but I’ve re­al­ized that I should be mak­ing a stronger effort to un­der­stand other peo­ple’s per­spec­tives. So can you help me un­der­stand where you’re com­ing from on an emo­tional level?

• You asked about emotional stuff so here is my perspective. I have extremely weird feelings about this whole forum that may affect my writing style. My view is constantly popping back and forth between different views, like in the rabbit-duck gestalt image. On one hand I often see interesting and very good arguments, but on the other hand I see tons of red flags popping up. I feel that I need to maintain extreme mental efforts to stay “sane” here. Maybe I should refrain from commenting. It’s a pity because I’m generally very interested in the topics discussed here, but the tone and the underlying ideology are pushing me away. On the other hand I feel an urge to check out the posts despite this effect. I’m not sure what aspect of certain forums has this psychological effect on my thinking, but I’ve felt it on various reddit communities as well.

• On one hand I of­ten see in­ter­est­ing and very good ar­gu­ments, but on the other hand I see tons of red flags pop­ping up. I feel that I need to main­tain ex­treme men­tal efforts to stay “sane” here.

Se­conded, ac­tu­ally, and it’s par­tic­u­lar to LessWrong. I know I of­ten joke that post­ing here gets treated as sub­mit­ting aca­demic ma­te­rial and skew­ered ac­cord­ingly, but that is very much what it feels like from the in­side. It feels like con­fronting a hos­tile crowd of, as Jonah put it, rad­i­cal ag­nos­tics, ev­ery sin­gle time one posts, and they’re wait­ing for you to say some­thing so they can jump down your throat about it.

Oh, and then you run into the is­sue of hav­ing rad­i­cally differ­ent pri­ors and be­liefs, so that you find your­self on a “ra­tio­nal­ity” site where some­one is sud­denly us­ing the term “global warm­ing be­liever” as though the IPCC never is­sued mul­ti­ple re­ports full of statis­ti­cal ev­i­dence. I mean, sure, I can put some prob­a­bil­ity on, “It’s all a con­spir­acy and the offi­cial sci­en­tists are ly­ing”, but for me that’s in the “non­sense zone”—I ac­tu­ally take offense to be­ing asked to jus­tify my be­lief in main­stream sci­ence.

As much as “good Bayesi­ans” are never sup­posed to agree to dis­agree, I would very much like if peo­ple would be up-front about their pri­ors and be­liefs, so that we can both de­cide whether it’s worth the en­ergy spent on long threads of try­ing to con­vince peo­ple of things.

• Oh, and then you run into the is­sue of hav­ing rad­i­cally differ­ent pri­ors and be­liefs, so that you find your­self on a “ra­tio­nal­ity” site where some­one is sud­denly us­ing the term “global warm­ing be­liever” as though the IPCC never is­sued mul­ti­ple re­ports full of statis­ti­cal ev­i­dence.

Rather bad statis­ti­cal ev­i­dence I might add. Se­ri­ously, your ar­gu­ment amounts to an ap­peal to au­thor­ity. What­ever hap­pened to nul­lius in verba?

I mean, sure, I can put some prob­a­bil­ity on, “It’s all a con­spir­acy and the offi­cial sci­en­tists are ly­ing”,

Some of them are, a lot of them were even caught when the cli­mate­gate emails went pub­lic. Most of them, how­ever, are some com­bi­na­tion of ide­ologues and peo­ple who couldn’t han­dle the harder sci­ences and are now mem­o­riz­ing the teacher’s pass­word, in other words a prospiracy. Add in what hap­pens to cli­mate jour­nals that dare pub­lish any­thing in­suffi­ciently alarmist and one gets the idea about the cur­rent state of cli­mate sci­ence.

• Ap­peal to Author­ity? Not in the nor­mal sense that the IPCC ex­er­cises vi­o­lent force, and I there­fore des­ig­nate them fac­tu­ally cor­rect. No, it’s an Ap­peal to Ex­per­tise Out­side My Own Do­main. It’s me ex­pect­ing that the same aca­demic and sci­en­tific pro­cesses and meth­ods that pro­duced my ex­per­tise in my fields pro­duced do­main-ex­perts in other fields with their own ex­per­tise, and that I can there­fore trust in their find­ings about as thor­oughly as I trust in my own.

• Ap­peal to Author­ity? Not in the nor­mal sense that the IPCC ex­er­cises vi­o­lent force, and I there­fore des­ig­nate them fac­tu­ally cor­rect.

That’s not the nor­mal sense of ap­peal to au­thor­ity, that would be ap­peal to force.

No, it’s an Ap­peal to Ex­per­tise Out­side My Own Do­main.

And how do you know that they’re ac­tual ex­perts? Be­cause they (metaphor­i­cally) wear lab coats? That’s what ap­peal to au­thor­ity is. While it’s not nec­es­sar­ily a fal­lacy, it’s no­table that sci­ence started mak­ing progress as soon as peo­ple dis­avowed us­ing it.

• Do you be­lieve that the mass of the muon as listed by the Par­ti­cle Data Group is at least ap­prox­i­mately cor­rect? If so, why?

• If you ask a physicist or an evolutionist why their beliefs are correct they will generally give you an answer (or at least start talking about the general principle). If you ask that question about climate science you’ll generally get either a direct appeal to authority or an indirect one: it’s all in this official report which I haven’t read but it’s official so it must be correct.

Heck cli­mate sci­en­tists aren’t even that spar­ing about ba­sic facts. They’ll men­tion that CO2 is a green­house gas, but avoid any more tech­ni­cal ques­tions. For ex­am­ple, I only re­cently found out that (in the ab­sence of other fac­tors or any feed­back) tem­per­a­ture is a log­a­r­ith­mic func­tion of CO2 con­cen­tra­tion.

• Heck cli­mate sci­en­tists aren’t even that spar­ing about ba­sic facts. They’ll men­tion that CO2 is a green­house gas, but avoid any more tech­ni­cal ques­tions. For ex­am­ple, I only re­cently found out that (in the ab­sence of other fac­tors or any feed­back) tem­per­a­ture is a log­a­r­ith­mic func­tion of CO2 con­cen­tra­tion.

So this seems like you’ve never cracked open any cli­mate/​at­mo­spheric sci­ence text­book? Be­cause that is pretty ba­sic info. It seems like you’re de­ter­mined to be skep­ti­cal de­spite not re­ally spend­ing much time learn­ing about the state of the sci­ence. Also it sounds like you are equiv­o­cat­ing be­tween “cli­mate sci­en­tist” and “per­son on the in­ter­net who be­lieves in global warm­ing.”

My back­ground is par­ti­cle physics, if some­one asked me about the mass of a muon, I’d have to make about a hun­dred ap­peals to au­thor­ity to give them any rele­vant in­for­ma­tion, and I sus­pect cli­mate sci­en­tists are in the same boat when talk­ing to peo­ple who don’t un­der­stand some of the ba­sics. I’ve per­son­ally en­gaged with spe­cial rel­a­tivity crack­pots who ask you to jus­tify ev­ery­thing, and keep say­ing this or that ba­sic fact from the field is an ap­peal to au­thor­ity. There is no con­vinc­ing a de­ter­mined skep­tic, so it’s best not to en­gage.

If you are near a university campus, wait until there is a technical talk on climate modelling and go sit and listen (don’t ask questions, just listen). You’ll probably be surprised at how vociferous the debate is: climate modelers are serious scientists working hard on perfecting their models.

• I haven’t tracked down the specific evidence, but muons are comparatively easy: They live long enough to leave tracks in particle detectors with known magnetic fields. That gives you the charge-to-mass ratio. Given that charge looks quantized (Millikan oil drop experiment and umpteen repetitions), and there are other pieces of evidence from the particle tracks of muon decay (and the electrons from that decay again leave tracks, and the angles are visible even if the neutrinos aren’t) - I’d be surprised if the muon mass wasn’t pretty solid.

• As­sum­ing that both par­ti­cle physi­cists and cli­ma­tol­o­gists are do­ing things prop­erly, that would only mean that the muon mass has much smaller er­ror bars than the global warm­ing (which it does), not that the former is more likely to be cor­rect within its er­ror bars.

Then again, it’s pos­si­ble that cli­ma­tol­o­gists are less likely to be do­ing things prop­erly.

• Thanks so much for sharing. I’m astonished by how much more fruitful my relationships have become since I’ve started asking.

I think that a lot of what you’re see­ing is a cul­tural clash: differ­ent com­mu­ni­ties have differ­ent blindspots and norms for com­mu­ni­ca­tion, and a lot of times the com­bi­na­tion of (i) blindspots of the com­mu­ni­ties that one is fa­mil­iar with and (ii) re­spects in which a new com­mu­nity ac­tu­ally is un­sound can give one the im­pres­sion “these peo­ple are be­yond the pale!” when the ac­tual situ­a­tion is that they’re no less ra­tio­nal than mem­bers of one’s own com­mu­ni­ties.

I had a very similar experience to your own coming from academia, and wrote a post titled The Importance of Self-Doubt in which I raised the concern that Less Wrong was functioning as a cult. But since then I’ve realized that a lot of the apparently weird beliefs of LWers are in fact also believed by very credible people: for example, Bill Gates recently expressed serious concern about AI risk.

If you’re new to the com­mu­nity, you’re prob­a­bly un­fa­mil­iar with my own cre­den­tials which should re­as­sure you some­what:

• I did a PhD in pure math under the direction of Nathan Dunfield, who coauthored papers with Bill Thurston, who formulated the geometrization conjecture, which Perelman proved and in doing so solved one of the Clay Millennium Problems.

• I’ve been deeply in­volved with math ed­u­ca­tion for highly gifted chil­dren for many years. I worked with the per­son who won the Amer­i­can Math So­ciety prize for best un­der­grad­u­ate re­search when he was 12.

• I worked at GiveWell, which partners with Good Ventures, Dustin Moskovitz’s foundation.

• I’ve done ful­ls­tack web de­vel­op­ment, mak­ing an asyn­chronous clone of Stack­Overflow (link).

• I’ve done machine learning, rediscovering logistic regression, collaborative filtering, hierarchical modeling, the use of principal component analysis to deal with multicollinearity, and cross validation. (I found the expositions so poor that it was faster for me to work things out on my own than to learn from them, though I eventually learned the official versions.) You can read some details of things that I found here. I did a project implementing Bayesian adjustment of Yelp restaurant star ratings using their public dataset here.

So I imag­ine that I’m cred­ible by your stan­dards. There are other peo­ple in­volved in the com­mu­nity who you might find even more cred­ible. For ex­am­ple: (a) Paul Chris­ti­ano who was an in­ter­na­tional math olympiad medal­ist, wrote a 50 page pa­per on quan­tum com­pu­ta­tional com­plex­ity with Scott Aaron­son as an un­der­grad­u­ate at MIT, and is a the­o­ret­i­cal CS grad stu­dent at Berkeley. (b) Ja­cob Stein­hardt, a Hertz grad­u­ate fel­low who does ma­chine learn­ing re­search un­der Percy Liang at Stan­ford.

So you’re not ac­tu­ally in some sort of twilight zone. I share some of your con­cerns with the com­mu­nity, but the group­think here is no stronger than the group­think pre­sent in academia. I’d be happy to share my im­pres­sions of the rel­a­tive sound­ness of the var­i­ous LW com­mu­nity prac­tices and be­liefs.

• There are other peo­ple in­volved in the com­mu­nity who you might find even more cred­ible. For ex­am­ple: (a) Paul Chris­ti­ano who was an in­ter­na­tional math olympiad medal­ist, wrote a 50 page pa­per on quan­tum com­pu­ta­tional com­plex­ity with Scott Aaron­son as an un­der­grad­u­ate at MIT, and is a the­o­ret­i­cal CS grad stu­dent at Berkeley. (b) Ja­cob Stein­hardt, a Hertz grad­u­ate fel­low who does ma­chine learn­ing re­search un­der Percy Liang at Stan­ford.

Of course, Chris­ti­ano tends to is­sue dis­claimers with his MIRI-branded AGI safety work, ex­plic­itly stat­ing that he does not be­lieve in alarmist UFAI sce­nar­ios. Which is fine, in it­self, but it does show how peo­ple ex­pect some­one as­so­ci­ated with these com­mu­ni­ties to sound.

And Ja­cob Stein­hardt hasn’t ex­actly en­dorsed any “Twilight Zone” com­mu­nity norms or pro­pa­ganda views. Errr, is there a term for “things ev­ery­one in a group thinks ev­ery­one else be­lieves, whether or not they ac­tu­ally do”?

• I’m not claiming oth­er­wise: I’m merely say­ing that Paul and Ja­cob don’t dis­miss LWers out of hand as ob­vi­ously crazy, and have in fact found the com­mu­nity to be worth­while enough to have par­ti­ci­pated sub­stan­tially.

• I think in this case we have to taboo the term “LWers” ;-). This com­mu­nity has many pieces in it, and two large parts of the origi­nal core are “techno-liber­tar­ian Over­com­ing Bias read­ers with many very non-main­stream be­liefs that they claim are much more ra­tio­nal than any­one else’s be­liefs” and “the SL4 mailing list wear­ing suits and try­ing to act pro­fes­sional enough that they might ac­tu­ally ac­com­plish their Shock Level Four dreams.”

On the other hand, in the pro­cess of the site’s growth, it has even­tu­ally come to en­com­pass those two de­mo­graph­ics plus, to some limited ex­tent, al­most ev­ery­one who’s will­ing to as­sent that sci­ence, statis­ti­cal rea­son­ing, and the neuro/​cog­ni­tive sci­ences ac­tu­ally re­ally work and should be taken se­ri­ously. With spe­cial em­pha­sis on statis­ti­cal rea­son­ing and cog­ni­tive sci­ences.

So the core de­mo­graphic con­sists of Very Unusual Peo­ple, but the periph­ery de­mo­graph­ics, who now make up most of the com­mu­nity, con­sist of only Mildly Unusual Peo­ple.

• Yes, this seems like a fair assessment of the situation. Thanks for disentangling the issues. I’ll be more precise in the future.

• Those are in­deed im­pres­sive things you did. I agree very much with your post from 2010. But the fact that many peo­ple have this ini­tial im­pres­sion shows that some­thing is wrong. What makes it look like a “twilight zone”? Why don’t I feel the same symp­toms for ex­am­ple on Scott Alexan­der’s Slate Star Codex blog?

Another thing I could pin­point is that I don’t want to iden­tify as a “ra­tio­nal­ist”, I don’t want to be any -ist. It seems like a tac­tic to make peo­ple iden­tify with a group and swal­low “the whole pack­age”. (I also don’t think peo­ple should iden­tify as athe­ist ei­ther.)

• Another thing I could pin­point is that I don’t want to iden­tify as a “ra­tio­nal­ist”, I don’t want to be any -ist.

No­body forces you to do so. Plenty of peo­ple in this com­mu­nity don’t self iden­tify that way.

• I’m sym­pa­thetic to ev­ery­thing you say.

In my experience there’s an issue of Less Wrongers being unusually emotionally damaged (e.g. relative to academics) and this gives rise to a lot of problems in the community. But I don’t think that the emotional damage primarily comes from the weird stuff that you see on Less Wrong. What one sees is them having borne the brunt of the phenomenon that I described here disproportionately relative to other smart people, often because they’re unusually creative and have been marginalized by conformist norms.

Quite frankly, I find the norms in academia very creepy: I’ve seen a lot of people develop serious mental health problems in connection with their experiences in academia. It’s hard to see it from the inside: I was disturbed by what I saw, but I didn’t realize that math academia is actually functioning as a cult, based on retrospective impressions, and in fact by implicit consensus of the best mathematicians of the world (I can give references if you’d like).

• I was dis­turbed by what I saw, but I didn’t re­al­ize that math academia is ac­tu­ally func­tion­ing as a cult

I’m sure you’re aware that the word “cult” is a strong claim that re­quires a lot of ev­i­dence, but I’d also is­sue a friendly warn­ing that to me at least it im­me­di­ately set off my “crank” alarm bells. I’ve seen too many Usenet posters who are sure they have a P=/​!=NP proof, or a proof that set the­ory is false, or etc. who ul­ti­mately claim that be­cause “the math­e­mat­i­cal elite” are a cult that no one will listen to them. A cult gen­er­ally en­gages in ac­tive sup­pres­sion, of­ten de­fa­ma­tion, and not sim­ply ex­clu­sion. Do you have ev­i­dence of le­gi­t­i­mate math­e­mat­i­cal re­sults or re­search be­ing hid­den/​with­drawn from jour­nals or pub­li­cly de­rided, or is it more of an old boy’s club that’s hard for out­siders to par­ti­ci­pate in and that plays petty poli­tics to the dam­age of the sci­ence?

Grothendieck’s prob­lems look to be poli­ti­cal and in­ter­per­sonal. Perel­man’s also. I think it’s one thing to claim that math­e­mat­i­cal in­sti­tu­tions are no more ra­tio­nal than any other poli­ti­cized body, and quite an­other to claim that it’s a cult. Or maybe most so­cial be­hav­ior is too cult-like. If so; per­haps don’t sin­gle out math­e­mat­ics.

I’ve seen a lot of peo­ple de­velop se­ri­ous men­tal health prob­lems in con­nec­tion with their ex­pe­riences in academia.

I ques­tion the di­rec­tion of cau­sa­tion. His­tor­i­cally many great math­e­mat­i­ci­ans have been men­tally and so­cially atyp­i­cal and ended up not mak­ing much sense with their later writ­ings. Either math­e­mat­ics has always had an in­sti­tu­tional prob­lem or math­e­mat­i­ci­ans have always had an in­ci­dence of men­tal difficul­ties (or a com­bi­na­tion of both; but I would ex­pect one to dom­i­nate).

Espe­cially in Thurston’s On Proof and Progress in Math­e­mat­ics I can ap­pre­ci­ate the prob­lem of try­ing to grok spe­cial­ized ar­eas of math­e­mat­ics. The ter­minol­ogy and sym­bol­ogy is opaque to the un­ini­ti­ated. It re­minds me of sec­tion 1 of the Me­ta­math Book which ex­presses similar un­hap­piness with the state of knowl­edge be­tween spe­cial­ist fields of math­e­mat­ics and the gen­eral difficulty of learn­ing math­e­mat­ics. I had hoped that Me­ta­math would be­come more pop­u­lar and tie var­i­ous sub­fields to­gether through unify­ing the­o­ries and defi­ni­tions, but as far as I can tell it lan­guishes as a hob­by­ist pro­ject for a few ded­i­cated math­e­mat­i­ci­ans.

• I’m sure you’re aware that the word “cult” is a strong claim that re­quires a lot of ev­i­dence, but I’d also is­sue a friendly warn­ing that to me at least it im­me­di­ately set off my “crank” alarm bells.

Thanks, yeah, peo­ple have been tel­ling me that I need to be more care­ful in how I frame things. :-)

Do you have ev­i­dence of le­gi­t­i­mate math­e­mat­i­cal re­sults or re­search be­ing hid­den/​with­drawn from jour­nals or pub­li­cly de­rided, or is it more of an old boy’s club that’s hard for out­siders to par­ti­ci­pate in and that plays petty poli­tics to the dam­age of the sci­ence?

The lat­ter, but note that that’s not nec­es­sar­ily less dam­ag­ing than ac­tive sup­pres­sion would be.

Or maybe most so­cial be­hav­ior is too cult-like. If so; per­haps don’t sin­gle out math­e­mat­ics.

Yes, this is what I be­lieve. The math com­mu­nity is just un­usu­ally salient to me, but I should phrase things more care­fully.

I ques­tion the di­rec­tion of cau­sa­tion. His­tor­i­cally many great math­e­mat­i­ci­ans have been men­tally and so­cially atyp­i­cal and ended up not mak­ing much sense with their later writ­ings. Either math­e­mat­ics has always had an in­sti­tu­tional prob­lem or math­e­mat­i­ci­ans have always had an in­ci­dence of men­tal difficul­ties (or a com­bi­na­tion of both; but I would ex­pect one to dom­i­nate).

Most of the peo­ple who I have in mind did have pre­ex­ist­ing difficul­ties. I meant some­thing like “rel­a­tive to a coun­ter­fac­tual where academia was serv­ing its in­tended func­tion.” Peo­ple of very high in­tel­lec­tual cu­ri­os­ity some­times ap­proach academia be­liev­ing that it will be an oa­sis and find this not to be at all the case, and that the struc­tures in place are in fact hos­tile to them.

This is not what the gov­ern­ment should be sup­port­ing with tax­payer dol­lars.

Espe­cially in Thurston’s On Proof and Progress in Math­e­mat­ics I can ap­pre­ci­ate the prob­lem of try­ing to grok spe­cial­ized ar­eas of math­e­mat­ics.

• The lat­ter, but note that that’s not nec­es­sar­ily less dam­ag­ing than ac­tive sup­pres­sion would be.

I sup­pose there’s one scant anec­dote for es­ti­mat­ing this; cryp­tog­ra­phy re­search seemed to lag a decade or two be­hind ac­tively sup­pressed/​hid­den gov­ern­ment re­search. Granted, there was also less pub­lic in­ter­est in cryp­tog­ra­phy un­til the 80s or 90s, but it seems that sup­pres­sion can only de­lay pub­li­ca­tion, not pre­vent it.

The real risk of sup­pres­sion and ex­clu­sion both seem to be in per­ma­nently dis­cour­ag­ing math­e­mat­i­ci­ans who would oth­er­wise make great break­throughs, since af­fect­ing the timing of pub­li­ca­tion/​dis­cov­ery doesn’t seem as dam­ag­ing.

This is not what the gov­ern­ment should be sup­port­ing with tax­payer dol­lars.

I think I would be sur­prised if Ba­sic In­come was a less effec­tive strat­egy than tar­geted gov­ern­ment re­search fund­ing.

Everything from logic and axiomatic foundations of mathematics to practical use of advanced theorems for computer science. What attracted me to Metamath was the idea that if I encountered a paper that was totally unintelligible to me (say Perelman’s proof of Poincaré’s conjecture or Wiles’ proof of Fermat’s Last Theorem) I could backtrack through sound definitions to concepts I already knew, and then build my understanding up from those definitions. Alas, just having a cross-reference of related definitions between various fields would be helpful. I take it that model theory is the place to look for such a cross-reference, and so that is probably the next thing I plan to study.

Prac­ti­cally, I re­al­ize that I don’t have enough time or pa­tience or men­tal abil­ity to slog through for­mal defi­ni­tions all day, and so it would be nice to have some­thing even bet­ter. A uni­ver­sal math­e­mat­i­cal ed­u­ca­tor, so to speak. Although I worry that with­out a strong for­mal un­der­stand­ing I will miss im­por­tant re­sults/​in­sights. So my other in­ter­est is build­ing the kind of agent that can iden­tify which for­mal in­sights are use­ful or im­por­tant, which sort of nat­u­rally leads to an in­ter­est in AI and de­ci­sion the­ory.

• I would like to see some of those refer­ences (sim­ply be­cause I have no re­la­tion to Academia, and don’t like things I read some­where to ges­tate into un­founded in­tu­itions about a sub­ject).

• Quite frankly, I find the norms in academia very creepy: I’ve seen a lot of people develop serious mental health problems in connection with their experiences in academia. It’s hard to see it from the inside: I was disturbed by what I saw, but I didn’t realize that math academia is actually functioning as a cult, based on retrospective impressions, and in fact by implicit consensus of the best mathematicians of the world (I can give references if you’d like).

I’ve only been in CS academia, and wouldn’t call that a cult. I would call it, like most of the rest of academia, a deeply dys­func­tional in­dus­try in which to work, but that’s the fault of the aca­demic ca­reer and fund­ing struc­ture. CS is even rel­a­tively healthy by com­par­i­son to much of the rest.

How much of our im­pres­sion of math­e­mat­ics as a creepy, men­tal-health-harm­ing cult comes from pure stereo­typ­ing?

• How much of our im­pres­sion of math­e­mat­ics as a creepy, men­tal-health-harm­ing cult comes from pure stereo­typ­ing?

Jonah happens to be a math PhD. How can you engage in pure stereotyping of mathematicians while you get your PhD?

• I was more posit­ing that it’s a self-re­in­forc­ing, self-cre­at­ing effect: peo­ple treat Math­e­mat­ics in a cultish way be­cause they think they’re sup­posed to.

• I was more posit­ing that it’s a self-re­in­forc­ing, self-cre­at­ing effect

I don’t be­lieve there’s any such thing, on the gen­eral grounds of “no fake with­out a re­al­ity to be a fake of.”

• Who do you mean when you say “peo­ple”?

• For what it’s worth, I have observed a certain reverence in the way great mathematicians are treated by their less accomplished colleagues that can often border on the creepy. This is something specific to math, in that it seems to exist in other disciplines with lesser intensity.

But I agree, “dys­func­tional” seems to be a more apt la­bel than “cult.” May I also add “fash­ion-prone?”

• How much of our im­pres­sion of math­e­mat­ics as a creepy, men­tal-health-harm­ing cult

Er, what? Who do you mean by “we”?

comes from pure stereo­typ­ing?

Fi­nally, Alan Tur­ing, the great Bletch­ley Park code breaker, father of com­puter sci­ence and ho­mo­sex­ual, died try­ing to prove that some things are fun­da­men­tally un­prov­able.

This is a stag­ger­ingly wrong ac­count of how he died.

• This is a stag­ger­ingly wrong ac­count of how he died.

Hence my call­ing it “pure stereo­typ­ing”!

• I don’t have direct exposure to CS academia, which, as you comment, is known to be healthier :-). I was speaking in broad brushstrokes; I’ll qualify my claims and impressions more carefully later.

• I don’t re­ally un­der­stand what you mean about math academia. Those refer­ences would be ap­pre­ci­ated.

• The top 3 answers to the MathOverflow question Which mathematicians have influenced you the most? are Alexander Grothendieck, Mikhail Gromov, and Bill Thurston. Each of them has expressed serious concerns about the community.

• Grothendieck was ac­tu­ally effec­tively ex­com­mu­ni­cated by the math­e­mat­i­cal com­mu­nity and then was pathol­o­gized as hav­ing gone crazy. See pages 37-40 of David Ruelle’s book A Math­e­mat­i­cian’s Brain.

• Gro­mov ex­presses strong sym­pa­thy for Gri­gory Perel­man hav­ing left the math­e­mat­i­cal com­mu­nity start­ing on page 110 of Perfect Ri­gor. (You can search for “Gro­mov” in the pdf to see all of his re­marks on the sub­ject.)

• Thurston made very apt crit­i­cisms of the math­e­mat­i­cal com­mu­nity in his es­say On Proof and Progress In Math­e­mat­ics. See es­pe­cially the be­gin­ning of Sec­tion 3: “How is math­e­mat­i­cal un­der­stand­ing com­mu­ni­cated?” Terry Tao en­dorses Thurston’s es­say in his obit­u­ary of Thurston. But the com­mu­nity has es­sen­tially ig­nored Thurston’s re­marks: one al­most never hears peo­ple talk about the points that Thurston raises.

• I don’t know about Grothendieck, but the two other sources ap­pear to have softer crit­i­cism of the math­e­mat­i­cal com­mu­nity than “ac­tu­ally func­tion­ing as a cult”.

• The links you give are ex­tremely in­ter­est­ing, but, un­less I am miss­ing some­thing, it seems that they fall short of jus­tify­ing your ear­lier state­ment that math academia func­tions as a cult. I won­der if you would be will­ing to elab­o­rate fur­ther on that?

The most scary thing to me is that the most mathematically talented students are often turned off by what they see in math classes, even at the undergraduate and graduate levels. Math serves as a backbone for the sciences, so this may be badly undercutting scientific innovation at a societal level.

I hon­estly think that it would be an im­prove­ment on the sta­tus quo to stop teach­ing math classes en­tirely. Thurston char­ac­ter­ized his early math ed­u­ca­tion as fol­lows:

I hated much of what was taught as math­e­mat­ics in my early school­ing, and I of­ten re­ceived poor grades. I now view many of these early les­sons as anti-math: they ac­tively tried to dis­cour­age in­de­pen­dent thought. One was sup­posed to fol­low an es­tab­lished pat­tern with me­chan­i­cal pre­ci­sion, put an­swers in­side boxes, and “show your work,” that is, re­ject men­tal in­sights and al­ter­na­tive ap­proaches.

I think that this char­ac­ter­izes math classes even at the grad­u­ate level, only at a higher level of ab­strac­tion. The classes es­sen­tially never offer stu­dents ex­po­sure to free-form math­e­mat­i­cal ex­plo­ra­tion, which is what it takes to make ma­jor sci­en­tific dis­cov­er­ies with sig­nifi­cant quan­ti­ta­tive com­po­nents.

• I dis­tinctly re­mem­ber hav­ing points taken off of a physics midterm be­cause I didn’t show my work. I think I dropped the exam in the waste bas­ket on the way out of the au­di­to­rium.

I’ve always assumed that the problem is three-fold: generating a formal proof is NP-hard, getting the right answer via shortcuts can include cheating, and the faculty’s time is limited. Professors/graders do not have the capacity to rigorously demonstrate to themselves that the steps a student has written down actually pinpoint the unique answer. Without access to the student’s mind, graders are unable to determine whether students cheat or not; being able to memorize and/or reproduce the exact steps of a calculation significantly decreases the likelihood of cheating. Even if graders could do one or both of the previous for a single student, they are not 30x or 100x as smart as their students, making it impractical to repeat the process for every student.

That said, I had some very good math­e­mat­ics teach­ers in higher level courses who could force stu­dents to think, and one in par­tic­u­lar who could en­courage/​de­mand nov­elty from stu­dents sim­ply by ask­ing them to solve prob­lems that they hadn’t yet learned to solve. I didn’t re­al­ize the power of the lat­ter ap­proach un­til later (and at the time ev­ery­one com­plained about ex­ams with a me­dian score well un­der 50%), but his classes were always my fa­vorite.

• Thank you for all these in­ter­est­ing refer­ences. I en­joyed read­ing all of them, and reread­ing in Thurston’s case.

Do peo­ple pathol­o­gize Grothendieck as hav­ing gone crazy? I mostly think peo­ple think of him as be­ing a lit­tle bit strange. The story I heard was that be­cause of philo­soph­i­cal dis­agree­ments with mil­i­tary fund­ing and per­sonal con­flicts with other math­e­mat­i­ci­ans he left the com­mu­nity and was more or less re­fus­ing to speak to any­one about math­e­mat­ics, and peo­ple were sad about this and wished he would come back.

• Do peo­ple pathol­o­gize Grothendieck as hav­ing gone crazy?

His contribution to math is too great for people to have explicitly adopted a stance that was too unfavorable to him, and many mathematicians did in fact miss him a lot. But as Perelman said:

“Of course, there are many mathematicians who are more or less honest. But almost all of them are conformists. They are more or less honest, but they tolerate those who are not honest.” He has also said that “It is not people who break ethical standards who are regarded as aliens. It is people like me who are isolated.”

If pressed, many math­e­mat­i­ci­ans down­play the role of those who be­haved un­eth­i­cally to­ward him and the failure of the com­mu­nity to give him a job in fa­vor of a nar­ra­tive “poor guy, it’s so sad that he de­vel­oped men­tal health prob­lems.”

• If pressed, many math­e­mat­i­ci­ans down­play the role of those who be­haved un­eth­i­cally to­ward him and the failure of the com­mu­nity to give him a job

What failure? He stepped down from the Stek­lov In­sti­tute and has re­fused ev­ery job offer and prize given to him.

• Do peo­ple pathol­o­gize Grothendieck as hav­ing gone crazy?

From the de­tails I’m aware of “gone crazy” is not a bad de­scrip­tion of what hap­pened.

• In my ex­pe­rience there’s an is­sue of Less Wrongers be­ing un­usu­ally emo­tion­ally dam­aged (e.g. rel­a­tive to aca­demics) and this gives rise to a lot of prob­lems in the com­mu­nity.

I think you’re just pro­ject­ing.

• I would prob­a­bly use differ­ent words, but I be­lieve I fit Jonah’s de­scrip­tion. Be­fore find­ing LW, I felt strongly iso­lated. Like, sur­rounded by hu­man bod­ies, but in­tel­lec­tu­ally alone. Think­ing about top­ics that peo­ple around me con­sid­ered “weird”, so I had no one to de­bate them with. Hav­ing a large range of in­ter­ests, and while I could find peo­ple to de­bate in­di­vi­d­ual in­ter­ests with, I had no one to talk with about the in­ter­est­ing com­bi­na­tions I saw there.

I felt “weird”, and from people around me I usually got two kinds of feedback. When I didn’t try to pretend anything, they more or less confirmed that I am weird (of course, many were gentle, trying not to hurt me). When I tried to play a role of someone “less weird” (that is, I ignored most of the things I considered interesting, and just tried to fit)… well, it took a lot of time and practice to do this correctly, but then people accepted me. So, for a long time it felt like the only way to be accepted would be to suppress a large part of what I consider to be “myself”; and I suspect that it would never work perfectly, that there would still be some kind of intellectual hunger.

Then I found LW and I was like: “whoa… there ac­tu­ally are peo­ple like me! too bad they are on the other side of the planet though”. Then I found some of them liv­ing closer, and… go­ing to mee­tups feels in­cred­ibly re­fresh­ing. First time in my life, I don’t have to sup­press any­thing, to play any role. I just am… in an en­vi­ron­ment that feels nat­u­ral. I fi­nally started un­der­stand­ing how peo­ple can en­joy hav­ing so­cial con­tacts.

Now let’s imag­ine that in a par­allel uni­verse, those LessWrongers who live in a city near to mine, would in­stead be my neigh­bors since my child­hood, or that we would be class­mates at high school. I be­lieve my life would be very differ­ent. (I be­lieve there are peo­ple like this in my city, but the prob­lem is find­ing those few dozen in­di­vi­d­u­als among the hun­dreds of thou­sands, es­pe­cially when there is no word in a pub­lic vo­cab­u­lary to de­scribe “us”.)

I can’t find the article now, but I believe it was written by Lewis Terman, where he observed how successful highly intelligent people are. He found a difference between those who were “intelligent people in an intelligent environment” and those who were “isolated intelligent people”. The former were usually very successful in life: they could talk with their parents and friends as equals, share their algorithms for life success, fit into their environment. The latter felt isolated, and often burned out at some moment of their lives. The conclusion was that for a highly intelligent person, having similarly highly intelligent family and friends makes a huge difference in their lives. -- When you observe the difference between “academia” and “LessWrong”, it may be related to this.

It is easier to be academically successful when your parents are. You can pick up good habits and strategies from them; you can debate your work and problems with them. If you are the only academically inclined person in the family, you lead a double life: the “real life” outside of school, and the “academic life” inside. The more you focus on your work, the more it feels like you are withdrawing from everything else. On the other hand, if you come from the same culture, focusing on the work makes you fit into the culture.

I am going to break a taboo here, but I don’t know how to tell it otherwise. I have an IQ about four or five sigma above the average. The difference between me and the average Mensa member is larger than the difference between Mensa and the general population. Many people in Mensa seem kind of dense to me, and average people, those are sometimes like five-year-old children. (I believe for many people on LessWrong it feels the same.) Sure, intelligence is not everything: other people have skills and traits that I lack, sometimes have more success than me, and I admire that. It’s just… so difficult to talk with them like with adult people. But when I go to an LW meetup, it’s like “whoa… finally a group of adult people, how amazing!”.

But I’m already an old man, relatively speaking. Now I am 39; I found LW when I was 35. Finally I have the company of my peers (still not in my own city), but it can’t fix the three decades of my life that already passed in isolation. It can make my life better, but I will always have the emotional scars of chronic loneliness. Oh, how much I envy those lucky kids who can go to LW meetups as teenagers. Makes me wonder how much my own life could be different; I probably wouldn’t recognize myself.

Of course, this is just one data point; I don’t know how typ­i­cal or atyp­i­cal I am within the LW com­mu­nity.

• Thank you for sharing your story, it was moving and it was candid. My question is, are you planning to be successful now? Suppose you’re going to die at 80: you have 40 bloody years, that’s a lot of time. Most likely you won’t win a Fields Medal, but science and human life have so much low-hanging fruit not yet picked. Do you plan to gain maximum productivity and do something to change the world? Or maybe you are already doing it?

• I am not giv­ing up, and I hope I will still achieve some big suc­cess.

In the shortest term… I have a baby now, which turned my life upside down a bit, so I need to solve some logistical problems first (e.g. to buy a new flat) and get used to the new situation. It might take a year. -- Not complaining here; I always wanted to have children, but it’s taking time and energy and money, so my options are now more limited than usual. I believe it will be okay in a few months, but today, I am rather busy and tired. Also, having a family limits my options; for example, if I decided that moving to another city would make my life better, it is no longer only my own decision. My hands are a bit more tied than they would be if I were 25 again.

I still didn’t give up completely on starting a rationalist community in my own city, and I have two specific plans. (1) These days I am finishing the translation of the LW Sequences book; when it is ready, I will distribute it freely and try to make it popular, and hope that people who enjoy it will contact me. (2) In September, I plan to do some rationality “lectures” (advertising for LW and for the translated book) at at least one high school and one university.

I will probably not do anything scientific, ever; that train has already gone. Cannot compete with 20-year-olds with fresh brains and fresh memories of their university lectures, who don’t have a family to feed. It would be wiser to focus fully on my personal life and making money, because that’s what I have to do anyway. -- The current plan is writing computer games, because the entry costs are almost zero, and I can do it at home in the evenings when the baby sleeps. (I have to keep the day job to pay bills.) Later, when the baby grows up and starts attending school, I may try something more ambitious.

But still, even if my plans suc­ceed and I live till 80, I will not be able to do as much as in the hy­po­thet­i­cal par­allel uni­verse where I would find a LW com­mu­nity as a teenager (and also live till 80). But it will still be bet­ter than yet an­other par­allel uni­verse where LW doesn’t ex­ist at all or where I am some­how un­able to find it.

• It is so painful to have an eas­ily available pos­si­ble world in which you find LessWrong ear­lier than in the real world. I ran into LW/​OB five times since I was 16 and didn’t stick around un­til I was 21. I can’t imag­ine what I would be like with five years of ex­po­sure to the im­por­tant things that I’ve been ex­posed to in the past six months, as well as hav­ing grown alongside the com­mu­nity, see­ing as how I came around near the time that LW be­gan.

• I also didn’t stick with LW at the first time. I found an ar­ti­cle linked from some­where, I be­lieve it was “Well-Kept Gar­dens Die By Paci­fism”, I was im­pressed, but then I left. A year or two later, I again ran­domly found an ar­ti­cle, then I saw it was the same web­site as the pre­vi­ous one, so I was like “Oh, this web­site con­tains mul­ti­ple in­ter­est­ing ar­ti­cles” and started click­ing on ran­dom links in text. Then I cau­tiously posted a few com­ments in the Open Thread—some got down­votes, some got up­votes—and kept read­ing...

So, some­where in the par­allel Everett branch there is a ver­sion of me that didn’t re­turn to LW any­more, or just re­turned, read one ar­ti­cle, and left again. Poor guy; he prob­a­bly spends a lot of time hav­ing stupid de­bates on other web­sites.

What do you believe you would have done differently, if you had stuck around here at 16?

• I’m speak­ing based on many in­ter­ac­tions with many mem­bers of the com­mu­nity. I don’t think this is true of ev­ery­body, but I have seen a differ­ence at the group level.

• I didn’t question that you were interacting with many members of the community. I’m saying you’re projecting. Maybe people are either normal or slightly depressed/anxious/bitter/etc, meaning they have the same emotional problems as any human being. You, however, see them as unusually emotionally damaged.

Typical Mind Fallacy: to understand other people, we model them as being just like ourselves. You said yourself you had emotional problems before, so I believe your perception of the community is skewed. Maybe you see signs of emotional damage in other people, just like insecure promiscuous people seemingly spot depravity in other people.

• This doesn’t ad­dress the is­sue of the claimed differ­ence in Jonah’s per­cep­tion of LWers from his per­cep­tion of other groups.

• Another thing I could pin­point is that I don’t want to iden­tify as a “ra­tio­nal­ist”, I don’t want to be any -ist.

I’ve always thought that call­ing your­self a “ra­tio­nal­ist” or “as­piring ra­tio­nal­ist” is rather use­less. You’re ei­ther win­ning or not win­ning. Cal­ling your­self by some funny term can give you the nice feel­ing of be­long­ing to a com­mu­nity, but it doesn’t ac­tu­ally make you win more, in it­self.

• My view is con­stantly pop­ping back and forth be­tween differ­ent views

That sounds like you engage in binary thinking and don’t value shades of grey of uncertainty enough. You feel the need to judge arguments for whether they are true or aren’t and don’t have mental categories for “might be true, or might not be true”.

Jonah makes strong claims for which he doesn’t provide ev­i­dence. He’s clear about the fact that he hasn’t pro­vided the nec­es­sary ev­i­dence.

Given that, you pattern-match to “crackpot” instead of putting Jonah in the mental category where you don’t know whether what Jonah says is right or wrong. If you start to put a lot of claims into the “I don’t know” pile, you don’t constantly pop between belief and non-belief. Popping back and forth means that the size of your updates when presented with new evidence is too large.

Be­ing able to say “I don’t know” is part of gen­uine skep­ti­cism.

• I’m not talk­ing about back and forth be­tween true and false, but be­tween two ex­pla­na­tions. You can have a mul­ti­modal prob­a­bil­ity dis­tri­bu­tion and two dis­tant modes are about equally prob­a­ble, and when you up­date, some­times one is larger and some­times the other. Of course one doesn’t need to choose a point es­ti­mate (max­i­mum a pos­te­ri­ori), the dis­tri­bu­tion it­self should ideally be be­lieved in its en­tirety. But just as you can’t see the rab­bit-duck as si­mul­ta­neously 50% rab­bit and 50% duck, one some­times switches be­tween differ­ent ex­pla­na­tions, similarly to an MCMC sam­pling pro­ce­dure.
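To make the MCMC analogy concrete, here is a minimal sketch of a random-walk Metropolis sampler on a bimodal target. Everything in it is an illustrative assumption rather than anything from the discussion: the two equally weighted Gaussian modes at -2.5 and 2.5, the step size, and the names `log_density` and `metropolis` are all made up for the example. The point is only that the chain's current state sits near one mode at a time and occasionally hops to the other, while its long-run average still reflects both.

```python
import math
import random

def log_density(x, mode_a=-2.5, mode_b=2.5, width=1.0):
    """Log of an unnormalized, equal-weight mixture of two Gaussians (hypothetical target)."""
    a = -0.5 * ((x - mode_a) / width) ** 2
    b = -0.5 * ((x - mode_b) / width) ** 2
    m = max(a, b)  # log-sum-exp trick for numerical stability
    return m + math.log(math.exp(a - m) + math.exp(b - m))

def metropolis(n_steps, step_size=1.5, start=-2.5, seed=0):
    """Random-walk Metropolis: propose a nearby point, accept with probability min(1, ratio)."""
    rng = random.Random(seed)
    x = start
    samples = []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step_size)
        log_accept = log_density(proposal) - log_density(x)
        if log_accept >= 0 or rng.random() < math.exp(log_accept):
            x = proposal
        samples.append(x)
    return samples

samples = metropolis(50_000)

# Fraction of steps spent on the left-hand mode, and how often the chain changed sides.
near_a = sum(1 for s in samples if s < 0) / len(samples)
switches = sum(1 for prev, cur in zip(samples, samples[1:]) if (prev < 0) != (cur < 0))

print(f"fraction of time near left mode: {near_a:.2f}")
print(f"number of switches between modes: {switches}")
```

At any single step the sampler "believes" one explanation, yet the collection of samples approximates the whole two-peaked distribution. Moving the modes further apart in this sketch makes the switches rarer, which is the sampler's version of getting stuck on one reading of the rabbit-duck.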

I don’t want to ar­gue this too much be­cause it’s largely a prefer­ence of style and cul­ture. I think the dis­cus­sions are very repet­i­tive and it’s an illu­sion that there is much to be learned by spend­ing so much time think­ing meta.

Any­way, I evap­o­rate from the site for now.

• I feel that I need to main­tain ex­treme men­tal efforts to stay “sane” here. Maybe I should re­frain from com­ment­ing. It’s a pity be­cause I’m gen­er­ally very in­ter­ested in the top­ics dis­cussed here, but the tone and the un­der­ly­ing ide­ol­ogy is push­ing me away.

I would be very in­ter­ested in hear­ing elab­o­ra­tion on this topic, ei­ther pub­li­cly or pri­vately.

• I pre­fer pub­lic dis­cus­sions. First, I’m a com­puter sci­ence stu­dent who took courses in ma­chine learn­ing, AI, wrote the­ses in these ar­eas (noth­ing ex­cep­tional), I en­joy books like Think­ing Fast and Slow, Black Swan, Pinker, Dawk­ins, Den­nett, Ra­machan­dran etc. So the top­ics dis­cussed here are also in­ter­est­ing to me. But the at­mo­sphere seems quite closed and turn­ing in­wards.

I feel similar­i­ties to red­dit’s Red Pill com­mu­nity. Pre­vi­ously “ig­no­rant” peo­ple feel the com­mu­nity has opened a new world to them, they lived in dark­ness be­fore, but now they found the “Way” (“Bayescraft”) and all this stuff is be­com­ing an iden­tity for them.

Sorry if it’s offen­sive, but I feel as if many peo­ple had no suc­cess in the “real world” mat­ters and in­vented a fic­tion where they are the heroes by hav­ing joined some great or­ga­ni­za­tion much higher above the gen­eral pub­lic, who are just ir­ra­tional au­tomata still liv­ing in the dark.

I dislike the heavy use of insider terminology that makes communication with “outsiders” about these ideas quite hard because you get used to referring to these things by the in-group terms, so you get kind of isolated from your real-life friends as you feel “they won’t understand, they’d have to read so much”. When actually many of the concepts are not all that new and could be phrased in a way that the “uninitiated” can also get it.

There are too many cross refer­ences in posts and it keeps you busy with the site longer than nec­es­sary. It seems that peo­ple try to prove they know some con­cept by us­ing the jar­gon and in­clud­ing links to them. In­stead, I’d pre­fer au­thors who ac­tively try to min­i­mize the need for links and jar­gon.

I also find the posts quite re­dun­dant. They seem to be re­it­er­a­tions of the same pat­terns in very long prose with peo­ple’s sto­ries in­ter­twined with the ideas, in­stead of striv­ing for clar­ity and con­cise­ness. Much of it feels a lot like self-help for peo­ple with de­railed lives who try to en­g­ineer their life (back) to suc­cess. I may be wrong but I get a de­pressed vibe from read­ing the site too long. It may also be be­cause there is no light­hearted hu­mor or in-jokes or “fun” or self-irony at all. Maybe be­cause the mem­bers are just like that in gen­eral (per­haps due to men­tal differ­ences, like be­ing on the autism spec­trum, I’m not a psy­chi­a­trist).

I can see that peo­ple here are re­ally smart and the com­ments are of­ten very rea­son­able. And it makes me won­der why they’d re­gard a sin­gle per­son such as Yud­kowsky in such high es­teem as com­pared to es­tab­lished book au­thors or aca­demics or in­dus­try peo­ple in these ar­eas. I know there has been much dis­cus­sion about cultish­ness, and I think it goes a lot deeper than sur­face is­sues. LessWrong seems to be quite iso­lated and dis­trust­ing to­wards the main­stream. Many peo­ple seem to have read stuff first from Yud­kowsky, who of­ten does not refer­ence ear­lier works that ba­si­cally state the same stuff, so peo­ple get the im­pres­sion that all or most of the ideas in “The Se­quences” come from him. I was quite dis­ap­pointed sev­eral times when I found the same ideas in main­stream books. The Se­quences of­ten de­pict the whole out­side world as dumber than it is (straw man tac­tics, etc).

Another thing is that dis­cus­sion is of­ten too meta (or meta-meta). There is dis­cus­sion on Bayes the­o­rem and math prin­ci­ples but no ac­tual de­tailed, worked out stuff. Very lit­tle ac­tual pro­gram­ming for ex­am­ple. I’d ex­pect peo­ple to cre­ate github pro­jects, IPython note­books to show some ex­am­ples of what they are talk­ing about. Much of the meta-meta-dis­cus­sion is very opinion-based be­cause there is no im­me­di­ate feed­back about whether some­one is wrong or right. It’s hard to test such hy­pothe­ses. For ex­am­ple, in this post I would have ex­pected an ex­am­ple dataset and show­ing how PCA can un­cover some­thing sur­pris­ing. Other­wise it’s just float­ing out there al­though it matches nicely with the pat­tern that “some math con­cept gave me in­sight that re­fined my ra­tio­nal­ity”. I’m not sure, maybe these “ra­tio­nal­ity im­prove­ments” are some­times illu­sions.
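As a rough illustration of the kind of worked example meant here, the following is a minimal sketch of PCA on a synthetic dataset. The data is invented purely for the example and is not from the post: four noisy "test scores" per person, each partly driven by a single hidden factor, with made-up loadings and noise level. The function names `covariance_matrix` and `top_component` are hypothetical, and power iteration stands in for a library PCA routine so the sketch needs only the standard library.

```python
import random

def covariance_matrix(rows):
    """Sample covariance matrix of a list of equal-length observations."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def top_component(cov, n_iter=200):
    """First principal direction and its variance, found by power iteration."""
    d = len(cov)
    v = [1.0] * d
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
    return v, eigenvalue

# Synthetic data: one hidden factor per person plus independent noise per score.
rng = random.Random(0)
loadings = [0.9, 0.8, 0.7, 0.6]  # assumed strengths of the hidden factor
rows = []
for _ in range(500):
    hidden = rng.gauss(0.0, 1.0)
    rows.append([l * hidden + rng.gauss(0.0, 0.5) for l in loadings])

cov = covariance_matrix(rows)
direction, variance = top_component(cov)
explained = variance / sum(cov[i][i] for i in range(len(cov)))

print("first principal direction:", [round(x, 2) for x in direction])
print(f"share of total variance explained: {explained:.2f}")
```

Looking at any one score, the hidden factor is invisible; the first principal component recovers a single direction, weighted roughly in proportion to the assumed loadings, that accounts for most of the variation across all four. Whether real data behaves this cleanly is exactly the sort of thing a shared notebook would let readers check.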

I also don’t get why the ra­tio­nal­ity stuff is in­ter­mixed with friendly AI and cry­on­ics and tran­shu­man­ism. I just don’t see why these be­long that much to­gether. I find them too spec­u­la­tive and de­tached from the “real world” to be the cen­tral ideas. I re­al­ize they are im­por­tant, but their prevalence could also be ex­plained as “es­capism” and it pro­motes the dis­cus­sion of untestable meta things that I men­tioned above, never hav­ing to face re­al­ity. There is much talk about what ev­i­dence is but not much talk that ac­tu­ally pre­sents ev­i­dence.

I needed to de­velop a sort of im­mu­nity against top­ics like acausal trade that I can’t fully spec­ify how they are wrong, but they feel wrong and are hard to trans­late to prac­ti­cal testable state­ments, and it just messes with my head in the wrong way.

And of course there is also that se­crecy around and hid­ing of “cer­tain things”.

That’s it. This place may just not be for me, which is fine. Peo­ple can have their com­mu­ni­ties in the way they want. You just asked for elab­o­ra­tion.

• Thanks for the de­tailed re­sponse! I’ll re­spond to a hand­ful of points:

Pre­vi­ously “ig­no­rant” peo­ple feel the com­mu­nity has opened a new world to them, they lived in dark­ness be­fore, but now they found the “Way” (“Bayescraft”) and all this stuff is be­com­ing an iden­tity for them.

I cer­tainly agree that there are peo­ple here who match that de­scrip­tion, but it’s also worth point­ing out that there are ac­tual ex­perts too.

the gen­eral pub­lic, who are just ir­ra­tional au­tomata still liv­ing in the dark.

One of the things I find most charm­ing about LW, com­pared to places like Ra­tion­alWiki, is how much em­pha­sis there is on self-im­prove­ment and your mis­takes, not mis­takes made by other peo­ple be­cause they’re dumb.

It seems that peo­ple try to prove they know some con­cept by us­ing the jar­gon and in­clud­ing links to them. In­stead, I’d pre­fer au­thors who ac­tively try to min­i­mize the need for links and jar­gon.

I’m not sure this is avoid­able, and in full irony I’ll link to the wiki page that ex­plains why.

In gen­eral, there are lots of con­cepts that seem use­ful, but the only way we have to re­fer to con­cepts is ei­ther to re­fer to a la­bel or to ex­plain the con­cept. A num­ber of peo­ple read through the se­quences and say “but the con­clu­sions are just com­mon sense!”, to which the re­sponse is, “yes, but how easy is it to com­mu­ni­cate com­mon sense?” It’s one thing to be able to rec­og­nize that there’s some vague prob­lem, and an­other thing to be able to say “the prob­lem here is in­fer­en­tial dis­tance; knowl­edge takes many steps to ex­plain, and at­tempts to ex­plain it in fewer steps sim­ply won’t work, and the jus­tifi­ca­tion for this po­ten­tially sur­pris­ing claim is in Ap­pendix A.” It is one thing to be able to rec­og­nize a con­cept as worth­while; it is an­other thing to be able to recre­ate that con­cept when a need arises.

Now, I agree with you that hav­ing differ­ent la­bels to re­fer to the same con­cept, or con­cep­tual bound­aries or defi­ni­tions that are drawn slightly differ­ently, is a gi­ant pain. When pos­si­ble, I try to bring the wider com­mu­nity’s ter­minol­ogy to LW, but this re­quires be­ing in both com­mu­ni­ties, which limits how much any in­di­vi­d­ual per­son can do.

I also don’t get why the ra­tio­nal­ity stuff is in­ter­mixed with friendly AI and cry­on­ics and tran­shu­man­ism.

Part of that is just seed­ing effects—if you start a ra­tio­nal­ity site with a bunch of peo­ple in­ter­ested in tran­shu­man­ism, the site will re­main dis­pro­por­tionately linked to tran­shu­man­ism be­cause peo­ple who aren’t tran­shu­man­ists will be more likely to leave and peo­ple who are tran­shu­man­ists will be more likely to find and join the site.

Part of it is that those are the cluster of ideas that seem weird but ‘hold up’ un­der in­ves­ti­ga­tion—most of the rea­sons to be­lieve that the econ­omy of fifty years from now will look like the econ­omy of to­day are just con­fused, and if a com­mu­nity has good tools for dis­solv­ing con­fu­sions you should ex­pect them to con­verge on the un-con­fused an­swer.

A fi­nal part seems to be availa­bil­ity; peo­ple who are con­vinced by the case for cry­on­ics tend to be louder than the peo­ple who are un­con­vinced. The an­nual sur­veys show the per­cep­tion of LW one gets from just read­ing posts (or posts and com­ments) is skewed from the per­cep­tion of LW one gets from the sur­vey re­sults.

• One of the things I find most charm­ing about LW, com­pared to places like Ra­tion­alWiki, is how much em­pha­sis there is on self-im­prove­ment and your mis­takes, not mis­takes made by other peo­ple be­cause they’re dumb.

I agree that LW is much bet­ter than Ra­tion­alWiki, but I still think that the norms for dis­cus­sion are much too far in the di­rec­tion of fo­cus on how other com­menters are wrong as op­posed to how one might one­self be wrong.

I know that there’s a se­lec­tion effect (with re­spect to the more frus­trat­ing in­ter­ac­tions stand­ing out). But peo­ple not in­fre­quently mis­tak­enly be­lieve that I’m wrong about things that I know much more about than they do, with very high con­fi­dence, and in such in­stances I find the con­no­ta­tions that I’m un­sound to be ex­as­per­at­ing.

I don’t think that this is just a problem for me rather than a problem for the community in general: I know a number of very high quality thinkers in real life who are uninterested in participating on LW explicitly because they don’t want to engage with commenters who are highly confident that these thinkers’ positions are incorrect. There’s another selection effect here: such people aren’t salient because they’re invisible to the online community.

• I know that there’s a se­lec­tion effect (with re­spect to the more frus­trat­ing in­ter­ac­tions stand­ing out).

I agree that those frustrating interactions both happen and are frustrating, and that they lead to a general acidification of the discussion as people who don’t want to deal with them leave. Reversing that process in a sustainable way is probably the most valuable way to improve LW in the medium term.

• There’s also the whole Lesswrong-is-dying thing that might be contributing to the vibe you’re getting. I’ve been reading the forum for years and it hasn’t felt very healthy for a while now. A lot of the impressive people from earlier have moved on, we don’t seem to be getting that many new impressive people coming in, and hanging out a lot on the forum turns out not to make you that much more impressive. What’s left is turning increasingly into a weird sort of cargo cult of a forum for impressive people.

• Actually, I think that LessWrong used to be worse when the “impressive people” were posting about cryonics, FAI, the many-worlds interpretation of quantum mechanics, and so on.

• It has seemed to me that a lot of the com­menters who come with their own solid com­pe­tency are also less likely to get un­ques­tion­ingly swept away fol­low­ing EY’s par­tic­u­lar hob­by­horses.

• I needed to de­velop a sort of im­mu­nity against top­ics like acausal trade that I can’t fully spec­ify how they are wrong, but they feel wrong and are hard to trans­late to prac­ti­cal testable state­ments, and it just messes with my head in the wrong way.

The ap­pli­ca­ble word is meta­physics. Acausal trade is dab­bling in meta­physics to “solve” a ques­tion in de­ci­sion the­ory, which is it­self mere philoso­phiz­ing, and thus one has to won­der: what does Na­ture care for philoso­phies?

By the way, for the rest of your post I was go­ing, “OH MY GOD I KNOW YOUR FEELS, MAN!” So it’s not as though no­body ever thinks these things. Those of us who do just tend to, in perfect evap­o­ra­tive cool­ing fash­ion, go get on with our lives out­side this web­site, be­ing rel­a­tively or­di­nary sci­ence nerds.

• The ap­pli­ca­ble word is meta­physics.

Sorry, avoiding metaphysics doesn’t work. You just end up either reinventing them (badly) or using a bad 5th-hand version of some old philosopher’s metaphysics. Incidentally, Eliezer also tried avoiding metaphysics and wound up doing the former.

• I don’t like Eliezer’s apparent mathematical/computational Platonism myself, but most working scientists manage to avoid metaphysical buggery by simply dealing with only those things with which they can actually causally interact. I recall an Eliezer post on “Explain/Worship/Ignore”, and would add myself that while “Explain” eventually bottoms out in the limits of our current knowledge, the correct response is to hit “Ignore” at that stage, not to drop to one’s knees in Worship of a Sacred Mystery that is in fact just a limit to current evidence.

EDIT: This is also one of the rea­sons I en­joy be­ing in this com­mu­nity: even when I dis­agree with some­one’s view (eg: Eliezer’s), peo­ple here (in­clud­ing him) are of­ten more pro­duc­tive and fun to talk to than some­one who hits the limits of their sci­en­tific knowl­edge and just throws their hands up to the tune of “METAPHYSICS, SON!”, and then joins the bloody Catholic Church, as if that solved any­thing.

• I don’t like Eliezer’s apparent mathematical/computational Platonism myself, but most working scientists manage to avoid metaphysical buggery by simply dealing with only those things with which they can actually causally interact.

That works up un­til the point where you ac­tu­ally have to think about what it means to “causally in­ter­act” with some­thing. Also ques­tions like “does some­thing that falls into a black hole cease to ex­ist since it’s no longer pos­si­ble to in­ter­act with it”?

• Also ques­tions like “does some­thing that falls into a black hole cease to ex­ist since it’s no longer pos­si­ble to in­ter­act with it”?

But there are trivially easy answers to questions like that. Basically you have to ask “Cease to exist for whom?” i.e. it obviously ceases to exist for you. You just have to taboo words like “really” here, as in “does it really cease to exist”, because they are meaningless: they don’t lead to predictions. What people often consider “really” reality is the perception of a perfect god-like omniscient observer, but there is no such thing.

Essentially there are just two extremes to avoid, the po-mo “nothing is real, everything is mere perception” and the traditional, classical “but how things really really REALLY are?” and the middle way here is “reality is the sum of what could be perceived in principle”. A perception is right or wrong based on how much it meshes with all the other things that can in principle be perceived. Everything that cannot even be perceived in theory is not part of reality. There is no how things “really” are; the closest we have to that is the sum of all potential, possible perceivables about a thing.

I picked up this ap­proach from Eric S. Ray­mond, I think he worked it out decades be­fore Eliezer did, pos­si­bly both work­ing from Peirce.

This is ba­si­cally anti-meta­physics.

• Every­thing that can­not even be per­ceived in the­ory is not part of re­al­ity.

Does this im­ply that only things that ex­ist in my past light cone are real for me at any given mo­ment?

• I don’t know what real-for-me means here. Every­thing that in prin­ci­ple, in the­ory, could be ob­served, is real. Most of those you didn’t. This does not make them any less real.

I meant the “for whom?” not in the sense of me, you, or the bar­keeper down the street. I meant it in the sense of nor­mal be­ings who know only things that are in prin­ci­ple know­able, vs. some godlike be­ing who can know how things re­ally “are” re­gard­less of whether they are know­able or not.

• Every­thing that in prin­ci­ple, in the­ory, could be ob­served, is real.

Well, that’s where it starts to break down; be­cause what you can, in the­ory, ob­serve is differ­ent from what I can, in the­ory, ob­serve.

This is because, as far as anyone can tell, observations are limited by the speed of light. I cannot, even in principle, observe the 2015 Alpha Centauri until at least 2019 (if I observe it now, I am seeing light that left it around 2011). If Alpha Centauri had suddenly exploded in 2013, I have no way of observing that until at least 2017, even in principle.

So if the bar­keeper, in­stead of be­ing down the street, is rather liv­ing on a planet or­bit­ing Alpha Cen­tauri, then the set of what he can ob­serve in prin­ci­ple is not the same as the set of what I can ob­serve in prin­ci­ple.
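As a rough check on this kind of light-travel arithmetic, here is a minimal sketch. It assumes Alpha Centauri is about 4.37 light-years away (a figure not stated in the thread) and ignores everything except light travel time; the function names are purely illustrative.

```python
ALPHA_CENTAURI_LY = 4.37  # assumed distance in light-years, not from the thread

def earliest_observation_year(event_year, distance_ly=ALPHA_CENTAURI_LY):
    """Earliest time at which light from an event can reach a distant observer."""
    return event_year + distance_ly

def emission_year(observation_year, distance_ly=ALPHA_CENTAURI_LY):
    """Time at which the light seen at observation_year left its source."""
    return observation_year - distance_ly

print(earliest_observation_year(2015))  # when the star's 2015 state becomes visible here
print(emission_year(2015))              # how old the light seen in 2015 is
```

The same two functions, evaluated from the barkeeper's position instead of ours, give a different set of "observable in principle" events, which is the point being made.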

• Every­thing that in prin­ci­ple, in the­ory, could be ob­served, is real. Most of those you didn’t. This does not make them any less real.

I’d like to con­grat­u­late you on de­vel­op­ing your own “makes you sound in­sane to the man in the street” the­ory of meta­physics.

• Man on the street needs to learn what coun­ter­fac­tual definite­ness is.

• Ilya, can you give me a defi­ni­tion of “coun­ter­fac­tual definite­ness” please?

• Physicists are not very precise about it; may I suggest looking into “potential outcomes” (the language some statisticians use to talk about counterfactuals):

https://en.wikipedia.org/wiki/Rubin_causal_model

https://en.wikipedia.org/wiki/Counterfactual_definiteness

Po­ten­tial out­comes let you think about a model that con­tains a ran­dom vari­able for what hap­pens to Fred if we give Fred as­pirin, and a ran­dom vari­able for what hap­pens to Fred if we give Fred placebo. Even though in re­al­ity we only gave Fred as­pirin. This is “coun­ter­fac­tual definite­ness” in statis­tics.

This pa­per uses po­ten­tial out­comes to talk about out­comes of physics ex­per­i­ments (so there is an ex­act iso­mor­phism be­tween coun­ter­fac­tu­als in physics and po­ten­tial out­comes):

http://arxiv.org/pdf/1207.4913.pdf
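To make the Fred example concrete, here is a minimal simulation sketch. The relief probabilities, the population size, and the assumption that aspirin never makes anyone worse are all invented for illustration. The model stores both potential outcomes for every person, while the "experiment" only ever reveals one of them.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: every person carries BOTH potential outcomes,
# even though only one of them can ever be observed.
population = []
for _ in range(10_000):
    y_placebo = 1 if random.random() < 0.30 else 0                  # relief under placebo
    y_aspirin = 1 if (y_placebo or random.random() < 0.40) else 0   # relief under aspirin
    population.append((y_aspirin, y_placebo))

# Inside the model, the effect Y(aspirin) - Y(placebo) is defined for each person.
true_average_effect = sum(ya - yp for ya, yp in population) / len(population)

# In "reality" each person gets one treatment and reveals one outcome.
treated, control = [], []
for y_aspirin, y_placebo in population:
    if random.random() < 0.5:
        treated.append(y_aspirin)   # Y(placebo) stays counterfactual
    else:
        control.append(y_placebo)   # Y(aspirin) stays counterfactual

estimated_effect = sum(treated) / len(treated) - sum(control) / len(control)
print(f"true average effect:       {true_average_effect:.3f}")
print(f"estimate from randomizing: {estimated_effect:.3f}")
```

Randomized assignment is what lets the observed difference in means stand in for the average of the unobservable individual differences.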

• Sounds like this is per­haps re­lated to the coun­ter­fac­tual-con­sis­tency state­ment? In its sim­ple form, that the coun­ter­fac­tual or po­ten­tial out­come un­der policy “a” equals the fac­tual ob­served out­come when you in fact un­der­take policy “a”, or for­mally, Y^a = Y when A = a.

Pearl has a nice (easy) discussion in the journal Epidemiology (http://www.ncbi.nlm.nih.gov/pubmed/20864888).

Is this what you are get­ting at, or am I miss­ing the point?

• No, not quite. Coun­ter­fac­tual con­sis­tency is what al­lows you to link ob­served and hy­po­thet­i­cal data (so it is also ex­tremely im­por­tant). Coun­ter­fac­tual definite­ness is even more ba­sic than that. It ba­si­cally sets the size of your on­tol­ogy by al­low­ing you to talk about Y(a) and Y(a’) to­gether, even if we only ob­serve Y un­der one value of A.

edit: Stephen, I think I re­al­ized who you are, please ac­cept my apolo­gies if I seemed to be talk­ing down to you, re: po­ten­tial out­comes, that was not my in­ten­tion. My prior is peo­ple do not know what po­ten­tial out­comes are.

edit 2: Good talks by Richard Gill and Jamie Robins at JSM on this:

http://www.amstat.org/meetings/jsm/2015/onlineprogram/ActivityDetails.cfm?SessionID=211222

• Well, this whole thread started because minusdash and eli_sennesh objected to the concept of acausal trade for being too metaphysical.

• I just need to trans­late that for him to street lingo.

“There is shit we know, shit we could know, and shit we could not know no matter how good tech we had, we could not even know the effects it has on other stuff. So why should we say this latter stuff exists? Or why should we say this does not exist? We cannot prove either.”

• My serious point is that one cannot avoid metaphysics, and that way too many people start out from “all this metaphysics stuff is BS, I’ll just use common sense” and end up with their own (bad) counter-intuitive metaphysical theory that they insist is “not metaphysics”.

• You could char­i­ta­bly un­der­stand ev­ery­thing that such peo­ple (who as­sert that meta­physics is BS) say with a silent “up to em­piri­cal equiv­alence”. Doesn’t the prob­lem dis­ap­pear then?

• No, because you need a theory of metaphysics to explain what “empirical equivalence” means.

• It’s insufficiently appreciated that physicalism is metaphysics too.

• How about you just jump right to the de­tails of your method, and then back­track to help other peo­ple un­der­stand the nec­es­sary con­text to ap­pre­ci­ate the method? Other­wise, you will lose your au­di­ence.

• See my edit. Part of where I’m coming from is realizing how socially undeveloped people in our reference class tend to be, such that apparent malice often comes from misunderstandings.

• (be­fore peo­ple’s so­cial be­hav­ior had seemed like a com­pli­cated blur be­cause I saw so many vari­ables with­out hav­ing started to cor­rectly iden­tify the la­tent ones).

In­ter­est­ing—what are some ex­am­ples of the la­tent ones?

• I think having the concept of PCAs prevents some mistakes in reasoning on an intuitive, day-to-day level. It nudges me towards fox thinking instead of hedgehog thinking. Normal folk intuition grasps at the most cognitively available and obvious variable to explain causes, and then our System 1 acts as if that variable explains most if not all the variance. Looking at PCAs many times (and being surprised by them) makes me less likely to jump to conclusions about the causal structure of clusters of related events. So maybe I could characterize it as giving a System 1 intuition for not making the post hoc ergo propter hoc fallacy.

Maybe part of the problem Jonah is running into in explaining it is that having done many, many example problems with System 2 loaded it into his System 1, and the System 1 knowledge is what he really wants to communicate?

• What do you mean by get­ting sur­prised by PCAs? Say you have some data, you com­pute the prin­ci­pal com­po­nents (eigen­vec­tors of the co­var­i­ance ma­trix) and the cor­re­spond­ing eigen­val­ues. Were you sur­prised that a few prin­ci­pal com­po­nents were enough to ex­plain a large per­centage of the var­i­ance of the data? Or were you sur­prised about what those vec­tors were?

I think this is not re­ally PCA or even di­men­sion­al­ity re­duc­tion spe­cific. It’s sim­ply the idea of la­tent vari­ables. You could gain the same in­tu­ition from study­ing prob­a­bil­is­tic graph­i­cal mod­els, for ex­am­ple gen­er­a­tive mod­els.
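A minimal sketch of the "few components explain most of the variance" situation, using only the standard library. Everything in the setup is an assumption for illustration: one hidden factor `z`, four observed variables with made-up `weights`, and a noise level of 0.3. The top eigenpair of the covariance matrix is found with plain power iteration instead of a library eigensolver.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical setup: one hidden factor z drives four observed variables.
weights = [1.0, 0.8, -0.6, 0.5]
rows = []
for _ in range(2000):
    z = random.gauss(0.0, 1.0)
    rows.append([w * z + random.gauss(0.0, 0.3) for w in weights])

def covariance_matrix(data):
    """Sample covariance matrix of a list of equal-length rows."""
    n, d = len(data), len(data[0])
    means = [sum(r[j] for r in data) / n for j in range(d)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in data) / (n - 1)
             for j in range(d)] for i in range(d)]

def top_eigenpair(matrix, iterations=200):
    """Power iteration: repeatedly apply the matrix and renormalise."""
    d = len(matrix)
    v = [1.0 / d ** 0.5] * d
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(matrix[i][j] * v[j] for j in range(d))
                     for i in range(d))
    return eigenvalue, v

cov = covariance_matrix(rows)
eigenvalue, direction = top_eigenpair(cov)
total_variance = sum(cov[i][i] for i in range(len(cov)))  # trace = sum of all eigenvalues
explained_share = eigenvalue / total_variance

print(f"first component explains {explained_share:.0%} of the variance")
print("its direction:", [round(x, 2) for x in direction])
```

Both kinds of "surprise" asked about here show up in the output: the share of variance tells you how many components matter, and the direction (up to sign) tells you which combination of observed variables the hidden factor corresponds to.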

• Sur­prised by ei­ther. Just find­ing a struc­ture of causal­ity that was very un­ex­pected. I agree the in­tu­ition could be built from other sources.

• PCA doesn’t tell much about causal­ity though. It just gives you a “nat­u­ral” co­or­di­nate sys­tem where the vari­ables are not lin­early cor­re­lated.
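The "coordinate system where the variables are not linearly correlated" can be seen directly in two dimensions, where the principal axes have a closed form. This is a sketch on made-up data; the generating coefficients are arbitrary, and nothing in the rotation says which variable causes which.

```python
import math
import random

random.seed(1)  # fixed seed so the sketch is reproducible

# Two hypothetical measurements that share a common driver, so they are correlated.
xs, ys = [], []
for _ in range(1000):
    z = random.gauss(0.0, 1.0)
    xs.append(z + random.gauss(0.0, 0.5))
    ys.append(0.7 * z + random.gauss(0.0, 0.5))

def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (len(a) - 1)

sxx, syy, sxy = covariance(xs, xs), covariance(ys, ys), covariance(xs, ys)

# Angle of the first principal axis of a 2x2 covariance matrix.
theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
c, s = math.cos(theta), math.sin(theta)

# Rotate the data into the principal-axis coordinates.
pc1 = [c * x + s * y for x, y in zip(xs, ys)]
pc2 = [-s * x + c * y for x, y in zip(xs, ys)]

print(f"covariance of original variables:   {sxy:.3f}")
print(f"covariance of principal components: {covariance(pc1, pc2):.3e}")
```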

• Right, one needs to use ad­di­tional in­for­ma­tion to de­ter­mine causal­ity.

• Yes, you seem to have a very clear un­der­stand­ing of where I’m com­ing from. Thanks.

• As I see it this move­ment seems to try to build up a new back­bone of knowl­edge from scratch. But do­ing this they re­peat the mis­takes of the past philoso­phers.

Don’t say the p-word, please ;-).

I do agree that more real-life un­der­stand­ing is gained from just ob­tain­ing a broad sci­en­tific ed­u­ca­tion than from go­ing wow-hunt­ing. But of course, I would say that, since I’m a fa­nat­i­cal text­book pur­chaser.

• I don’t believe you can obtain an understanding of the idea that “correlation does not imply causation” from even a very deep appreciation of the material in Statistics 101. These courses usually make no attempt to define confounding, comparability etc. If they try to define confounding, they tend to use incoherent criteria based on changes in the estimate. Any understanding is almost certainly going to have to originate from outside of Statistics 101; unless you take a course on causal inference based on directed acyclic graphs, it will be very challenging to get beyond memorizing the teacher’s password.

• Agree com­pletely, and I’ll also point out that at least for me, a very shal­low un­der­stand­ing of the ideas in Causal­ity did much more to help me un­der­stand cor­re­la­tion vs. cau­sa­tion, con­found­ing etc. than any amount of work with Statis­tics 101. And this was enor­mously prac­ti­cal–I was able to make sig­nifi­cantly bet­ter fi­nan­cial de­ci­sions at Fun­da­tion due to un­der­stand­ing con­cepts like Simp­son’s Para­dox on a sys­tem 1 level.
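For readers who have not met Simpson’s Paradox, a small sketch may help. The counts below are illustrative only and have nothing to do with the financial decisions mentioned above; the labels `A`, `B`, `small` and `large` are placeholders for two options and two strata.

```python
# Illustrative counts only: (successes, trials) for two options within two strata.
outcomes = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

def success_rate(successes, trials):
    return successes / trials

def pooled_rate(strata):
    """Success rate after lumping all strata together."""
    total_successes = sum(s for s, _ in strata.values())
    total_trials = sum(n for _, n in strata.values())
    return total_successes / total_trials

for stratum in ("small", "large"):
    a = success_rate(*outcomes["A"][stratum])
    b = success_rate(*outcomes["B"][stratum])
    print(f"{stratum:>6}: A = {a:.1%}, B = {b:.1%}")

print(f"pooled: A = {pooled_rate(outcomes['A']):.1%}, "
      f"B = {pooled_rate(outcomes['B']):.1%}")
```

Option A wins inside each stratum yet loses in the pooled comparison, because A was tried mostly on the harder stratum. Which comparison to trust depends on the causal story behind the stratifying variable, not on the arithmetic.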

• To chime in as well: my own un­der­stand­ing of ‘cor­re­la­tion does not im­ply cau­sa­tion’ does not come from the ba­sic statis­tics courses and ar­ti­cles and tu­to­ri­als I read. While I knew the say­ing and the con­cepts and a lit­tle bit about causal graphs, it took years of failed self-ex­per­i­ments and the in­tensely frus­trat­ing ex­pe­rience of see­ing cor­re­late af­ter cor­re­late fail ran­dom­ized ex­per­i­ments be­fore I truly ac­cepted it.

I don’t know how helpful, ex­actly, this has been on a prac­ti­cal level, but at least it’s good for me on an epistemic level in that I have since ac­cepted many fewer new be­liefs than I would oth­er­wise have.

• Me four.

Although you know, there is no rea­son in prin­ci­ple you couldn’t get all that stuff An­ders_H is talk­ing about from in­tro stats, it’s just that stats isn’t taught as well as it can be.

• PCA and other di­men­sion­al­ity re­duc­tion tech­niques are great, but there’s an­other very use­ful tech­nique that most peo­ple (even statis­ti­ci­ans) are un­aware of: di­men­sional anal­y­sis, and in par­tic­u­lar, the Buck­ing­ham pi the­o­rem. For some rea­son, this tech­nique is used pri­mar­ily by en­g­ineers in fluid dy­nam­ics and heat trans­fer de­spite its broad ap­pli­ca­bil­ity. This is the tech­nique that al­lows scale mod­els like wind tun­nels to work, but it’s more use­ful than just al­low­ing for scal­ing. I find it very use­ful to re­duce the num­ber of vari­ables when de­vel­op­ing mod­els and con­duct­ing ex­per­i­ments.

Dimensional analysis recognizes a few basic axioms about models with dimensions and sees what they imply. You can use these to construct new variables from the old variables. The model is usually complete in a smaller number of these new variables. The technique does not tell you which variables are “correct”, just how many independent ones are needed. Identifying “correct” variables requires data, domain knowledge, or both. (And sometimes, there’s no clear “best” variable; multiple choices work equally well.)

Di­men­sional anal­y­sis does not help with cat­e­gor­i­cal vari­ables, or num­bers which are already di­men­sion­less (though by luck, some­times com­bi­na­tions of di­men­sion­less vari­ables are ac­tu­ally what’s “cor­rect”). This is the main re­stric­tion that ap­plies. And you can ex­pect at best a re­duc­tion in the num­ber of vari­ables of about 3. Di­men­sional anal­y­sis is most use­ful for phys­i­cal prob­lems with maybe 3 to 10 vari­ables.

The ba­sic idea is this: Di­men­sions are some sort of meta­data which can tell you some­thing about the struc­ture of the prob­lem. You can always rewrite a di­men­sional equa­tion, for ex­am­ple, to be di­men­sion­less on both sides. You should no­tice that some terms be­come con­stants when this is done, and that sim­plifies the equa­tion.

Here’s a physical example: Let’s say you want to measure the drag on a sphere (units: N). You know this depends on the air speed (units: m/s), kinematic viscosity (units: m^2/s), air density (units: kg/m^3), and the diameter of the sphere (units: m). So, you have 5 variables in total: the drag plus 4 independent variables. Let’s say you want to do a factorial design with 4 levels in each independent variable, with no replications. You’d have to do 4^4 = 256 experiments. This is clearly too complicated.

What fluid dy­nam­i­cists have rec­og­nized is that you can rewrite the re­la­tion­ship in terms of differ­ent vari­ables, and noth­ing is miss­ing. The Buck­ing­ham pi the­o­rem men­tioned pre­vi­ously says that we only need 2 di­men­sion­less vari­ables given our 5 di­men­sional vari­ables. So, in­stead of the drag force, you use the drag co­effi­cient, and in­stead of the speed, vis­cos­ity, etc., you use the Reynolds num­ber. Now, you only need to do 4 ex­per­i­ments to get the same level of rep­re­sen­ta­tion.
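The count of dimensionless groups in the sphere example can be checked mechanically. The sketch below writes each variable's units as exponents of mass, length and time, computes the rank of that matrix, and confirms that the drag coefficient and Reynolds number (taken here up to constant factors) are dimensionless. The helper names `rank` and `is_dimensionless` are illustrative, not from any library.

```python
from fractions import Fraction

# Columns: drag F, speed U, kinematic viscosity nu, density rho, diameter D.
variables = ["F", "U", "nu", "rho", "D"]
dimension_matrix = [
    [1, 0, 0, 1, 0],      # mass exponents   (N = kg m / s^2, rho = kg / m^3)
    [1, 1, 2, -3, 1],     # length exponents
    [-2, -1, -1, 0, 0],   # time exponents
]

def rank(matrix):
    """Matrix rank via exact Gauss-Jordan elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def is_dimensionless(exponents):
    """True if the product of variables raised to these exponents has no units."""
    return all(sum(d * e for d, e in zip(row, exponents)) == 0
               for row in dimension_matrix)

# Buckingham pi: number of groups = number of variables - rank of dimension matrix.
n_groups = len(variables) - rank(dimension_matrix)

drag_coefficient = [1, -2, 0, -1, -2]   # F / (rho U^2 D^2), constants omitted
reynolds_number = [0, 1, -1, 0, 1]      # U D / nu

print("dimensionless groups needed:", n_groups)
print("drag coefficient dimensionless:", is_dimensionless(drag_coefficient))
print("Reynolds number dimensionless:", is_dimensionless(reynolds_number))
print("experiments at 4 levels:", 4 ** 4, "before,", 4 ** (n_groups - 1), "after")
```

One of the two groups is the response (drag coefficient), which leaves a single independent variable (Reynolds number) to vary in the experiment.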

As it turns out, you can use tech­niques like PCA on top of di­men­sional anal­y­sis to de­ter­mine that cer­tain di­men­sion­less pa­ram­e­ters are unim­por­tant (there are other ways too). This fur­ther sim­plifies mod­els.

There’s a lot more on this topic than what I have cov­ered and men­tioned here. I would recom­mend read­ing the book Di­men­sional anal­y­sis and the the­ory of mod­els for more de­tails and the proof of the pi the­o­rem.

(Another ad­van­tage of di­men­sional anal­y­sis: If you dis­cover a use­ful di­men­sion­less vari­able, you can get it named af­ter your­self.)

• In gen­eral, if your prob­lem dis­plays any kind of sym­me­try* you can ex­ploit that to sim­plify things. I think most peo­ple are ca­pa­ble of do­ing this in­tu­itively when the sym­me­try is ob­vi­ous. The Buck­ing­ham pi the­o­rem is a great ex­am­ple of a sys­tem­atic way to find and ex­ploit a sym­me­try that isn’t so ob­vi­ous.

* By “sym­me­try” I re­ally mean “in­var­i­ance un­der a group of trans­for­ma­tions”.

• This is a great point. Other than fairly easy ge­o­met­ric and time sym­me­tries, do you have any ad­vice or know of any re­sources which might be helpful to­wards find­ing these sym­me­tries?

Here’s what I do know: Sometimes you can recognize these symmetries by analyzing a model differential equation. Here’s a book on the subject that I haven’t read, but might read in the future. My PhD advisor tells me I already know one reliable way to find these symmetries (e.g., like how to find the change of variables used here), so reading this would be a poor use of time in his view. This approach also requires knowing a fair bit more about a phenomenon than just which variables it depends on.

• The book you linked is the sort of thing I had in mind. The his­tor­i­cal mo­ti­va­tion for Lie groups was to de­velop a sys­tem­atic way to use sym­me­try to at­tack differ­en­tial equa­tions.

• This is a great point. Other than fairly easy ge­o­met­ric and time sym­me­tries, do you have any ad­vice or know of any re­sources which might be helpful to­wards find­ing these sym­me­tries?

Are you fa­mil­iar with Noether’s The­o­rem? It comes up in some ex­pla­na­tions of Buck­ing­ham pi, but the point is mostly “if you already know that some­thing is sym­met­ric, then some­thing is con­served.”

The most similar thing I can think of, in terms of “re­sources for find­ing sym­me­tries,” might be re­lated to find­ing Lya­punov sta­bil­ity func­tions. It seems there’s not too much in the way of au­to­mated func­tion-find­ing for ar­bi­trary sys­tems; I’ve seen at least one au­to­mated ap­proach for sys­tems with polyno­mial dy­nam­ics, though.

• Noether’s theorem has nothing to do with Buckingham’s theorem. Buckingham’s theorem is quite general (and vacuous), while Noether’s theorem is only about Hamiltonian/Lagrangian mechanics.

Added: Ac­tu­ally, Buck­ing­ham and Noether do have some­thing in com­mon: they both taught at Bryn Mawr.

• Noether’s the­o­rem has noth­ing to do with Buck­ing­ham’s the­o­rem.

Both of them are rele­vant to the pro­ject of ex­ploit­ing sym­me­try, and deal with solid­ify­ing a mostly un­der­stood situ­a­tion. (You can’t ap­ply Buck­ing­ham’s the­o­rem un­less you know all the rele­vant pieces.) The more prac­ti­cal piece that I had in mind is that some­one ea­ger to ap­ply Noether’s the­o­rem will need to look for sym­me­tries; they may have found tech­niques for hunt­ing for sym­me­tries that will be use­ful in gen­eral. It might be worth look­ing into ma­te­rial that teaches it, not be­cause it it­self is di­rectly use­ful, but be­cause the com­mu­nity that knows it may know other use­ful things.

• It’s quite a bit more general than Lagrangian mechanics. You can extend it to any functional that takes functions between two manifolds to complex numbers.

• In what sense do you mean Buck­ing­ham’s the­o­rem is vac­u­ous?

• Not fa­mil­iar with Noether’s the­o­rem. Seems use­ful for con­struct­ing mod­els, and per­haps de­ter­min­ing if some­thing else be­yond mass, mo­men­tum, and en­ergy is con­served. Is the con­verse true as well, i.e., does con­ser­va­tion im­ply that sym­me­tries ex­ist?

I’m also afraid I know nearly noth­ing about non-lin­ear sta­bil­ity, so I’m not sure what you’re refer­ring to, but it sounds in­ter­est­ing. I’ll have to read the Wikipe­dia page. I’d be in­ter­ested if you know any other good re­sources for learn­ing this.

• Is the con­verse true as well, i.e., does con­ser­va­tion im­ply that sym­me­tries ex­ist?

I think this is what Lie groups are all about, but that’s a bit deeper in group the­ory than I’m com­fortable speak­ing on.

I’d be in­ter­ested if you know any other good re­sources for learn­ing this.

I learned it the long way by tak­ing classes, and don’t re­call be­ing par­tic­u­larly im­pressed by any text­books. (I can lend you the ones I used.) I re­mem­ber think­ing that read­ing through Akella’s lec­ture notes was about as good as tak­ing the course, and so if you have the time to de­vote to it you might be able to get those from him by ask­ing nicely.

• Con­ser­va­tion gives a lo­cal sym­me­try but there may not be a global sym­me­try.

For in­stance, you can imag­ine a phys­i­cal sys­tem with no forces at all, so ev­ery­thing is con­served. But there are still some pa­ram­e­ters that define the lo­ca­tion of the par­ti­cles. Then the phys­i­cal sys­tem is lo­cally very sym­met­ric, but it may still have some sym­met­ric global struc­ture where the par­ti­cles are con­strained to lie on a sur­face of non­triv­ial topol­ogy.

• I’ve always been amazed at the power of di­men­sional anal­y­sis. To me the best ex­am­ple is the prob­lem of calcu­lat­ing the pe­riod of an os­cillat­ing mass on a spring. The rele­vant val­ues are the spring con­stant K (kg/​s^2) and the mass M (kg), and the pe­riod T is in (s). The only way to com­bine K and M to ob­tain a value with di­men­sions of (s) is sqrt(M/​K), and that’s the cor­rect form of the ac­tual an­swer—no calcu­lus re­quired!

• Ac­tu­ally, there’s an­other pa­ram­e­ter, the dis­place­ment. It turns out that the spring pe­riod does not de­pend on the dis­place­ment, but that’s a mir­a­cle that is spe­cial to springs. In­stead, look at the pen­du­lum. The same di­men­sional anal­y­sis gives the square root of the length di­vided by grav­i­ta­tional ac­cel­er­a­tion. That’s off by a di­men­sion­less con­stant, 2π. More­over, even that is only ap­prox­i­mately cor­rect. The real an­swer de­pends on the dis­place­ment in a com­pli­cated way.

• This is a good point. At best you can figure out that pe­riod is pro­por­tional to (not equal to) sqrt(M/​K) mul­ti­plied by some func­tion of other pa­ram­e­ters, say, one in­volv­ing dis­place­ment and an­other char­ac­ter­iz­ing the non-lin­ear­ity (if K is just the ini­tial slope, as I’ve seen done be­fore). It’s a for­tu­nate co­in­ci­dence if the other pa­ram­e­ters are unim­por­tant. You can not de­ter­mine based solely on di­men­sional anal­y­sis whether cer­tain pa­ram­e­ters are unim­por­tant.
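The spring and pendulum claims in this sub-thread can be checked numerically. The sketch below integrates each oscillator from rest and compares the measured period with the scale that dimensional analysis gives. The mass, spring constant, pendulum length and amplitudes are arbitrary illustrative values, and the integrator (velocity Verlet with a quarter-period trick that relies on the motion being symmetric) is just one convenient choice.

```python
import math

def period(acceleration, start, dt=1e-4):
    """Period of a symmetric oscillator released from rest at start > 0.

    Integrates x'' = acceleration(x) with velocity Verlet until x first
    crosses zero (a quarter period), then multiplies by four.
    """
    x, v, t = start, 0.0, 0.0
    a = acceleration(x)
    while True:
        x_next = x + v * dt + 0.5 * a * dt * dt
        a_next = acceleration(x_next)
        v_next = v + 0.5 * (a + a_next) * dt
        if x_next <= 0.0:
            fraction = x / (x - x_next)  # linear interpolation of the crossing
            return 4.0 * (t + fraction * dt)
        x, v, a, t = x_next, v_next, a_next, t + dt

M, K = 2.0, 8.0          # hypothetical mass (kg) and spring constant (kg/s^2)
LENGTH, G = 1.0, 9.81    # hypothetical pendulum length (m) and gravity (m/s^2)

def spring(x):
    return -(K / M) * x

def pendulum(theta):
    return -(G / LENGTH) * math.sin(theta)

spring_scale = math.sqrt(M / K)         # what dimensional analysis provides
pendulum_scale = math.sqrt(LENGTH / G)

# Ratio of the simulated period to the dimensional-analysis scale,
# for a small and a large initial displacement.
spring_ratios = [period(spring, x0) / spring_scale for x0 in (0.1, 5.0)]
pendulum_ratios = [period(pendulum, th0) / pendulum_scale for th0 in (0.01, 2.0)]

print("spring   T / sqrt(M/K):", spring_ratios)
print("pendulum T / sqrt(L/g):", pendulum_ratios)
print("2*pi                  :", 2 * math.pi)
```

For the spring the ratio stays at 2π regardless of displacement; for the pendulum it is close to 2π only at small angles and grows with amplitude. That is the part dimensional analysis cannot tell you on its own.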

• That’s be­cause out­side of physics (and pos­si­bly chem­istry) there are enough con­stants run­ning around that all quan­tities are effec­tively di­men­sion­less. I’m hav­ing a hard time see­ing a situ­a­tion in say biol­ogy where I could pro­pose di­men­sional anal­y­sis with a straight face, to say noth­ing of softer sci­ences.

• As I said, di­men­sional anal­y­sis does not help with cat­e­gor­i­cal vari­ables. And when the num­ber of di­men­sions is low and/​or the num­ber of vari­ables is large, di­men­sional anal­y­sis can be use­less. I think it’s a nec­es­sary com­po­nent of any model builder’s toolbox, but not a tool you will use for ev­ery prob­lem. Still, I would ar­gue that it’s un­der­uti­lized. When di­men­sional anal­y­sis is use­ful, it definitely should be used. (For ex­am­ple, de­spite its ob­vi­ous ap­pli­ca­tions in physics, I don’t think most physics un­der­grads learn the Buck­ing­ham pi the­o­rem. It’s usu­ally only taught to en­g­ineers learn­ing fluid dy­nam­ics and heat trans­fer.)

Two very com­mon di­men­sion­less pa­ram­e­ters are the ra­tio and frac­tion. Both cer­tainly ap­pear in biol­ogy. Also, the sub­ject of al­lom­e­try in biol­ogy is ba­si­cally sim­ple di­men­sional anal­y­sis.

I’ve seen di­men­sional anal­y­sis ap­plied in other soft sci­ences as well, e.g., poli­ti­cal sci­ence, psy­chol­ogy, and so­ciol­ogy are a few ex­am­ples I am aware of. I can’t com­ment much on the util­ity of its ap­pli­ca­tion in these cases, but it’s such a sim­ple tech­nique that I think it’s worth try­ing when­ever you have data with units.

Speak­ing more gen­er­ally, the idea of sim­plifi­ca­tion com­ing from ap­ply­ing trans­for­ma­tions to data has broad ap­pli­ca­bil­ity. Di­men­sional anal­y­sis is just one ex­am­ple of this.

• One thing that most scientists in these soft sciences already have a good grasp on, but a lot of laypeople do not, is the idea of appropriately normalizing parameters. For instance, dividing something by the mass of the body, or the population of a nation, to do comparisons between individuals/nations of different sizes.

Peo­ple will of­ten make bad com­par­i­sons where they don’t nor­mal­ize prop­erly. But hope­fully most peo­ple read­ing this ar­ti­cle are not at risk for that.

• What re­sources would you recom­mend for learn­ing ad­vanced statis­tics?

• What would you call “ad­vanced” statis­tics? But let’s start list­ing classes:

1) In­tro to Discrete and Con­tin­u­ous Prob­a­bil­ity—you’ll need this for ev­ery pos­si­ble path

Now we need to start branch­ing out. Choose your ad­ven­ture: ap­plied or the­o­ret­i­cal? Fre­quen­tist, Bayesian, Like­li­hood­ist, or “Ma­chine” Learn­ing?

Your nor­mal uni­ver­sity statis­tics se­quence will prob­a­bly give you In­tro to Fre­quen­tist Statis­tics 1 at this point. That’s a fine way to go, but it’s not the only way. In fact, many de­part­ments in the em­piri­cal sci­ences will teach Data Anal­y­sis classes, or the like, which in­tro­duce ap­plied statis­tics be­fore teach­ing you the the­ory, which would mean you’ve ac­tu­ally dealt with real data be­fore you learn the the­ory. I think that might be a Very Good Idea.

Now let’s hope you’ve taken one of the fol­low­ing paths:

• Data Anal­y­sis and In­tro to Fre­quen­tist Stats 1

• In­tro to Bayesian Statis­tics 1

• In­tro to Ma­chine Learn­ing (with lab­o­ra­tory ex­er­cises to get ex­pe­rience)

From there I would recom­mend know­ing lin­ear alge­bra de­cently well be­fore mov­ing on. Then you can start tak­ing courses/​read­ing text­books in more ad­vanced/​the­o­ret­i­cal ma­chine learn­ing, com­pu­ta­tional Bayesian meth­ods, mul­ti­di­men­sional fre­quen­tist statis­tics, causal anal­y­sis, or just more and more ap­plied data anal­y­sis. You should prob­a­bly check what sort of statis­ti­cal meth­ods are fa­vored “in the field” that you ac­tu­ally care about.

• 26 Jun 2015 23:07 UTC

Real world data of­ten has the sur­pris­ing prop­erty of “di­men­sion­al­ity re­duc­tion”: a small num­ber of la­tent vari­ables ex­plain a large frac­tion of the var­i­ance in data.

Why is that sur­pris­ing? The causal struc­ture of the world is very sparse, by the na­ture of causal­ity. One cause has sev­eral effects, so once you scale up to lots of causative vari­ables, you ex­pect to find that large por­tions of the var­i­ance in your data are ex­plained by only a few causal fac­tors.

Causal­ity is in­deed the skele­ton of data. And oh boy, wait un­til you hit hi­er­ar­chi­cal Bayes mod­els!

Only, the vari­ables that ex­plain a lot usu­ally aren’t the vari­ables that are im­me­di­ately visi­ble – in­stead they’re hid­den from us, and in or­der to model re­al­ity, we need to dis­cover them, which is the func­tion that PCA serves.

Not quite. PCA helps you re­duce di­men­sion­al­ity by dis­cov­er­ing the di­rec­tions of vari­a­tion in your fea­ture-space that ex­plain most of the vari­a­tion (in fact, a to­tal or­der­ing of the di­rec­tions of vari­a­tion in the data by how much vari­a­tion they ex­plain). Then there’s In­de­pen­dent Com­po­nents Anal­y­sis, which sep­a­rates your fea­ture data into its most in­de­pen­dent/​or­thog­o­nal di­rec­tions of vari­a­tion.

• The causal struc­ture of the world is very sparse, by the na­ture of causal­ity.

Can you ex­pand your rea­son­ing? We do see around us sparse — that is, un­der­stand­able — causal sys­tems. And even chaotic ones of­ten give rise to sim­ple prop­er­ties (e.g. mo­tion of huge num­bers of molecules → gas laws). But why (ig­nor­ing an­thro­pocen­tric ar­gu­ments) would one ex­pect to see this?

• 28 Jun 2015 18:52 UTC

There are re­ally just three ways the causal struc­ture of re­al­ity could go:

• Many causes → one effect

• One cause → one effect, strictly

• One cause → many effects

Since the lat­ter will gen­er­ate more (ap­par­ent) ran­dom vari­ables, most ob­serv­ables will end up de­riv­ing from a rel­a­tively sparse causal struc­ture, even if we as­sume that the causal struc­tures them­selves are sam­pled uniformly from this se­lec­tion of three.

So, for in­stance, pa­ram­e­ter-space com­pres­sion (which is its own topic to ex­plain, but oh well), aka: the hi­er­ar­chi­cal struc­ture of re­al­ity, ac­tu­ally does fol­low that first item: many micro-level causes give rise to a sin­gle macro-level ob­serv­able. But you’ll still find that most ob­serv­ables come from non-com­pres­sive causal struc­tures.

This is why we ac­tu­ally have to work re­ally hard to find out about micro-scale phe­nom­ena (things lower on the hi­er­ar­chy than us): they have fewer ob­serv­ables whose var­i­ance is uniquely ex­pli­ca­ble by refer­ence to a micro-scale causal struc­ture.

• I need that ex­panded a lot more. Why not many causes → many effects, for ex­am­ple?

• 28 Jun 2015 23:11 UTC
−1 points

Ah, you mean a densely in­ter­con­nected “al­most all to al­most all” causal struc­ture. Well, I’d have to guess: be­cause that would look far more like ran­dom be­hav­ior than causal or­der, so we wouldn’t even no­tice it as some­thing to causally an­a­lyze!

• We do notice turbulence as something that doesn’t look random, and is hard-to-impossible to causally analyze.

Here’s an anec­dote. I can’t copy and paste it, but it’s in the mid­dle column.

• This is a very in­ter­est­ing point. PCA (or as its time and/​or space se­ries ver­sion is called, the Karhunen-Loève ex­pan­sion and/​or POD) has not been found to be use­ful for tur­bu­lence mod­el­ing, as I re­call. There’s a brief sec­tion in Pope’s book on tur­bu­lence about mod­el­ing with this. From what I un­der­stand, POD is mostly used for vi­su­al­iza­tion pur­poses, not to help build mod­els. (It’s worth not­ing that while my back­ground in fluid dy­nam­ics is strong, I know lit­tle to noth­ing about PCA and the like aside from what they ap­par­ently do.)

Maybe I don’t actually understand causality, but in terms of modeling, we do have a good model (the Navier-Stokes, or N-S, equations), so in some sense it’s clear what causes what. In principle, if you run a computer simulation with these equations and the correct boundary conditions, the result will be reasonably accurate. This has been demonstrated through direct simulations of some relatively simple cases like flow through a channel. So that’s not the issue. The actual issue is that you need a lot of computing power to simulate even basic flows, and attempts to develop lower-order models have been fairly unsuccessful. So as a model, N-S is of limited utility as-is.

In my view, the “turbulence problem” comes down to two facts: 1. the N-S equations are chaotic (sensitive to initial conditions, so small changes can cause big effects) and 2. they exhibit large scale separation (the smallest details you need to resolve, the Kolmogorov scales in most cases, are much smaller than the physical dimensions of a problem, say the length of a wing). To understand these points better, imagine that rigid body dynamics were inaccurate (say, for modeling the trajectory of a baseball), and you had to model all the individual atoms to get it right. And if one atom was off, that might have a big effect. Obviously that’s a lot harder, and it’s probably computationally intractable outside of a few simple cases. (The chaos part is “avoided” because you would probably simulate an ensemble of initial conditions via Monte Carlo or something else, and get an “ensemble mean” which you would compare against an experiment. This works well from what I understand even if the details are unclear.)
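The two ideas in that last parenthetical, sensitivity to initial conditions and the relative stability of an ensemble mean, can be sketched with a toy system. The logistic map below is a stand-in chosen for brevity and has nothing to do with the N-S equations themselves. The step count, perturbation size, and ensemble size are arbitrary assumptions.

```python
import numpy as np


def logistic_trajectory(x0, steps, r=4.0):
    """Iterate the logistic map x -> r*x*(1-x), a standard chaotic toy system."""
    xs = np.empty((steps + 1,) + np.shape(x0))
    xs[0] = x0
    for t in range(steps):
        xs[t + 1] = r * xs[t] * (1.0 - xs[t])
    return xs


# Two single runs whose initial conditions differ by only 1e-10.
a = logistic_trajectory(0.2, 60)
b = logistic_trajectory(0.2 + 1e-10, 60)
max_gap = np.max(np.abs(a - b))

# Two independent Monte Carlo ensembles of random initial conditions.
rng = np.random.default_rng(0)
ens1 = logistic_trajectory(rng.uniform(0.1, 0.9, size=20000), 60)[-1].mean()
ens2 = logistic_trajectory(rng.uniform(0.1, 0.9, size=20000), 60)[-1].mean()

print(f"largest gap between the two single runs: {max_gap:.3f}")
print(f"ensemble means after 60 steps: {ens1:.3f} vs {ens2:.3f}")
```

The single runs end up far apart despite the tiny initial difference, while the two ensemble means should agree closely. This is the sense in which a statistic of the flow can be predictable even when an individual realization is not.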

So in some sense, yes, this looks like an “almost all to almost all” causal structure. Though, I looked up a bit about causal diagrams and it’s not even clear to me how you might draw one for turbulence, and not because of turbulence itself. It’s not clear to me what an “event” might be. There isn’t even a precise definition of “turbulence” to begin with, so maybe this should be expected. I suppose on some level such things are arbitrary and you could define an event to be fluid movement in some direction, for each direction, each point in space, and each time. I’m not sure if anyone has done this sort of analysis.

(For the incompressible N-S equations, you can easily say that everything causes everything because the pressure equation is elliptic: the speed of sound is effectively infinite, which means changes in one place are felt everywhere instantaneously. In other words, the “domain of dependence” is everywhere. But I don’t know if that means these effects are substantial. Obviously in reality, far away from something quiet that’s happening, you don’t notice it, even if the sound waves had time to reach you. In practice, this means that doing incompressible fluid dynamics requires the solution of an elliptic PDE, which can be a pain for reasons unrelated to turbulence.)
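For readers who want to see where the elliptic equation comes from, here is the standard derivation in brief. The notation is an assumption of this sketch (constant density $\rho$, kinematic viscosity $\nu$, velocity components $u_i$, pressure $p$) and is not taken from the comment above. The incompressible equations are

$$
\frac{\partial u_i}{\partial t} + u_j \frac{\partial u_i}{\partial x_j}
= -\frac{1}{\rho}\frac{\partial p}{\partial x_i} + \nu \nabla^2 u_i,
\qquad
\frac{\partial u_i}{\partial x_i} = 0 .
$$

Taking the divergence of the momentum equation and using $\partial u_i / \partial x_i = 0$ removes the time-derivative and viscous terms, leaving a Poisson equation for the pressure:

$$
\nabla^2 p = -\rho \, \frac{\partial u_j}{\partial x_i} \frac{\partial u_i}{\partial x_j} .
$$

Because this equation has no time derivative, the pressure at any point depends on the velocity field over the whole domain at that same instant, which is the "domain of dependence is everywhere" property.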

• I dis­agree that you can get an un­der­stand­ing of the idea that “cor­re­la­tion does not im­ply cau­sa­tion” from Stats 101. I don