# Historical mathematicians exhibit a birth order effect too

[Epistemic sta­tus: pi­lot study. I’m hop­ing that oth­ers will help to ver­ify or falsify my con­clu­sion here. I’ve never done an anal­y­sis of this sort be­fore, and would ap­pre­ci­ate cor­rec­tion of any er­rors.

A previous version of this post had some minor errors in the analysis, which have since been corrected. Most notably, the deviation from the expected rate of first borns was originally noted as 14.98 percentage points. It is actually 16.65 percentage points.]

A big thank you to Dan Keys for work­ing through the statis­tics with me.

Since the late 1800s, pop psychology has postulated that a person’s birth order (whether one is the first, last, middle, etc. of one’s siblings) has an impact on their lifetime personality traits. However, rigorous large-scale analyses have reliably found no significant effect on stable personality, with some evidence for a small effect on intelligence. (The Wikipedia page lists some relevant papers on birth order effects on personality (1, 2, 3) and on intelligence (1, 2, 3).)

So, we were all pretty sur­prised when, around 2012, sur­vey data sug­gested a very strong birth or­der effect amongst those in the broader ra­tio­nal­ity com­mu­nity.

The Less Wrong com­mu­nity is de­mo­graph­i­cally dom­i­nated by first-borns: a startlingly large per­centage of us have only younger siblings. On av­er­age, it looks like there’s about a twenty-two per­centage point differ­ence be­tween the ac­tual rate of first borns and the ex­pected rate, from the 2018 Slate Star Codex Sur­vey data Scott cites in the linked post above. (More speci­fi­cally, the ex­pected rate of first-borns is 39% and the ac­tual oc­cur­rence in the sur­vey data is 62%.) The 2012 Less Wrong sur­vey also found a 22 per­centage point differ­ence. This effect is highly sig­nifi­cant, in­clud­ing af­ter tak­ing into ac­count other de­mo­graphic fac­tors.

A few weeks ago, Scott Garrabrant (one of the researchers at MIRI) offhandedly wondered aloud if great mathematicians (who plausibly share some important features with LessWrongers) also exhibit this same trend towards being first born.

The short an­swer: Yes, they do, as near as I can tell, but not as strongly as LessWrongers.

My data and analysis are documented here.

## Methodology

Following Sarah Constantin’s fact post methodology, I started by taking a list of the 150 greatest mathematicians from here. This is perhaps not the most accurate or scientific ranking of historical math talent, but in practice, there’s enough broad agreement about who the big names are that quibbles over who should be included are mostly irrelevant to our purpose. If a person could plausibly be included on a list of the greatest 150 mathematicians in history, they were probably a pretty good mathematician.

I then went through the list, and tried to find out how many older and younger siblings each mathematician had. For the most part this amounted to googling “[mathematician’s name] siblings” and then trawling through the results to find one that gave me the information I wanted. Where possible, I noted not just the birth order and number of siblings, but also the sex of the siblings and whether they died during infancy. (For the ones for whom I couldn’t get data, I marked the row as “Couldn’t find” or “Unknown”.)

Most bi­o­graph­i­cal sources don’t list the num­ber of siblings of the fam­ily of ori­gin. The sources that I ended up rely­ing on the most were:

This was a very quick cur­sory search, so my data is prob­a­bly not su­per re­li­able. At least twice, I found two sources that dis­agreed, and I don’t know how much I would have en­coun­tered con­flict­ing in­for­ma­tion if I had dug deeper into each per­son’s bi­og­ra­phy, in­stead of mov­ing on to the next math­e­mat­i­cian as soon as I found a sen­tence that an­swered my query.

If you hap­pen to per­son­ally know bi­o­graph­i­cal de­tails of elite math­e­mat­i­ci­ans and you can cor­rect any er­rors in these data, I’d be pleased to make those cor­rec­tions.

## Results

The sim­plest anal­y­sis is to cat­e­go­rize the data by fam­ily size (all the math­e­mat­i­ci­ans that had no siblings, one sibling, 2 siblings, etc.), count how many first borns there were in each bucket, and com­pare that to the num­ber we would ex­pect by chance.

For nearly every bucket, the frequency of first born children exceeded random chance. Across all categories, the difference between the actual and expected frequencies was about 16.65 percentage points.
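As a minimal sketch of the bucket comparison described above: the records below are hypothetical toy values (not the rows of my spreadsheet), and the function name `firstborn_by_family_size` is made up for illustration. The only real ingredient is the chance baseline, where a child in a family of n children has a 1/n chance of being the first born.

```python
from collections import defaultdict

# Hypothetical toy records, NOT the post's data:
# (number of children in the family, whether the mathematician was first born).
records = [(1, True), (2, True), (2, False), (3, True), (3, True),
           (3, False), (4, True), (4, False), (5, False), (5, True)]

def firstborn_by_family_size(records):
    """Return {family_size: (count, observed_firstborns, expected_firstborns)}."""
    buckets = defaultdict(lambda: [0, 0])
    for n_children, is_firstborn in records:
        buckets[n_children][0] += 1
        buckets[n_children][1] += int(is_firstborn)
    # By chance, each member of a family of n children is first born with probability 1/n.
    return {n: (count, observed, count / n)
            for n, (count, observed) in sorted(buckets.items())}

summary = firstborn_by_family_size(records)
for n, (count, observed, expected) in summary.items():
    print(f"{n} children: {observed} of {count} first born, {expected:.2f} expected")

# Overall gap between actual and expected first-born rates, in percentage points.
total_observed = sum(observed for _, observed, _ in summary.values())
total_expected = sum(expected for _, _, expected in summary.values())
gap_points = 100 * (total_observed - total_expected) / len(records)
print(f"Overall gap: {gap_points:.1f} percentage points")
```

Note that only children are trivially first born under this coding, so they contribute nothing to the gap.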

After re­mov­ing the in­di­vi­d­u­als that I couldn’t find data for, we had a sam­ple size of 82. A paired t-test, com­par­ing the num­ber of first-borns with the ex­pected num­ber of first-borns (one data point for each of the 82 math­e­mat­i­ci­ans) was statis­ti­cally sig­nifi­cant, t(81)=3.14, p = 0.00239.

I can show you some bar graphs, like Scott uses in his post, but be­cause this data is of a much smaller sam­ple and the effect isn’t as large, they don’t look as neat. (Also, I don’t know how to in­clude those nice dot­ted lines mark­ing the ex­pected fre­quency.)

Nev­er­the­less, you can see a sys­tem­atic trend: be­ing the first of n siblings is over­rep­re­sented among the math­e­mat­i­ci­ans in the sam­ple I used.

The effect in these data (17 per­centage points) is smaller than the effect in ei­ther the Less Wrong or Slate Star Codex sur­veys (22 per­centage points). The 95% con­fi­dence in­ter­val for the math­e­mat­i­cian data is a range of 6 per­centage points to 27 per­centage points. Given this range, we can’t rule out that the differ­ence in effect sizes is due to noise, but it seems most plau­si­ble that there is a real differ­ence in the size of the un­der­ly­ing effect be­tween the pop­u­la­tions.
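As a rough sanity check, the confidence interval can be approximately reconstructed from the figures reported in this post alone (the 16.65 percentage point difference and t(81) = 3.14). The critical value of 1.99 is my assumption: it is the approximate two-sided 95% critical value of a t distribution with 81 degrees of freedom. This is a sketch of the arithmetic, not the calculation in the spreadsheet.

```python
# Figures reported in the post.
mean_diff = 0.1665   # actual minus expected first-born rate (16.65 percentage points)
t_stat = 3.14        # reported t statistic, 81 degrees of freedom

# Assumption: approximate two-sided 95% critical value for a t distribution with 81 df.
t_crit = 1.99

# t = mean / standard error, so the standard error can be backed out.
standard_error = mean_diff / t_stat
lower = mean_diff - t_crit * standard_error
upper = mean_diff + t_crit * standard_error

print(f"95% CI: {100 * lower:.1f} to {100 * upper:.1f} percentage points")

# The 22-point effect from the surveys lies inside this interval.
survey_effect = 0.22
print("Survey effect inside interval:", lower <= survey_effect <= upper)
```

The survey effect falling inside the interval is what "we can’t rule out that the difference in effect sizes is due to noise" amounts to numerically.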

## A dis­cus­sion of bias in this data

As I said, my data is not very reliable: it seems plausible that some of my sources were faulty, and I was going quickly, so I may have made some errors in data collection. Furthermore, I was only able to find data for 82 of the 150 mathematicians.

But in ex­pec­ta­tion, those er­rors will can­cel out, un­less there’s some sys­tem­atic bias in the sources I was us­ing. I can think of at least two causes of bias, but nei­ther one seems like it could be the cause of the ob­served trend.

### Higher reporting rate for first born children

First, maybe first borns are recorded more read­ily? If the first born child was the heir to a fam­ily’s prop­erty, then they may have been more likely to be men­tioned in le­gal and other doc­u­ments, so there may be much bet­ter his­tor­i­cal records of first-born chil­dren.

But our subjects are all famous mathematicians, independent of their inheritance status. So, if there was a historical reporting bias that favored the first born, this would actually push against our observed effect. First born members of our sample would either be listed as only children, or noted as having an unknown number of siblings. Younger-sibling mathematicians, on the other hand, would be noted as younger siblings, because their older brother was added to the historical record on the basis of his heirship.

### Underreporting of females

Another way in which the available record of sibling data may be bi­ased, which does not di­rectly af­fect the val­idity of this anal­y­sis, is that women might have gone un­recorded more of­ten than men. The size of this effect tells us some­thing about the ex­tent to which the available record of sibling data is bi­ased.

It was rel­a­tively easy to do a quick check for a re­port­ing bias in fa­vor of male siblings: I just summed all the broth­ers that I found, and all the sisters.

Altogether, I recorded 110.5 brothers and half brothers and 100.5 sisters and half sisters. (The point five comes from Jean-Baptiste Joseph Fourier’s entry. I found that he had 3 half siblings by his father’s first marriage, but I didn’t know of what sex. So I split the difference by saying he had 1.5 half brothers and 1.5 half sisters, in expectation. I was comfortable doing this because I mostly care about whether siblings are younger or older, and only secondarily about whether they are male or female.)

So there are slightly more males listed, at least in the sources I could find. But a difference of 10 out of 211 siblings with recorded sex isn’t very large. I’m sure there are some statistics I could do to show it, but I don’t think that slight bias is sufficient to account for our observed birth order effect.
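One such statistic, sketched here under stated assumptions, is a normal approximation to a binomial test of the brother/sister split. The 50/50 null (`p_null`) is an assumption (real sex ratios at birth lean slightly male), and the half-counts from Fourier’s entry mean the counts are not strictly binomial, so treat this as a rough check rather than a rigorous test.

```python
import math

# Counts reported in the post.
brothers, sisters = 110.5, 100.5
total = brothers + sisters

# Assumption: under the null, each recorded sibling is equally likely to be male or female.
p_null = 0.5

expected_brothers = total * p_null
std_dev = math.sqrt(total * p_null * (1 - p_null))

# z-score of the observed number of brothers, and a two-sided p-value
# from the standard normal distribution (via the complementary error function).
z = (brothers - expected_brothers) / std_dev
p_value = math.erfc(abs(z) / math.sqrt(2))

print(f"z = {z:.2f}, two-sided p = {p_value:.2f}")
```

The z-score is well under 1, so under these assumptions the excess of recorded brothers is comfortably within what chance alone would produce.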

I’m hop­ing that oth­ers can think of rea­sons why we might see a trend in these data even if the birth or­der effect wasn’t real.

## Conclusion

This is a pretty in­trigu­ing re­sult, and I’m sur­prised no one (that I know of) has no­ticed it be­fore now.

I think this post should be thought of as a pi­lot study. I put in about 20 hours to in­ves­ti­gate the hy­poth­e­sis, but only in a quick and cur­sory way. I would be ex­cited for oth­ers, who are bet­ter in­formed and bet­ter-equipped than I am, to do a more in-depth anal­y­sis into these top­ics.

Do math­e­mat­i­ci­ans of lesser renown dis­play this birth or­der effect? What about promi­nent (or av­er­age) in­di­vi­d­u­als from other STEM fields? Non-STEM fields? I’d be in­ter­ested to see an anal­y­sis of the most suc­cess­ful busi­ness ex­ec­u­tives, for in­stance.

Fur­ther­more, more in­ves­ti­ga­tion could un­cover de­tail about how hav­ing older siblings gives rise to this effect.

Some ex­pla­na­tions for this phe­nomenon rest on so­cial in­ter­ac­tion with older siblings in one’s first few years. Others de­pend on biolog­i­cal con­se­quences of spend­ing one’s fe­tal pe­riod in a womb that was pre­vi­ously oc­cu­pied by older siblings. In prin­ci­ple we should be able to tease out which of these mechanisms gen­er­ates the effect by look­ing at much more data that tracks older siblings that died in in­fancy, and older half siblings. (Siblings that died in in­fancy can’t me­di­ate the so­cial effect, while half siblings can me­di­ate a biolog­i­cal effect de­pend­ing on which par­ent is shared, and can me­di­ate a so­cial effect de­pend­ing on whether they were liv­ing in the house­hold at the time of birth.) If some­one found a larger dataset that tracked these fac­tors, we might be able to falsify one or the other of these sto­ries.

And again, please in­form me of any er­rors.

• This feels like a pretty cen­tral ex­am­ple of ‘things we found out on less­wrong in 2018’. Great work all round, so I’m nom­i­nat­ing it. Next year, I’ll also nom­i­nate the fur­ther work on this that came out in 2019.

• This is now a hy­poth­e­sis I look out for and see many places, thanks in part to this post.

• I was go­ing to write a longer re­view but I re­al­ised that Ben’s cu­ra­tion no­tice ac­tu­ally ex­plains the strengths of this post very well so you should read that!

In terms of in­clud­ing this in the 2018 re­view I think this de­pends on what the re­view is for.

If the re­view is pri­mar­ily for the pur­pose of build­ing com­mon knowl­edge within the com­mu­nity then in­clud­ing this post maybe isn’t worth it as it is already fairly well known, hav­ing been linked from SSC.

On the other hand if the re­view pro­cess is at least partly for, as Rae­mon put it:

“I want LessWrong to en­courage ex­tremely high qual­ity in­tel­lec­tual la­bor.”

Then this post feels like an ex­tremely strong can­di­date.

(Per­sonal foot­note: This post was es­sen­tially what con­verted me from a LessWrong lurker to a reg­u­lar com­men­tor/​con­trib­u­tor—I think it was mainly just be­ing im­pressed with how thor­ough it was and think­ing that’s the kind of com­mu­nity I’d like to get in­volved with.)

• > (Personal footnote: This post was essentially what converted me from a LessWrong lurker to a regular commentor/contributor; I think it was mainly just being impressed with how thorough it was and thinking that’s the kind of community I’d like to get involved with.)

: )

• I would not in­clude this in the Best-of-2018 Re­view.

While it’s good and well-re­searched, it’s more or less a foot­note to the Slate Star Codex post linked above. (I think there’s an ar­gu­ment for back-port­ing old SSC posts to LW with Scott’s con­sent, and if that were done I’d have nom­i­nated sev­eral of those.)

• I cu­rated this post for these rea­sons (start­ing with the most im­por­tant):

• This post is a cen­tral ex­am­ple of the sort of in­tel­lec­tual labour that many peo­ple can do, is ac­tu­ally use­ful, and that I’d love to see peo­ple do­ing more on LessWrong. Tak­ing on a nearby op­er­a­tional­i­sa­tion of the ques­tion (great math­e­mat­i­ci­ans), spend­ing 20 hours gath­er­ing + analysing data, and open-sourc­ing it all, re­ally helps shine a light on an em­piri­cal ques­tion like this.

• The post is sur­pris­ingly clearly writ­ten. It uses an aca­demic struc­ture (in­tro, method­ol­ogy, re­sults, dis­cus­sion, con­clu­sion) yet doesn’t add need­less com­plex­ity, and is short. I en­joyed read­ing it.

• It adds to prior work done by a mem­ber of the broader ra­tio­nal­ity com­mu­nity. I might have thought there’d just been some weird mis­take with LW/​SSC com­mu­nity data, but hav­ing it for this in­de­pen­dent group is start­ing to be­come a much stronger ar­gu­ment.

Thoughts on fur­ther work:

• As the OP says, see­ing some more repli­ca­tion at­tempts on other pop­u­la­tions (e.g. less fa­mous math­e­mat­i­ci­ans, but also vastly differ­ent parts of so­ciety) would be in­ter­est­ing.

• Some peo­ple de­vel­op­ing first-prin­ci­ples mod­els that pre­dict this effect (e.g. mod­els of biol­ogy/​chem­istry in the womb, mod­els of so­cial­is­ing effects of siblings), that make other testable pre­dic­tions we can ex­plore, would also be great to see (as the OP points to in the con­clu­sion).

• If peo­ple think of strong rea­sons why this dataset (or the ear­lier LW/​SSC ones) are bi­ased, that would also be a big help.

Over­all, this is a great repli­ca­tion work. Please do more of this!

• Would it be fair to say that any his­tor­i­cal data on suc­cess­ful sci­en­tists/​math­e­mat­i­ci­ans will be over rep­re­sented by first­borns due to pri­mo­gen­i­ture in­her­i­tance laws and cus­toms? His­tor­i­cally those in­volved in the sci­ences mainly had to be in­de­pen­dently wealthy and be­ing a first born would tend to help with that with those born af­ter more likely to have to work for a liv­ing. Maybe fa­mous his­tor­i­cal lawyers would tend to be un­der rep­re­sented by first­borns?

I’d ex­pect this to be a fairly large se­lec­tion effect similar in size to the Less Wrong sur­vey but pre­sum­ably caused by a differ­ent mechanism.

Pos­si­bly a data set which would have more bear­ing on the ques­tion of birth or­der effects in mod­ern times would be Fields medal, Abel prize, Tur­ing award, No­bel prizes in Physics, Chem­istry, Medicine and Eco­nomics in the last 30 years or so—I don’t have a great feel for how long ago the pri­mo­gen­i­ture in­her­i­tance thingy stopped be­ing rele­vant but given an av­er­age No­bel lau­re­ate age of 59 this would mean peo­ple born since ~1930. Th­ese might be eas­ier to find data on than Thales of Mile­tus too!!

• > Historically those involved in the sciences mainly had to be independently wealthy

There have been professorships of mathematics in Europe since at least the 1500s, and most of the mathematicians on this list were employed by universities. Funding doesn’t seem to have been a constraint, at least for mathematicians of this caliber.

Education, however, does seem relevant. Going through the data, I frequently noticed the biographical pattern “X-person’s exceptional mathematical talent was noticed in [early schooling], and he was sent to [some university].” I don’t know how common it was for children to attend the equivalent of elementary school before the 1900s.

From my very cur­sory look at the bi­o­graph­i­cal de­tails of these math­e­mat­i­ci­ans I can say that...

• At least a few came from very poor families, but nevertheless received early schooling of some kind. (I don’t know how rare this was; maybe only one out of 50 poor families sent their kids to school.)

• Siblings were of­ten men­tioned to have also re­ceived an ed­u­ca­tion at the same in­sti­tu­tion. This leads me to guess that school­ing was not a priv­ilege awarded to only some of the (male, at least) chil­dren of a fam­ily.

Again, if any­one knows more about these things than I do, feel free to chip in.

> Possibly a data set which would have more bearing on the question of birth order effects in modern times would be Fields medal, Abel prize, Turing award, Nobel prizes in Physics, Chemistry, Medicine and Economics in the last 30 years or so

Yep. I think that would be use­ful.

• Thanks, that seems to rule in­her­i­tance laws out as a sig­nifi­cant fac­tor. I’m quite tempted to cre­ate that data set my­self. Any ob­jec­tion if I use your anal­y­sis spread­sheet as a tem­plate?

• Feel free!

• > After removing the individuals that I couldn’t find data for, we had a sample size of 86. A paired t-test, comparing the number of first-borns with the expected number of first-borns (one data point for each of the 86 mathematicians) was statistically significant, t(85)=2.86, p = 0.00529.

Wouldn’t this be a chi-squared/​pro­por­tion test? Or a bino­mial re­gres­sion? (What would you be com­par­ing means of, tak­ing birth cat­e­gory as an in­te­ger and av­er­ag­ing them?)

• For each math­e­mat­i­cian, ac­tual first­born­ness was coded as 0 or 1, and ex­pected first­born­ness as 1/​n (where n is the num­ber of chil­dren that their par­ents had). Then we just did a paired t-test, which is equiv­a­lent to sub­tract­ing ac­tual minus ex­pected for each data point and then do­ing a one sam­ple t-test against a mean of 0. You can see this all in Eli’s spread­sheet here; the data are also all there for you to try other statis­ti­cal tests if you want to.
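A minimal sketch of that coding, using hypothetical toy values rather than the spreadsheet’s rows (the function name `firstborn_t_statistic` is also made up for illustration). It computes the one-sample t statistic on the actual-minus-expected differences using only the standard library.

```python
import math
from statistics import mean, stdev

# Hypothetical toy data, NOT the spreadsheet's rows.
family_sizes = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]   # n = number of children per family
is_firstborn = [1, 1, 0, 1, 1, 0, 1, 0, 0, 1]   # actual firstbornness, coded 0 or 1

def firstborn_t_statistic(family_sizes, is_firstborn):
    """One-sample t-test of (actual - expected) against a mean of 0."""
    # Expected firstbornness for a family of n children is 1/n.
    diffs = [actual - 1 / n for n, actual in zip(family_sizes, is_firstborn)]
    standard_error = stdev(diffs) / math.sqrt(len(diffs))
    return mean(diffs), mean(diffs) / standard_error, len(diffs) - 1

mean_diff, t_stat, dof = firstborn_t_statistic(family_sizes, is_firstborn)
print(f"mean difference = {mean_diff:.3f}, t({dof}) = {t_stat:.2f}")
```

Turning the t statistic into a p-value needs the t distribution, which the standard library lacks; if SciPy is available, `scipy.stats.ttest_1samp` on the differences (or `scipy.stats.ttest_rel` on the two columns) should give the same statistic along with a p-value.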

• I’m not sure t-tests are the best approach to take compared to something non-parametric, given smallish sample, considerable skew, etc. (this paper’s statistical methods section is pretty handy). Nonetheless I’m confident the considerable effect size (in relative terms, almost a doubling) is not an artefact of statistical technique: when I plugged the numbers into a chi-squared calculator I got P < 0.001, and I’m confident a permutation technique or similar would find much the same.
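One distribution-free alternative can be sketched as a Monte Carlo simulation of the null hypothesis. This is a stand-in, not the chi-squared calculation mentioned above and not a permutation test in the strict sense. The data are hypothetical toy values, and `simulate_null_p_value` is a made-up name.

```python
import random

# Hypothetical toy data, NOT the spreadsheet's rows.
family_sizes = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]
observed_firstborns = 6

def simulate_null_p_value(family_sizes, observed, n_sims=10_000, seed=0):
    """One-sided p-value: share of simulated samples with at least `observed` first borns.

    Under the null, a person from a family of n children is first born with probability 1/n.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    hits = 0
    for _ in range(n_sims):
        simulated = sum(rng.random() < 1 / n for n in family_sizes)
        if simulated >= observed:
            hits += 1
    return hits / n_sims

p_value = simulate_null_p_value(family_sizes, observed_firstborns)
print(f"simulated one-sided p-value: {p_value:.3f}")
```

Because it simulates the count of first borns directly, this avoids the normality assumption behind the t-test, at the cost of some simulation noise in the p-value.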

• A (com­pletely un­vet­ted) idea that was just sug­gested to me by some­one:

There’s some folk wis­dom that first born chil­dren are born later, and spend more time in the womb on av­er­age. If this is true, per­haps it me­di­ates the in­tel­li­gence boost­ing effect? (I have no strong rea­son to sus­pect that it does, but it seems good to note pos­si­ble hy­pothe­ses here.)

Does any­one know if the folk wis­dom is true? Does be­ing first born cor­re­late with a longer na­tal in­cu­ba­tion time?

• > Others depend on biological consequences of spending one’s fetal period in a womb that was previously occupied by older siblings.

The effect has been refer­enced with re­gards to ‘First born/​s’. Could it be old­est child in­stead? (Adop­tion might dis­t­in­guish be­tween them, but I don’t know if it’s com­mon enough to tell. We also prob­a­bly wouldn’t have the data on whether an adopted child is a ‘first born’ or not.)

• > So there are slightly more males listed, at least in the sources I could find. But a differ­ence of 10 out of 211 siblings with recorded sex, isn’t very large. I’m sure there are some statis­tics I could do to show it, but I don’t think that slight bias is suffi­cient to ac­count for our ob­served birth or­der effect.

Almost all of these mathematicians were male and the sex of successive siblings is [correlated](https://www.biorxiv.org/content/biorxiv/early/2015/11/12/031344.full.pdf). Combined with random variation, I don’t think we can conclude bias in this sample.

• Is the premise that while the effect of birth or­der on mean in­tel­li­gence is small, we can see it mag­nified among our com­mu­nity mem­bers and in great math­e­mat­i­ci­ans be­cause each group is likely far more in­tel­li­gent than av­er­age?

I recall reading this https://putanumonit.com/2015/11/10/003-soccer1/, which demonstrates that small mean differences will have outsized effects on groups comprised by the distributions’ tails.
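A minimal sketch of that tail effect, assuming (purely for illustration) two normal distributions with the same spread whose means differ by 0.2 standard deviations. These numbers are hypothetical and are not estimates of any real birth order effect.

```python
from statistics import NormalDist

def tail_ratio(mean_shift, threshold):
    """How over-represented the shifted group is above `threshold` (in baseline SDs)."""
    baseline = NormalDist(0, 1)
    shifted = NormalDist(mean_shift, 1)
    return (1 - shifted.cdf(threshold)) / (1 - baseline.cdf(threshold))

# Hypothetical shift of 0.2 standard deviations between the two groups.
for threshold in (0, 2, 4):
    ratio = tail_ratio(0.2, threshold)
    print(f"above +{threshold} SD: shifted group over-represented by a factor of {ratio:.2f}")
```

The ratio grows as the threshold moves further into the tail, which is the sense in which a small mean difference can look large in a strongly selected group.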

• Not a premise, but a plau­si­ble hy­poth­e­sis, I think.

If you se­lect very strongly for in­tel­li­gence, you’re go­ing to tend to se­lect for first borns, since those cor­re­late.

But my guess is that isn’t all that’s hap­pen­ing, be­cause the effect size is smaller for the Math­e­mat­i­ci­ans than for LessWrongers. Ra­tion­al­ists are pretty smart, but these are some of the most brilli­ant peo­ple who have ever lived.

It seems like there might be an ad­di­tional trend, amongst ra­tio­nal­ists, to­wards be­ing first born, even af­ter ac­count­ing for high in­tel­li­gence.

[edit: or maybe the first born effect isn’t me­di­ated by in­tel­li­gence at all.]