# Birth order effect found in Nobel Laureates in Physics

[Epistemic sta­tus: Three differ­ent data sets point­ing to some­thing similar is at least in­ter­est­ing, make your own mind up as to how in­ter­est­ing!]

In Eli Tyre’s anal­y­sis of birth or­der in his­tor­i­cal math­e­mat­i­ci­ans, he men­tioned analysing other STEM sub­jects for similar effects. In the com­ments I kinda–sorta pre­reg­istered a study into this. Fol­low­ing his com­ments I dropped the age re­quire­ment I men­tioned as it no longer seemed nec­es­sary.

I found that No­bel Lau­re­ates in Physics are more likely to be first­born than would be ex­pected by chance. This effect (10 per­centage points) is smaller than the effect found in the ra­tio­nal­ist com­mu­nity or his­tor­i­cal math­e­mat­i­ci­ans (22 and 16.7 per­centage points re­spec­tively) but is sig­nifi­cant (p=0.044).

More brothers than sisters were found in the study (125:92, i.e. 58% brothers). After correcting the expected proportion of brothers to ~52%, this was found to not be significant (p=0.11).

I was un­able to find suffi­cient data on Fields medal, Abel prize and Tur­ing award win­ners.

My data and analysis are documented here. With Eli’s kind permission I used his spreadsheet as a template. I have kept Eli’s data on the same table – rows 4-153 are his.

## Methodology

My methods matched Eli’s closely except for the data sets I looked at; see his post for more information.

Ini­tially I at­tempted to repli­cate Eli’s re­sults in other math­e­mat­i­ci­ans by analysing Fields medal and Abel prize win­ners. Un­for­tu­nately I was un­able to gather suffi­cient ad­di­tional data. This is partly due to crossover in names be­tween these math­e­mat­i­ci­ans and the list from which Eli was work­ing.

It also seems to be the case that less biographical information is available for people born after ~1950. This might be partly due to these people and their siblings being more likely to be still alive, so data protection rules prevent e.g. geni from listing their full details (siblings’ details are often set to “private”), but there could be other reasons. For Fields medals awarded before 1986 I found data on 12/30 recipients, after that only 3/30.

I had a brief look at Tur­ing award win­ners, as this would have seemed a rele­vant field to com­pare to the re­sults from the ra­tio­nal­ist com­mu­nity that in­spired the stud­ies, but came across the same prob­lem.

Fi­nally, I looked at No­bel lau­re­ates in Physics. A mas­sive help in data col­lec­tion here was the fact that since the 1970s No­bel lau­re­ates have been asked to sup­ply an au­to­bi­og­ra­phy, which is pub­lished on the No­bel web­site. Even be­fore then there are bi­ogra­phies of each lau­re­ate al­though these sel­dom men­tion birth or­der.

Between the Nobel site, Wikipedia and geni I was able to find useful data on 100/207 Physics laureates. The other 107 either had no siblings or I couldn’t find sufficient data on them – either way they weren’t included in the analysis.

As a com­ment on data sources, I found geni to be some­what un­re­li­able. It con­tra­dicted the au­to­bi­ogra­phies or some­times even con­tra­dicted it­self. At other times, the list of siblings was in­com­plete or miss­ing com­pletely.

## Results

Cat­e­goris­ing by fam­ily size shows that for all fam­ily sizes with ≥10 data points there are more first­borns than would be ex­pected by chance.

Due to small sample size I have grouped all families of 6+ siblings into a single bucket and even then n=14. The expected share of laureates at each birth position then falls as birth position rises, because fewer families in the sample have at least that many children.
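As a rough illustration of how such an expected share can be computed, here is a minimal sketch. It assumes the null hypothesis that every birth position within a family is equally likely; the function name `expected_share_by_position` and the toy counts are hypothetical and are not taken from the spreadsheet.

```python
from fractions import Fraction


def expected_share_by_position(family_counts):
    """Expected share of the sample at each birth position.

    family_counts maps family size -> number of sampled people who come
    from a family of that size. Under the null hypothesis, a person from
    a family of size s is equally likely to hold any of the s positions.
    """
    total = sum(family_counts.values())
    largest = max(family_counts)
    shares = {}
    for position in range(1, largest + 1):
        # Only families with at least `position` children can contribute.
        expected = sum(
            Fraction(count, size)
            for size, count in family_counts.items()
            if size >= position
        )
        shares[position] = expected / total
    return shares


# HYPOTHETICAL counts, purely for illustration (not the post's data):
# 6 people from 2-child families and 3 people from 3-child families.
toy_counts = {2: 6, 3: 3}
shares = expected_share_by_position(toy_counts)
for position, share in shares.items():
    print(position, float(share))
```

In this toy sample the third position gets a much smaller expected share than the first two, simply because only the 3-child families can supply a third-born.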

Analysing the data as a whole gives a 10 percentage point effect (0.2 to 19.8 percentage points, 95% confidence). This is less than both the SSC / Less Wrong surveys and Eli’s historical mathematicians analysis (22 and 17 percentage points respectively). I haven’t got a number for the overall confidence level of the SSC data, but given the large data set and the very low p quoted for the 2 sibling example, it is unlikely that its 95% confidence interval overlaps with this new data, suggesting that the effect is truly a different size and not due to chance.
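As a quick consistency check on the quoted interval, the sketch below backs out the standard error it implies and the matching two-tailed p-value. It assumes a symmetric normal-approximation interval, which is my assumption; the exact method used in the spreadsheet is not stated here.

```python
from statistics import NormalDist

effect_pp = 10.0               # point estimate quoted above, in percentage points
ci_low, ci_high = 0.2, 19.8    # 95% interval quoted above

# Assumption: the interval is estimate +/- z * standard error.
z_95 = NormalDist().inv_cdf(0.975)              # roughly 1.96
std_error_pp = (ci_high - ci_low) / 2 / z_95    # implied standard error
z_score = effect_pp / std_error_pp
p_two_tailed = 2 * (1 - NormalDist().cdf(z_score))

print(f"implied standard error: {std_error_pp:.2f} percentage points")
print(f"two-tailed p: {p_two_tailed:.3f}")
```

The implied standard error is about 5 percentage points, roughly what a proportion near 50% in a sample of 100 would give, and the resulting p-value lands in the same region as the quoted p=0.044.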

## Discussion

### Autobiographies as source material

Us­ing au­to­bi­ogra­phies as the source for a sig­nifi­cant num­ber of the data points should have helped with the re­li­a­bil­ity of the data. It is pos­si­ble that when writ­ing an au­to­bi­og­ra­phy one would be more likely to men­tion siblings and birth or­der if one was the el­dest but this doesn’t seem likely.

### Gender imbalance

Eli discussed under-reporting of females as a potential source of bias. However, he found that the brothers:sisters ratio in his data was not unreasonable.

Run­ning the same anal­y­sis on the physics No­bel lau­re­ate data I get a ra­tio of 125:92 broth­ers:sisters. This makes the siblings 58% male, with p=0.03 (bino­mial dis­tri­bu­tion, two tailed). This effect is ac­tu­ally more sig­nifi­cant than the birth or­der effect.

Looking at the SSC data and Eli’s data, I found that there were 52% brothers in both. I did a little research and found that 51-52% is in fact roughly the expected proportion of brothers. I feel like this is something I should have already known but didn’t.

Another effect which might increase the proportion of brothers among Nobel laureates’ siblings is that men can have a disposition to have boys or a disposition to have girls. As almost all of the laureates are male, it would be reasonable to think more of their dads were predisposed to having boys. However, as this isn’t seen in the SSC or historical mathematicians data (both also male dominated), this doesn’t really get us much further.

Us­ing 52% as the ex­pected ra­tio (in­stead of 50%) means that the 58% re­sult from No­bel lau­re­ates no longer rises to sig­nifi­cance (p=0.11) and should in­stead be la­bel­led as “hey, look at this in­ter­est­ing sub­group anal­y­sis” or pos­si­bly “slightly odd but not im­plau­si­ble”.
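For anyone who wants to re-run the two sibling-gender tests, here is a minimal sketch of an exact binomial test using only the standard library. The two-tailed convention used here (doubling the smaller tail) is an assumption, since the post only says "binomial distribution, two tailed"; the helper name `binomial_two_tailed_p` is hypothetical.

```python
from math import comb


def binomial_two_tailed_p(successes, trials, null_prob):
    """Exact binomial p-value, doubling the smaller tail (capped at 1)."""
    pmf = [
        comb(trials, k) * null_prob**k * (1 - null_prob) ** (trials - k)
        for k in range(trials + 1)
    ]
    upper_tail = sum(pmf[successes:])        # P(X >= successes)
    lower_tail = sum(pmf[: successes + 1])   # P(X <= successes)
    return min(1.0, 2 * min(upper_tail, lower_tail))


brothers, sisters = 125, 92      # counts quoted in the post
siblings = brothers + sisters

p_vs_half = binomial_two_tailed_p(brothers, siblings, 0.50)
p_vs_52 = binomial_two_tailed_p(brothers, siblings, 0.52)

print(f"null of 50% brothers: p = {p_vs_half:.3f}")
print(f"null of 52% brothers: p = {p_vs_52:.3f}")
```

The two results should come out close to the quoted p=0.03 and p=0.11, although other two-tailed conventions will give slightly different values.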

As I men­tioned pre­vi­ously, most of the data since the 1970s No­bels is based on au­to­bi­ogra­phies. Look­ing at only data since then, the brother:sister ra­tio is 51:35 (59%). It seems un­likely that No­bel lau­re­ates for­got about some of their sisters, mak­ing it less likely that the gen­der im­bal­ance is due to in­cor­rect data.

One po­ten­tial source of er­ror in the gen­der bal­ance may be in the siblings whose gen­der I was un­able to de­ter­mine. There were 50 of these. Most (41) of these came from fam­i­lies where I had no data ex­cept the num­ber of siblings and the po­si­tion of the lau­re­ate within the fam­ily (e.g. “I was the fourth of five chil­dren.”). It is pos­si­ble that some of the miss­ing sisters are in this cat­e­gory.

However, this would imply that if someone has more brothers they are more likely to list the genders of their siblings than if they have more sisters. Perhaps as most of the laureates were male they might have had more in common with their brothers and spent more time with them, making them statistically more likely to mention their brothers’ gender. This seems plausible but unlikely to cause a big effect even if it were true.

For the moment, I am working with the assumption that the sample is accurate and that the gender imbalance is just an outlier. Any other thoughts on causes of bias are welcome. These would have to explain how this effect was seen in data from both geni and the laureates’ autobiographies.

## Conclusion

No­bel lau­re­ates in physics ex­hibit a birth or­der effect such that they are 10 per­centage points more likely to be the el­dest child than would be ex­pected (p=0.044). This effect is less than data from both SSC read­ers and his­tor­i­cal math­e­mat­i­ci­ans (22 and 17 per­centage points re­spec­tively).

There was a gender imbalance between brothers and sisters (58% brothers) but, taking into account the expected ratio of 52%, this was not significant (p=0.11). This effect is not seen in SSC readers or historical mathematicians (52% in both).

I would recommend that anyone who wishes to collate additional historical data consider Nobel laureates in other categories, due to the availability of accurate data from the autobiographies. My analysis took perhaps 12 hours but a lot of that was spent on wild goose chases looking for data on Fields medal and Abel prize recipients. I saved a lot of time by reusing Eli’s spreadsheet (thanks for the permission). I would estimate that getting data on the entire history of another Nobel prize category and analysing it would take ~6-8 hours, so it shouldn’t be too daunting for someone to take on.

• This feels like a pretty cen­tral ex­am­ple of ‘things we found out on less­wrong in 2018’. Great work all round, so I’m nom­i­nat­ing it. Next year, I’ll also nom­i­nate the fur­ther work on this that came out in 2019.

• This is now a hy­poth­e­sis I look out for and see many places, thanks in part to this post.

• This is a re­view of my own post.

The first thing to say is that for the 2018 Re­view Eli’s math­e­mat­i­ci­ans post should take prece­dence be­cause it was him who took up the challenge in the first place and in­spired my post. I hope to find time to write a re­view on his post.

If peo­ple were in­ter­ested (and Eli was ok with it) I would be happy to write a short sum­mary of my find­ings to add as a foot­note to Eli’s post if it was cho­sen for the re­view.

***

This was my first post on LessWrong and look­ing back at it I think it still holds up fairly well.

There are a cou­ple of things I would change if I were do­ing it again:

• Put less time into the sons vs daugh­ters thing. I think this sec­tion could have two thirds of it chopped out with­out los­ing much.

• Un­named’s com­ment is re­ally im­por­tant in point­ing out a mis­take I was mak­ing in my fi­nal para­graph.

• I might have tried to analyse whether it is a first­born thing vs an ear­ly­born thing. In the SSC data it is strongly a first­born thing and if I com­bined Eli and my datasets I might be able to con­firm whether this is also the case in our datasets. I’m not sure if this would provide a de­ci­sive an­swer as our sam­ple size is much smaller even when com­bin­ing the sets.

• Now that we have data on LWers/​SSCers, math­e­mat­i­ci­ans, and physi­cists, if any­one wants to put more work into this I’d like to see them look some­place differ­ent. We don’t want to fall into the Wa­son 2-4-6 trap of only look­ing for birth or­der effects among smart STEM folks. We want data that can dis­t­in­guish Scott’s in­tel­li­gence /​ in­tel­lec­tual cu­ri­os­ity hy­poth­e­sis from other pos­si­bil­ities like some non-big-5 per­son­al­ity differ­ence or a gen­eral first­borns more likely phe­nomenon.

• Very good point. That would mean ex­am­in­ing No­bel lau­re­ates in chem­istry, medicine and eco­nomics might not give much ex­tra in­for­ma­tion.

Liter­a­ture lau­re­ates would, I pre­sume, have sig­nifi­cant in­tel­lec­tual cu­ri­os­ity so might give an in­di­ca­tion as to whether the effect is spe­cific to STEM sub­jects or more gen­er­al­ised cu­ri­os­ity/​some­thing else.

I’m not sure about Peace lau­re­ates – a nega­tive re­sult wouldn’t tell us much but a pos­i­tive re­sult would be in­ter­est­ing.

Any other sug­ges­tions for data which might be pub­li­cly available and point us to­wards one hy­poth­e­sis or an­other?

We’re only go­ing to get a good pic­ture of what’s go­ing on by some­one de­cid­ing to do a proper re­search pro­ject on this but I feel like any ad­di­tional data which peo­ple could col­lect would be helpful to point the pro­ject in the right di­rec­tion.

• Seems in­ter­est­ing to get data on:

Some group that isn’t heav­ily se­lected for in­tel­li­gence /​ in­tel­lec­tual cu­ri­os­ity: skate­board­ers, protestors, pro­fes­sional hockey play­ers...

Some non-STEM group that is se­lected for suc­cess based on men­tal skills: liter­a­ture lau­re­ates, gov­er­nors, …

Not sure which groups it would be easy to get data on.

There is also the op­tion of look­ing into ex­ist­ing re­search on birth or­der to see what groups other peo­ple have already looked at.

• I’m in­ter­ested in teas­ing apart “high achieve­ment” from “high achieve­ment in a STEM field”.

I’d be interested in analysis of Fortune 500 CEOs, for instance.

• The suggestion about Fortune 500 CEOs seems good; “self-made” millionaires are a category far enough from STEM, and, due to their status, they are more likely to have reliable biographical information. If you want to go in a completely different direction, how about something like the Darwin Awards?

• Stay­ing ‘out of the 2-4-6’ does seem use­ful, but it’d also be nice to know if it is a ‘STEM thing’.

• Which you also can’t know if you don’t test other fields. I think there are at least 3 concentric levels to distinguish: ( famous ( intelligent ( STEM ) ) ).

• So po­ten­tially ( Sports play­ers ( Liter­a­ture lau­re­ates /​ gov­er­nors ( Physi­cists /​ Math­e­mat­i­ci­ans ) ) ) ?

• Some sports play­ers are pretty smart and prob­a­bly some gov­er­nors aren’t. What about ( Real­ity TV celebri­ties ( heads of state of UNSC coun­tries ( Physi­cists /​ Math­e­mat­i­ci­ans /​ Eng­ineers ) ) ).

(1 minute of thought did not provide an­other group of fa­mous & not-even-a-lit­tle-bit-se­lected-for-in­tel­li­gence peo­ple, un­less there’s a database of lot­tery win­ners, which I doubt. Cu­ri­ous for sug­ges­tions.)

(Fa­mous en­g­ineers: of course Wikipe­dia does not dis­ap­point.)

• That’s a good point, which ap­plies to both this and the prior post. The rea­son ‘No­bel’ Lau­re­ates are easy is prob­a­bly the fame com­po­nent.

• Olympic medal­ists? Not just in­tel­li­gence but a cer­tain kind of poli­ti­cal savvy and per­se­ver­ance.

• Thanks for do­ing this!

I feel more val­i­dated in hav­ing spent the time do­ing data-col­lec­tion for the math­e­mat­i­cian data set af­ter see­ing that it prompted some­one else to in­ves­ti­gate in this area. It’s pretty en­courag­ing to know that if I write up some­thing of in­ter­est on LessWrong, other peo­ple might build on it.

• Just ask­ing about the birth or­der here. What is the im­pli­ca­tion of the find­ing—why is this seen? Any thoughts?

• One of the things I find most in­ter­est­ing is that the effect seems to be strongest for the ra­tio­nal­ity com­mu­nity.

I would sug­gest a sister the­ory to the in­tel­lec­tual cu­ri­os­ity an­gle that Scott men­tioned. Eldest chil­dren spend their for­ma­tive years with­out an­other child in the fam­ily to look up to. I would think that this would lead to, on av­er­age, a lower ac­cep­tance of in­for­ma­tion from au­thor­ity.

This comes a bit from my personal experience: I had an elder brother whose opinion I would take on as my own, even through my teens. I’d be interested to know if other younger siblings in the community have had a similar experience.

It would ex­plain why the effect is so strong in the ra­tio­nal­ist com­mu­nity. In Science and Maths it helps to challenge au­thor­i­ta­tive sources but this is ac­tively en­couraged, so the in­trin­sic effect needs to be less strong to get you to do it. The ra­tio­nal­ist com­mu­nity is of­ten challeng­ing things which, cul­turally, aren’t sup­posed to be challenged which would give a higher in­trin­sic bar to en­try.

• I would suggest Regression to the Mean instead: we are only interested in this hypothesis because of its unusually high number on the survey in the first place.

• I wondered about regression to the mean, but with the SSC data set being so large there isn’t much room for a big effect; even moving 4SD on the SSC data won’t move the mean much. I’m afraid I don’t know how to do the maths beyond comparing confidence intervals as I did in the text.

• You’re right, I should have dou­ble-checked the stan­dard de­vi­a­tions be­fore sug­gest­ing re­gres­sion to the mean. I agree that re­gres­sion doesn’t plau­si­bly ex­plain the data.

• Just wild guesses.

I had a ques­tion about how adop­tion fits in to the pic­ture, so two things come to mind:

A) It still ex­ists for adopted kids.

Since the effect seems meaningless for single children, maybe it doesn’t exist for them.

Maybe older siblings are trusted with more re­spon­si­bil­ity or speci­fi­cally watch­ing out for younger siblings, and that has way more effect than any­one ex­pected. This would be a pain to test, but one might try to check by see­ing if the gaps be­tween the ages of kids mat­ter. It’s a pretty spe­cific hy­poth­e­sis though, so maybe it’d be some­thing else about in­ter­act­ing a lot with younger kids while young (but older).

B) It doesn’t.

That might mean it has some­thing to do with birth, as op­posed to differ­ences in how later/​ear­lier chil­dren are raised.

• I did a search on the first born more intelligent query and got a hit to some article published in late 2016 or early 2017 -- a newspaper reported on the study in Feb 2017. The hypothesis seemed to be that parents interact with the first child differently than with later children and provide a more mentally stimulating environment for that child.

If so, any bets on when the first lawsuit for compensation will be filed by younger siblings for a greater share of any inheritance? (semi-joking...)

• I was read­ing through the com­ments on “Ad­vances in baby for­mula” and I no­ticed 2 claims: ba­bies that are breast­fed have higher IQ, and moth­ers breast­feed less with later chil­dren.

• If so I won­der if that might not be traced back to im­mune sys­tems—breast­feed­ing al­lows the baby to de­velop a strong im­mune sys­tem I think given the baby can bor­row from mom rather than de­vel­op­ing the re­sponse alone.