Comments on Power Law Distribution of Individual Impact

I had a discussion online yesterday, stemming from the question of whether you should expect to be able to identify the individuals who will most shape the long-term future of humanity. It came up in a discussion of whether CEA should have staff work on doing this full time, and I was expecting boring comments that just expressed a political opinion about what CEA should do. However, Jan Kulveit offered some concrete models for me to disagree with, and I had a fun exchange and appreciated the chance to make explicit some of my models in this area.

With permission of all involved, I have reproduced the exchange below.


Jan:

I would also be worried. Homophily is one of the best predictors of links in social networks, and factors like being a member of the same social group, having similar education, opinions, etc. are known to bias selection processes toward selecting similar people. This risks making the core of the movement even more self-encapsulated than it is, which is a shift in a bad direction.

Also, I would be worried about 80k hours shifting more toward individual coaching; there is now a bit of an overemphasis on the “individual” approach and too little on “creating systems”.

Also, it seems a lot of this would benefit from knowledge from the fields of the “science of success”, general scientometrics, network science, etc. E.g. when I read concepts like “next Peter Singer” or a lot of thinking along the lines of “most of the value is created by just a few people”, I’m worried. While such thinking is intuitively appealing, it can be quite superficial. E.g., a toy model: imagine a landscape with gold scattered in power-law sized deposits, and prospectors walking randomly, randomly discovering deposits of gold. What you observe is that the value of gold collected by prospectors is also power-law distributed. But obviously attempts to emulate “the best” or find the “next best” would be futile. It seems an open question (worth studying) how much any specific knowledge landscape resembles this model, or how big a part of success is attributable to luck.
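The toy model above is simple enough to simulate directly. A minimal sketch (all parameter values here are illustrative choices of mine, not anything from the exchange):

```python
import random

def simulate_prospectors(n_prospectors=1000, n_steps=100, alpha=2.0,
                         find_prob=0.05, seed=0):
    """Identical prospectors walk randomly; at each step they may stumble
    on a gold deposit whose size is power-law (Pareto) distributed.
    Skill plays no role at all in this model."""
    rng = random.Random(seed)
    hauls = []
    for _ in range(n_prospectors):
        total = 0.0
        for _ in range(n_steps):
            if rng.random() < find_prob:
                # Pareto-distributed deposit size (heavy tail, exponent alpha)
                total += rng.paretovariate(alpha)
        hauls.append(total)
    return hauls

hauls = simulate_prospectors()
hauls.sort(reverse=True)
# share of all gold held by the luckiest 10% of (identical) prospectors
top_decile_share = sum(hauls[:100]) / sum(hauls)
```

Even though every prospector is identical, the hauls inherit the heavy tail of the deposits, so the top decile holds an outsized share of the gold without any difference in skill.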


Ben (me):

That’s a nice toy model, thanks for being so clear :-)

But it’s definitely wrong. If you look at Bostrom on AI or Einstein on Relativity or Feynman on Quantum Mechanics, you don’t see people who are roughly as competent as their peers, just being lucky in which part of the research space was divvied up and given to them. You tend to see people with rare and useful thinking processes having multiple important insights about their field in succession—getting many things right that their peers didn’t, not just one as your model would predict (if being right were random luck). Bostrom looked into half a dozen sci-fi-looking areas that others overlooked, to figure out which were important, before concluding with x-risk and AI, and he looked into areas and asked questions that were on nobody’s radar. Feynman made breakthroughs in many different subfields, and his success looked like being very good at fundamentals like being concrete and noticing his confusion. I know less about Einstein, but as I understand it, getting to Relativity required a long chain of reasoning that was unclear to his contemporaries. “How would I design the universe if I were god?” was probably not a standard tool that was handed out to many physicists to try.

You may respond: “sure, these people came up with lots of good ideas that their contemporaries wouldn’t have, but this was probably due to them using the right heuristics, which you can think of as having been handed out randomly in grad school to all the different researchers, so it still is random, just on the level of cognitive processes”.

To this I’d say: you’re right that looking at people’s general cognitive processes is really important, but I think I can do much better than random chance in predicting which cognitive processes will produce valuable insights. I’ll point to Superforecasting and Rationality: AI to Zombies as books with many insights into which cognitive processes are more likely to find novel and important truths than others.

In sum: I think the people who’ve had the most positive impact in history are power-law distributed because of their rare and valuable cognitive processes, not just random luck; these processes can be learned from, and they can guide my search for people who will (in future) have massive impact.


Jan:

Obviously the toy model is wrong as a description of reality: it’s one end of the possible spectrum, where you have complete randomness. At the other end you have another toy model: results in a field neatly ordered by cognitive difficulty, where the best person at any time picks all the available fruit. My actual claims roughly are:

  • reality is somewhere in between

  • it is field-dependent

  • even in fields more toward the random end, there would actually be differences, like different speeds of travel among prospectors

It is quite unclear to me where on this scale the relevant fields are.

I believe your conclusion, that the power-law distribution is all due to the properties of people’s cognitive processes and not to the randomness of the field, is not supported by the scientometric data for many research fields.

Thanks for a good preemptive answer :) Yes, if you are good enough at identifying the “golden” cognitive processes. While it is clear you would be better than random chance, it is very unclear to me how good you would be. *

I think it’s worth digging into an example in detail: if you look at early Einstein, you actually see someone with an unusually developed geometric thinking and the very lucky heuristic of interpreting what the equations say as the actual reality. Famously, the special relativity transformations were first written down by Poincaré. “All” that needed to be done was to take them seriously. General relativity is a different story, but at that point Einstein was already famous and possibly one of the few brave enough to attack the problem.

Continuing with the same example, I would be extremely doubtful that Einstein would have been picked by a selection process similar to what CEA or 80k hours will probably be running, before he became famous. 2nd grade patent clerk? Unimpressive. Well connected? No. Unusual geometric imagination? I’m not aware of any LessWrong sequence which would lead to picking this as that important :) Lucky heuristic? Pure gold, in hindsight.

(*) In the end you can treat this as an optimization problem, depending on how good your superior-cognitive-process selection ability is. Let’s have a practical example: you have 1000 applicants. If your selection ability is good enough, you should take 20 for individual support. But maybe it’s just good, and then you may get better expected utility if you are able to reach 100 potentially great people in workshops. Maybe you are much better than chance, but not really good… then maybe you should create an online course taking in 400 participants.
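The footnote’s trade-off can be sketched as a small expected-value calculation. Everything below is hypothetical: the base rate, the per-person values, and the linear interpolation between random and perfect selection are all invented numbers, just to make the shape of the problem concrete:

```python
# Hypothetical model: 1000 applicants, a few genuinely great ones,
# and three program sizes. All figures invented for illustration.

N_APPLICANTS = 1000
N_GREAT = 20  # assumed number of 'great' people in the pool
BASE_RATE = N_GREAT / N_APPLICANTS

# Value delivered per great person reached, by program size; more
# intensive programs are assumed to deliver more per person.
PROGRAMS = {20: 10.0,   # one-on-one support
            100: 3.0,   # workshops
            400: 1.0}   # online course

def hit_rate(n_selected, skill):
    """Fraction of selected people who are 'great', interpolating between
    random picking (skill=0) and perfect picking (skill=1)."""
    perfect = min(1.0, N_GREAT / n_selected)
    return skill * perfect + (1 - skill) * BASE_RATE

def expected_value(n_selected, skill):
    return n_selected * hit_rate(n_selected, skill) * PROGRAMS[n_selected]

def best_program(skill):
    return max(PROGRAMS, key=lambda n: expected_value(n, skill))
```

With these particular numbers, perfect selection favours the intensive option (`best_program(1.0)` is 20) while pure chance favours the broad one (`best_program(0.0)` is 400); where the crossover sits depends entirely on the invented figures, which is exactly the open question.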


Ben (me):

Examples are totally worth digging into! Yeah, I actually find myself surprised and slightly confused by the situation with Einstein, and do make the active prediction that he had some strong connections in physics (e.g. at some point had a really great physics teacher who’d done some research). In general I think Ramanujan-like stories of geniuses appearing from nowhere are not the typical example of great thinkers / people who significantly change the world. If I’m right I should be able to tell such stories about the others, and in general I do think that great people tend to get networked together, and that the thinking patterns of the greatest people are noticed by other good people before they do their seminal work: cf. Bell Labs (Shannon/Feynman/Turing etc), the Paypal Mafia (Thiel/Musk/Hoffman/Nosek etc), SL4 (Hanson/Bostrom/Yudkowsky/Legg etc), and maybe the Republic of Letters during the Enlightenment? But I do want to spend more time digging into some of those.

To approach from the other end, what heuristics might I use to find people who in the future will create massive amounts of value that others miss? One example heuristic that Y Combinator uses to determine in advance who is likely to find novel, deep mines of value that others have missed is whether the individuals regularly build things to fix problems in their life (e.g. Zuckerberg built lots of simple online tools to help his fellow students study while at college).

Some heuristics I use to tell whether I think people are good at figuring out what’s true, and at making plans based on it, include:

  • Does the person, in conversation, regularly take long silent pauses to organise their thoughts, find good analogies, analyse your argument, etc? Many people I talk to treat silence as a significant cost, due to social awkwardness, and do not make the trade-off toward figuring out what’s true. I always trust people more when they make these small trade-offs toward truth over social cost.

  • Does the person have a history of executing long-term plans that weren’t incentivised by their local environment? Did they decide a personal project (not, like, getting a degree) was worth putting 2 years into, and then put 2 years into it?

  • When I ask about a non-standard belief they have, can they give me a straightforward model with a few variables and simple relations that they use to understand the topic we’re discussing? In general, how transparent are their models to themselves, and are the models generally simple and backed by lots of little pieces of concrete evidence?

  • Are they good at finding genuine insights in the thinking of people who they believe are totally wrong?

My general thought is that there isn’t actually a lot of optimisation put into this, especially in areas that don’t have institutions built around them exactly. For example, academia will probably notice you if you’re very skilled in one discipline and compete directly in it, but it’s very hard to be noticed if you’re interdisciplinary (e.g. Robin Hanson’s book sitting between neuroscience and economics) or if you’re not competing along even just one or two of the dimensions it optimises for (e.g. MIRI researchers don’t optimise for publishing basically at all, so when they make big breakthroughs in decision theory and logical induction it doesn’t get them much notice from standard academia). So even our best institutions for noticing great thinkers with genuine and valuable insights seem to fail at some of the examples that seem most important. I think there is lots of low-hanging fruit I can pick up in terms of figuring out who thinks well and will be able to find and mine deep sources of value.

Edit: Removed Bostrom as an example at the end, because I can’t figure out whether his success in academia, while nonetheless going through something of a non-standard path, is evidence for or against academia’s ability to figure out whose cognitive processes are best at figuring out what’s surprising+true+useful. I have the sense that he had to push against the standard incentive gradients a lot, but I might just be wrong and Bostrom is one of academia’s success stories this generation. He doesn’t look like he just rose to the top of a well-defined field though; it looks like he kept having to pick which topics were important and then find some route to publishing on them, as opposed to the other way round.


Greg Lewis subsequently also responded to Jan’s comment:

I share your caution on the difficulty of ‘picking high-impact people well’: besides the risk of over-fitting on anecdata we happen to latch on to, the past may simply prove underpowered for forward prediction. I’m not sure any system could reliably ‘pick up’ Einstein or Ramanujan, and I wonder how much ‘thinking tools’ etc. are just epiphenomena of IQ.

That said, fairly boring metrics are fairly predictive. People who do exceptionally well at school tend to do well at university, those who excel at university have a better chance of exceptional professional success, and so on and so forth. SPARC (a program aimed at extraordinarily mathematically able youth) seems a neat example. I accept none of these supply an easy model for ‘talent scouting’ intra-EA, but they suggest one can do much better than chance.

Optimal selectivity also depends on the size of the boost you give to people, even if they are imperfectly selected. It’s plausible this relationship could be convex over the ‘one-to-one mentoring to webpage’ range, and so you might have to gamble on something intensive even in expectation of failing to identify most or nearly all of the potentially great people.

(Aside: although it is tricky to put human ability on a cardinal scale, normal-distribution properties for things like working memory suggest cognitive ability (however cashed out) isn’t power-law distributed. One explanation of how this could drive power-law distributions in some fields would be a Matthew effect: being marginally better than competing scientists lets one take the majority of the great new discoveries. This may suggest more neglected areas, or those where the crucial consideration is whether/when something is discovered rather than who discovers it (compare a malaria vaccine to an AGI), are those where the premium to really exceptional talent is less.)
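The Matthew effect described in the aside can also be simulated: draw ability from a normal distribution, then award each discovery entirely to whichever competitor gets there first. A minimal sketch (the competition size, noise level, and all other numbers are illustrative assumptions of mine):

```python
import random

def matthew_effect(n_scientists=500, n_discoveries=2000, noise=0.5, seed=1):
    """Ability is normally distributed, but each discovery goes wholly to
    the strongest (plus some luck) of a small field of competitors, so
    credit ends up far more skewed than ability itself."""
    rng = random.Random(seed)
    ability = [rng.gauss(0, 1) for _ in range(n_scientists)]
    credit = [0] * n_scientists
    for _ in range(n_discoveries):
        field = rng.sample(range(n_scientists), 10)  # who happens to be racing
        winner = max(field, key=lambda i: ability[i] + rng.gauss(0, noise))
        credit[winner] += 1  # winner takes the whole discovery
    return ability, credit

ability, credit = matthew_effect()
credit_sorted = sorted(credit, reverse=True)
# share of all discoveries credited to the top 10% of scientists
top_decile_share = sum(credit_sorted[:50]) / sum(credit_sorted)
```

Marginal differences in normally distributed ability translate into a heavily skewed credit distribution, which is the shape Greg’s aside predicts.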


Jan’s last response to me:

For scientific publishing, I looked into the latest available paper [1], and apparently the data are best fitted by a model where the impact of scientific papers is predicted by Q·p, where p is the “intrinsic value” of the project and Q is a parameter capturing the cognitive ability of the researcher. Notably, Q is independent of the total number of papers written by the scientist, and Q and p are also independent. Translating into the language of digging for gold: the prospectors differ in their speed and ability to extract gold from the deposits (Q). The gold in the deposits actually is randomly distributed. To extract exceptional value, you have to both have high Q and be very lucky. What is encouraging for selecting talent is that Q seems relatively stable across a career and can be usefully estimated after ~20 publications. I would guess you can predict even with less data, but the correct “formula” would be trying to disentangle the interestingness of the problems the person is working on from the interestingness of the results.

(As a side note, I was wrong in guessing this is strongly field-dependent, as the model seems stable across several disciplines, time periods, and many other parameters.)
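The Q·p structure is straightforward to write down. A minimal sketch, with the project-value distribution and all parameters assumed for illustration (the actual paper fits a specific model to citation data; this is not their code):

```python
import math
import random

def simulate_career(Q, n_papers, seed=None):
    """Q-model sketch: each paper's impact is Q * p, where p is a random
    per-project value (log-normal here, an assumed choice) and Q is a
    stable multiplier belonging to the scientist."""
    rng = random.Random(seed)
    return [Q * rng.lognormvariate(0, 1) for _ in range(n_papers)]

def estimate_Q(impacts):
    """With impact = Q * p and E[log p] = 0 under the draw above,
    the mean of the log-impacts estimates log Q."""
    return math.exp(sum(math.log(x) for x in impacts) / len(impacts))

# a ~20-publication career, matching the estimate-after-~20-papers claim
career = simulate_career(Q=5.0, n_papers=20, seed=3)
q_hat = estimate_Q(career)
```

Under this model a scientist’s single best paper is dominated by the luck in p, while an estimate like `q_hat` stabilises after relatively few papers, which is what makes Q usable for talent selection.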

Interesting heuristics about people :)

I agree the problem is somewhat different in areas that are not that established/institutionalized, where you don’t have clear dimensions of competition, or where the well-measurable dimensions are not that well aligned with what is important. Looks like another understudied area.

[1] Quantifying the evolution of individual scientific impact, Sinatra et al., Science, http://www.sciencesuccess.org/uploads/1/5/5/4/15543620/science_quantifying_aaf5239_sinatra.pdf