Exercises in Comprehensive Information Gathering

Looking back, several of the most durably valuable exercises I’ve done over the years share a general theme: comprehensive information gathering.

The most recent example involves capital investments. Economists talk about “capital goods” as physical stuff: machines, buildings, etc. But in practice, savings and investments are passed through banks and ETFs, bundled and securitized, and involve debts and shares of companies which own debts and shares of other companies, and so forth. Where does all that capital end up? To get an intuitive sense, I pulled up fundamental data on about 7000 US publicly traded companies in Quantopian, sorted them by amount of non-financial assets, and found that the top 100 accounted for about 50% of the non-financial assets of the whole set. Then I looked at annual reports for each of those 100 companies to see what capital assets they had. I googled around for pictures and maps of where those assets were located, and read up on anything I hadn’t heard of before. What’s a “central office”, where are they, what do they look like, and why does AT&T have $90B worth of them? What are the major US oil basins, where are the wells, and what all goes into drilling them? What are the technical differences between traditional phone, cable, satellite, and cell networks, and how do those technical differences impact the capital requirements of each? Who runs power plants and the power grid in various parts of the country? What are the major US railroads, and where are they? Why did GE own so many airplanes? These are the kinds of questions which come up when you want to know what “capital goods” actually consist of in the real world.
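The sort-and-sum step above is simple enough to sketch in a few lines of Python. This is only an illustration under stated assumptions: the asset figures are made-up toy numbers (not the Quantopian data), and the function names `top_n_share` and `smallest_n_for_share` are hypothetical.

```python
from typing import List


def top_n_share(assets: List[float], n: int) -> float:
    """Fraction of total assets held by the n largest entries."""
    ranked = sorted(assets, reverse=True)
    total = sum(ranked)
    if total == 0:
        return 0.0
    return sum(ranked[:n]) / total


def smallest_n_for_share(assets: List[float], target: float) -> int:
    """Smallest number of top entries whose combined assets reach `target` of the total."""
    ranked = sorted(assets, reverse=True)
    total = sum(ranked)
    running = 0.0
    for count, value in enumerate(ranked, start=1):
        running += value
        if running >= target * total:
            return count
    return len(ranked)


# Toy, made-up non-financial asset figures (arbitrary units), one per company.
toy_assets = [90, 60, 30, 10, 5, 3, 2]

print(top_n_share(toy_assets, 2))             # share held by the top 2
print(smallest_n_for_share(toy_assets, 0.5))  # how many companies cover half
```

With real data, the same two numbers (the share held by the top 100, and how many companies it takes to cover half the total) say how short a reading list can be while still covering most of the assets.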

Another interesting exercise: I read through five years of Nature archives, reading all the titles and any abstracts which sounded novel or interesting. I didn’t google everything I hadn’t heard of; instead, I’d wait until the same acronym popped up a few times before looking it up. This took maybe a week of evenings after work. By the end, I could at least place the large majority of articles in context. Now, when I see a title full of jargon in a field I haven’t studied, like “Novel tau filament fold in corticobasal degeneration”, I usually understand enough to guess at what it’s relevant to (in this case: neurodegenerative disease involving protein aggregates, probably Alzheimer’s?). I can generally follow conversations in a bunch of different fields, not necessarily between specialists in the same sub-sub-field, but at least at the level of a typical conference talk. And when I meet new people, I can ask not-too-embarrassing questions about what they’re researching.
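I did this by eye, but the “wait until the same acronym pops up a few times” rule could be roughly automated if you had the titles as text. The sketch below is an assumption-laden illustration: the titles are invented, the function name `recurring_acronyms` is hypothetical, and the regex only catches all-caps tokens of two or more characters (so mixed-case terms like “mRNA” are missed).

```python
import re
from collections import Counter
from typing import Iterable, List

# Assumption: an "acronym" is a standalone token of 2+ capital letters/digits,
# starting with a letter. Mixed-case terms such as "mRNA" are not matched.
ACRONYM = re.compile(r"\b[A-Z][A-Z0-9]+\b")


def recurring_acronyms(titles: Iterable[str], min_count: int = 3) -> List[str]:
    """Acronyms appearing in at least `min_count` titles, most frequent first."""
    counts: Counter = Counter()
    for title in titles:
        # Count each acronym once per title, however often it repeats within it.
        counts.update(set(ACRONYM.findall(title)))
    return [term for term, seen in counts.most_common() if seen >= min_count]


# Invented example titles, for illustration only.
sample_titles = [
    "CRISPR screen in T cells",
    "A CRISPR toolkit",
    "GWAS of height",
    "CRISPR base editing",
    "GWAS meta-analysis",
]

print(recurring_acronyms(sample_titles, min_count=3))
```

Raising or lowering `min_count` trades off how much you look up against how much unexplained jargon you tolerate.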

Going back further: if you’re in college, I strongly recommend reading your entire course catalogue, googling anything you’ve never heard of at all, and marking anything that sounds potentially interesting. This seems really obvious; it only takes a few hours, and something something a pile of value sitting on a silver platter right in front of you. (Note: I went to a small STEM school; if you’re at a big school with a bajillion courses, or a school with poor STEM coverage, or not at college at all, consider reading an MIT/Caltech course catalogue instead, to get a feel for what all is out there.) You never know what surprising and interesting topics might be hiding in there: microfluidics, underactuated robotics, recursive macroeconomics, systems biology, synthetic biology, origami algorithms, computational photography, evo-devo, procedural graphics, and on and on.

These sorts of exercises provide value in a few ways:

  • They reveal unknown unknowns: things you didn’t even realize were missing from your picture of the world.

  • You can’t make a map of a city by sitting in your room with the shades drawn; exercises like these force you to look at large slices of the world.

  • Knowledge within fields tends to have decreasing marginal returns: your first physics or CS class will teach you much more than your eighth. These exercises give a broad, brief glance at many areas where you probably haven’t reached decreasing marginal returns yet.

  • You can get a very rough big-picture sense of how much effort other people are investing in various areas (e.g. where most capital investments go or where most research effort goes), which is useful for understanding the world in general.

  • While these exercises don’t avoid biased selection of information altogether, their biases are probably different from the ones you run into naturally, and they’re systematic enough that we can guess at what biases are likely to be present.

  • They’re a lot of fun, if you have a curious streak.

Most importantly: I’ve found each of these exercises to have lasting, long-term value in exchange for a one-time investment of effort.

Other exercises which are on my to-do list, but which I haven’t done yet:

I’m curious to hear other suggestions for exercises along these lines.