Model Combination and Adjustment

The de­bate on the proper use of in­side and out­side views has raged for some time now. I sug­gest a way for­ward, build­ing on a fam­ily of meth­ods com­monly used in statis­tics and ma­chine learn­ing to ad­dress this is­sue — an ap­proach I’ll call “model com­bi­na­tion and ad­just­ment.”

In­side and out­side views: a quick review

1. There are two ways you might predict outcomes for a phenomenon. If you make your predictions using a detailed visualization of how something works, you’re using an inside view. If you instead ignore the details of how something works, and make your predictions by assuming that a phenomenon will behave roughly like other similar phenomena, you’re using an outside view (also called reference class forecasting).

In­side view ex­am­ples:

  • “When I break the pro­ject into steps and vi­su­al­ize how long each step will take, it looks like the pro­ject will take 6 weeks”

  • “When I combine what I know of physics and computation, it looks like the serial speed formulation of Moore’s Law will break down around 2005, because we haven’t been able to scale down energy-use-per-computation as quickly as we’ve scaled up computations per second, which means it will run into roadblocks from energy consumption and heat dissipation.”

Out­side view ex­am­ples:

  • “I’m go­ing to ig­nore the de­tails of this pro­ject, and in­stead com­pare my pro­ject to similar pro­jects. Other pro­jects like this have taken 3 months, so that’s prob­a­bly about how long my pro­ject will take.”

  • “The se­rial speed for­mu­la­tion of Moore’s Law has held up for sev­eral decades, through sev­eral differ­ent phys­i­cal ar­chi­tec­tures, so it’ll prob­a­bly con­tinue to hold through the next shift in phys­i­cal ar­chi­tec­tures.”

See also chap­ter 23 in Kah­ne­man (2011); Plan­ning Fal­lacy; Refer­ence class fore­cast­ing. Note that, af­ter sev­eral decades of past suc­cess, the se­rial speed for­mu­la­tion of Moore’s Law did in fact break down in 2004 for the rea­sons de­scribed (Ful­ler & Millett 2011).

2. An out­side view works best when us­ing a refer­ence class with a similar causal struc­ture to the thing you’re try­ing to pre­dict. An in­side view works best when a phe­nomenon’s causal struc­ture is well-un­der­stood, and when (to your knowl­edge) there are very few phe­nom­ena with a similar causal struc­ture that you can use to pre­dict things about the phe­nomenon you’re in­ves­ti­gat­ing. See: The Out­side View’s Do­main.

When writ­ing a text­book that’s much like other text­books, you’re prob­a­bly best off pre­dict­ing the cost and du­ra­tion of the pro­ject by look­ing at similar text­book-writ­ing pro­jects. When you’re pre­dict­ing the tra­jec­tory of the se­rial speed for­mu­la­tion of Moore’s Law, or pre­dict­ing which space­ship de­signs will suc­cess­fully land hu­mans on the moon for the first time, you’re prob­a­bly best off us­ing an (in­tensely in­formed) in­side view.

3. Some things aren’t very pre­dictable with ei­ther an out­side view or an in­side view. Some­times, the thing you’re try­ing to pre­dict seems to have a sig­nifi­cantly differ­ent causal struc­ture than other things, and you don’t un­der­stand its causal struc­ture very well. What should we do in such cases? This re­mains a mat­ter of de­bate.

Eliezer Yud­kowsky recom­mends a weak in­side view for such cases:

On prob­lems that are drawn from a bar­rel of causally similar prob­lems, where hu­man op­ti­mism runs ram­pant and un­fore­seen trou­bles are com­mon, the Out­side View beats the In­side View… [But] on prob­lems that are new things un­der the Sun, where there’s a huge change of con­text and a struc­tural change in un­der­ly­ing causal forces, the Out­side View also fails—try to use it, and you’ll just get into ar­gu­ments about what is the proper do­main of “similar his­tor­i­cal cases” or what con­clu­sions can be drawn there­from. In this case, the best we can do is use the Weak In­side View — vi­su­al­iz­ing the causal pro­cess — to pro­duce loose qual­i­ta­tive con­clu­sions about only those is­sues where there seems to be lop­sided sup­port.

In con­trast, Robin Han­son recom­mends an out­side view for difficult cases:

It is easy, way too easy, to gen­er­ate new mechanisms, ac­counts, the­o­ries, and ab­strac­tions. To see if such things are use­ful, we need to vet them, and that is eas­iest “nearby”, where we know a lot. When we want to deal with or un­der­stand things “far”, where we know lit­tle, we have lit­tle choice other than to rely on mechanisms, the­o­ries, and con­cepts that have worked well near. Far is just the wrong place to try new things.

There are a bazillion pos­si­ble ab­strac­tions we could ap­ply to the world. For each ab­strac­tion, the ques­tion is not whether one can di­vide up the world that way, but whether it “carves na­ture at its joints”, giv­ing use­ful in­sight not eas­ily gained via other ab­strac­tions. We should be wary of in­vent­ing new ab­strac­tions just to make sense of things far; we should in­sist they first show their value nearby.

In Yud­kowsky (2013), sec. 2.1, Yud­kowsky offers a re­ply to these para­graphs, and con­tinues to ad­vo­cate for a weak in­side view. He also adds:

the other ma­jor prob­lem I have with the “out­side view” is that ev­ery­one who uses it seems to come up with a differ­ent refer­ence class and a differ­ent an­swer.

This is the prob­lem of “refer­ence class ten­nis”: each par­ti­ci­pant in the de­bate claims their own refer­ence class is most ap­pro­pri­ate for pre­dict­ing the phe­nomenon un­der dis­cus­sion, and if dis­agree­ment re­mains, they might each say “I’m tak­ing my refer­ence class and go­ing home.”

Re­spond­ing to the same point made el­se­where, Robin Han­son wrote:

[Ear­lier, I] warned against over-re­li­ance on “un­vet­ted” ab­strac­tions. I wasn’t at all try­ing to claim there is one true anal­ogy and all oth­ers are false. In­stead, I ar­gue for prefer­ring to rely on ab­strac­tions, in­clud­ing cat­e­gories and similar­ity maps, that have been found use­ful by a sub­stan­tial in­tel­lec­tual com­mu­nity work­ing on re­lated prob­lems.

Mul­ti­ple refer­ence classes

Yud­kowsky (2013) adds one more com­plaint about refer­ence class fore­cast­ing in difficult fore­cast­ing cir­cum­stances:

A fi­nal prob­lem I have with many cases of ‘refer­ence class fore­cast­ing’ is that… [the] fi­nal an­swers [gen­er­ated from this pro­cess] of­ten seem more spe­cific than I think our state of knowl­edge should al­low. [For ex­am­ple,] I don’t think you should be able to tell me that the next ma­jor growth mode will have a dou­bling time of be­tween a month and a year. The alleged out­side viewer claims to know too much, once they stake their all on a sin­gle preferred refer­ence class.

Both this com­ment and Han­son’s last com­ment above point to the vuln­er­a­bil­ity of rely­ing on any sin­gle refer­ence class, at least for difficult fore­cast­ing prob­lems. Be­ware brit­tle ar­gu­ments, says Paul Chris­ti­ano.

One ob­vi­ous solu­tion is to use mul­ti­ple refer­ence classes, and weight them by how rele­vant you think they are to the phe­nomenon you’re try­ing to pre­dict. Holden Karnofsky writes of in­ves­ti­gat­ing things from “many differ­ent an­gles.” Jonah Sinick refers to “many weak ar­gu­ments.” Statis­ti­ci­ans call this “model com­bi­na­tion.” Ma­chine learn­ing re­searchers call it “en­sem­ble learn­ing” or “clas­sifier com­bi­na­tion.”

In other words, we can use many out­side views.

Nate Silver does this when he pre­dicts elec­tions (see Silver 2012, ch. 2). Ven­ture cap­i­tal­ists do this when they eval­u­ate star­tups. The best poli­ti­cal fore­cast­ers stud­ied in Tet­lock (2005), the “foxes,” tended to do this.

In fact, most of us do this reg­u­larly.

How do you pre­dict which restau­rant’s food you’ll most en­joy, when vis­it­ing San Fran­cisco for the first time? One out­side view comes from the restau­rant’s Yelp re­views. Another out­side view comes from your friend Jade’s opinion. Another out­side view comes from the fact that you usu­ally en­joy Asian cuisines more than other cuisines. And so on. Then you com­bine these differ­ent mod­els of the situ­a­tion, weight­ing them by how ro­bustly they each tend to pre­dict your eat­ing en­joy­ment, and you grab a taxi to Osha Thai.

(Tech­ni­cal note: I say “model com­bi­na­tion” rather than “model av­er­ag­ing” on pur­pose.)
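To make the restaurant example concrete, here is a minimal sketch of one simple way to combine several outside views: a weighted mean. A weighted mean is only the simplest special case of model combination, and nothing here is meant as the one correct procedure. The 0-10 enjoyment scale, the second restaurant, and all scores and weights are hypothetical numbers invented for illustration.

```python
def combine(predictions, weights):
    """Combine several models' predictions with a weighted mean.

    predictions: dict mapping model name -> predicted enjoyment (0-10 scale).
    weights: dict mapping model name -> how much you trust that model.
    """
    if not predictions or set(predictions) != set(weights):
        raise ValueError("need exactly one weight per prediction")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(predictions[name] * weights[name] for name in predictions)
    return weighted_sum / total_weight


# Hypothetical outside views: Yelp reviews, Jade's opinion, cuisine preference.
restaurants = {
    "Osha Thai": {"yelp": 8.0, "jade": 9.0, "cuisine": 8.5},
    "Burger place": {"yelp": 8.5, "jade": 5.0, "cuisine": 6.0},
}

# Hypothetical weights: how robustly each view has predicted enjoyment before.
weights = {"yelp": 0.2, "jade": 0.3, "cuisine": 0.5}

scores = {name: combine(views, weights) for name, views in restaurants.items()}
best = max(scores, key=scores.get)
print(best, scores)
```

With these made-up numbers, no single view decides the outcome: Yelp slightly favors the burger place, but the two more heavily weighted views favor Osha Thai.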

Model com­bi­na­tion and adjustment

You can prob­a­bly do even bet­ter than this, though — if you know some things about the phe­nomenon and you’re very care­ful. Once you’ve com­bined a hand­ful of mod­els to ar­rive at a qual­i­ta­tive or quan­ti­ta­tive judg­ment, you should still be able to “ad­just” the judg­ment in some cases us­ing an in­side view.

For example, suppose I used the above process, and I plan to visit Osha Thai for dinner. Then, somebody gives me my first taste of the Synsepalum dulcificum fruit. I happen to know that this fruit contains a molecule called miraculin which binds to one’s taste buds and makes sour foods taste sweet, and that this effect lasts for about an hour (Koizumi et al. 2011). Despite the results of my earlier model combination, I predict I won’t particularly enjoy Osha Thai at the moment. Instead, I decide to try some Tabasco sauce, to see whether it now tastes like doughnut glaze.

In some cases, you might also need to ad­just for your prior over, say, “ex­pected en­joy­ment of restau­rant food,” if for some rea­son your origi­nal model com­bi­na­tion pro­ce­dure didn’t cap­ture your prior prop­erly.
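The two kinds of adjustment described above can be sketched as small functions applied to a combined estimate. This is only an illustration under stated assumptions: the starting score, the prior, the trust level, and the size of the miraculin penalty are all hypothetical, and the function names are made up for this example. The only detail taken from the text is that the effect lasts about an hour.

```python
def shrink_toward_prior(estimate, prior, trust):
    """Pull a combined estimate toward a prior.

    trust is in [0, 1]: 1.0 keeps the estimate as-is, 0.0 returns the prior.
    """
    if not 0.0 <= trust <= 1.0:
        raise ValueError("trust must be between 0 and 1")
    return trust * estimate + (1.0 - trust) * prior


def adjust_for_miraculin(estimate, minutes_since_fruit,
                         effect_minutes=60.0, penalty=0.5):
    """Inside-view adjustment: lower expected enjoyment while taste is altered.

    effect_minutes follows the 'about an hour' figure in the text;
    penalty is a hypothetical multiplier chosen for illustration.
    """
    if 0 <= minutes_since_fruit < effect_minutes:
        return estimate * penalty
    return estimate


combined = 8.55  # hypothetical combined score on a 0-10 enjoyment scale
with_prior = shrink_toward_prior(combined, prior=6.0, trust=0.8)

score_now = adjust_for_miraculin(with_prior, minutes_since_fruit=10)
score_later = adjust_for_miraculin(with_prior, minutes_since_fruit=90)
print(with_prior, score_now, score_later)
```

The adjustment comes last and is narrow: it only changes the answer where specific causal knowledge applies, and it leaves the combined estimate untouched once the effect has worn off.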

Against “the out­side view”

There is a lot more to say about model com­bi­na­tion and ad­just­ment (e.g. this), but for now let me make a sug­ges­tion about lan­guage us­age.

Sometimes, small changes to our language can help us think more accurately. For example, gender-neutral language can reduce male bias in our associations (Stahlberg et al. 2007). In this spirit, I recommend we retire the phrase “the outside view...”, and instead use phrases like “some outside views...” and “an outside view...”

My rea­sons are:

  1. Speak­ing of “the” out­side view priv­ileges a par­tic­u­lar refer­ence class, which could make us over­con­fi­dent of that par­tic­u­lar model’s pre­dic­tions, and leave model un­cer­tainty un­ac­counted for.

  2. Speak­ing of “the” out­side view can act as a con­ver­sa­tion-stop­per, whereas speak­ing of mul­ti­ple out­side views en­courages fur­ther dis­cus­sion about how much weight each model should be given, and what each of them im­plies about the phe­nomenon un­der dis­cus­sion.
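The first reason can be illustrated numerically. The sketch below assumes each reference class yields a forecast with its own mean and standard deviation, and treats the weighted set of forecasts as a mixture. The total spread then includes the disagreement between reference classes, which a single preferred class hides. The project-duration numbers and weights are hypothetical, and a mixture is only one way to represent model uncertainty.

```python
import math


def mixture_mean_and_std(models):
    """Mean and standard deviation of a weighted mixture of forecasts.

    models: list of (weight, mean, std) tuples, one per reference class.
    Uses the law of total variance: within-model variance plus
    between-model variance.
    """
    total_weight = sum(w for w, _, _ in models)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    mean = sum(w * m for w, m, _ in models) / total_weight
    variance = sum(w * (s ** 2 + (m - mean) ** 2) for w, m, s in models) / total_weight
    return mean, math.sqrt(variance)


# Hypothetical forecasts of project duration in weeks: (weight, mean, std).
reference_classes = [
    (0.5, 12.0, 2.0),  # similar past projects
    (0.3, 6.0, 1.0),   # projects by the same team
    (0.2, 20.0, 3.0),  # projects using unfamiliar tools
]

mean, std = mixture_mean_and_std(reference_classes)
largest_single_std = max(s for _, _, s in reference_classes)
print(mean, std, largest_single_std)
```

With these numbers the combined standard deviation is larger than that of any single reference class, because the classes disagree with each other. Staking everything on one class would report the narrower spread and so claim to know more than the evidence supports.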