Running Effective Structured Forecasting Sessions

(Part 4 of our Forecasting Infrastructure series)

Tl;dr: We’re sharing our structured forecasting session format.


In our quest to build forecasting infrastructure, we’ve reencountered a classic open problem in group rationality: how do you run a meeting? More specifically, how can you run a meeting so as to elicit, aggregate, and distill individual human judgments in a principled manner, such that the whole is greater than the sum of its parts? We wanted to find a meeting format that would help a community of forecasters work together to make better predictions collectively than they would on their own.

It’s a question that’s been studied since at least the 1950s, when the RAND Corporation invented the Delphi Method. The Delphi Method is a structured meeting template wherein a group of experts estimate their answer to a question, share their reasoning, reply to others’ estimates, and then estimate again. It encourages updating in the face of new evidence, and the format has a strong track record of producing well-calibrated predictions.

However, for our needs, it wasn’t quite the right fit. In particular, we didn’t have domain experts; we had expert forecasters. The quintessential superforecaster is good at forecasting but might not have refined models of the domain (in this case AI). We wanted to create a format that would help forecasters rapidly orient to the question. To that end we made a few changes:

- We added a much greater emphasis on collaboratively understanding the question and building models of it.

- We included time for active research into the topic.

For the past eight months we’ve been holding forecasting sessions most Sundays. I think our approach is refined enough and general enough that the template can be used by others to good effect.


The general approach is to take 4-12 forecasters through a series of key steps:

  • Understand the question and your key “technical” uncertainties (e.g., what does this word mean?)

  • Individually make a forecast

  • Discuss as a group the initial forecasts

  • Collaboratively analyze the question from several lenses:

    • Outside view

    • Inside view

    • Key uncertainties

    • Scenario planning

    • What would change my mind?

  • Discuss as a group

  • Make new individual forecasts

  • Share and compare updates
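To make the last two steps concrete, here is a minimal sketch of how a facilitator might summarize the two rounds of individual forecasts. It is only an illustration: the forecaster names and probabilities are hypothetical, and pooling by the median and by the geometric mean of odds is an assumption of this example, not something the session format prescribes.

```python
import math
from statistics import median


def pool_geo_mean_odds(probs):
    """Pool probabilities via the geometric mean of odds.

    Assumes every probability is strictly between 0 and 1.
    """
    log_odds = [math.log(p / (1 - p)) for p in probs]
    mean_log_odds = sum(log_odds) / len(log_odds)
    # Convert the averaged log-odds back into a probability.
    return 1 / (1 + math.exp(-mean_log_odds))


def compare_rounds(initial, final):
    """Summarize how forecasts moved between the first and second round.

    `initial` and `final` map forecaster name -> probability.
    """
    updates = {
        name: round(final[name] - initial[name], 4) for name in initial
    }
    return {
        "initial_median": median(initial.values()),
        "final_median": median(final.values()),
        "initial_pooled": pool_geo_mean_odds(list(initial.values())),
        "final_pooled": pool_geo_mean_odds(list(final.values())),
        "updates": updates,
    }


# Hypothetical forecasts for a single yes/no question.
initial = {"alice": 0.20, "bob": 0.50, "carol": 0.80}
final = {"alice": 0.30, "bob": 0.50, "carol": 0.70}

summary = compare_rounds(initial, final)
for name, delta in summary["updates"].items():
    print(f"{name}: {delta:+.2f}")
```

Looking at the per-person updates next to the pooled numbers makes it easy to see whether the group converged or whether individual views simply shifted in different directions.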

A facilitator leads the session, largely performing a logistical role such as keeping track of time and recording key comments in the document.

Most of the session is spent collaboratively but silently writing in a Google Doc. This enables “multiplexing”: multiple people can work on different points at the same time. Contrast this with a standard online meeting, where everyone’s attention is directed to one person, potentially wasting a lot of brainpower. We also make heavy use of meta tags: using brackets to indicate an [info-request] or support for a point [+1] helps direct attention and signal group consensus.
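If you export the session document as plain text, the bracketed tags are easy to tally afterwards. The sketch below is one possible way to do that, assuming tags are written as short bracketed tokens on the same line as the point they refer to. The sample text and the helper names (`count_tags`, `lines_with_tag`) are hypothetical.

```python
import re
from collections import Counter

# Matches a bracketed tag such as [+1] or [info-request].
TAG_PATTERN = re.compile(r"\[([^\[\]\n]+)\]")


def count_tags(doc_text):
    """Count every bracketed meta tag in the document text."""
    return Counter(
        tag.strip().lower() for tag in TAG_PATTERN.findall(doc_text)
    )


def lines_with_tag(doc_text, tag):
    """Return the lines that carry a given tag, e.g. open info requests."""
    wanted = tag.lower()
    return [
        line.strip()
        for line in doc_text.splitlines()
        if wanted in (t.strip().lower() for t in TAG_PATTERN.findall(line))
    ]


# Hypothetical excerpt from a session document.
sample_doc = """\
Outside view: past compute trends rarely stalled for two years. [+1] [+1]
Does the question count algorithmic efficiency gains? [info-request]
Chip supply shocks could pause the trend. [+1]
"""

tags = count_tags(sample_doc)
open_requests = lines_with_tag(sample_doc, "info-request")
print(tags)
print(open_requests)
```

A facilitator could run something like this near the end of a session to list unanswered info requests before the final round of forecasts.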

Key “goals” of the format:

  • Create an environment in which there is an interactive back-and-forth on the question.

  • Approach the question from different angles. For example, create an outside view model of the question and then an inside view model of the question. The different approaches complement one another and can reveal blind spots.

  • Generate sub-questions that can be answered or forecast.

  • Encourage flexibility and switching between breadth-first and depth-first approaches for investigating a topic.

For a taste, here’s a transcript from our AI forecasting session on a set of questions assessing the likelihood that, by a certain year, there will be a 2-year interval in which the AI-compute trend did not double. You can see from the transcript how the format can elicit models and beliefs, and allow back-and-forth between participants in a way that identifies the key uncertainties driving the forecast.


I feel confident saying the session format helps in clarifying and understanding the forecasting question: every time I’ve participated I’ve come away with a much deeper understanding of the contours of the question. Similar to the surprising effectiveness of Fermi estimation, it’s surprisingly effective to spend 10-30 minutes collaboratively poking at a forecasting question.

On the other hand, I’m not entirely confident that we’re not increasing the risk of groupthink. The original Delphi method had every participant be anonymous; we haven’t tried that, but I’d like to.

Even though much of the session happens in a Google Doc, I’d recommend using a video chat platform. For the past few months our sessions have been conducted over Discord, so that we could easily have multiple “voice channels” if session participants want to break off and discuss one on one. That feature has been nice, but I fear we’ve lost an ineffable quality by relying on voice chat. There’s a way in which seeing everyone’s face, even if you’re working silently, helps sharpen and focus attention on the question at hand: you know you’re in this together.

Finally, on a meta level, I’m surprised at how useful purposefully designing a custom meeting format was. I bet there are a lot of easy gains to be made through more custom tailoring of conversations (more experimentation like Robin Hanson’s EquaTalk idea would be neat).


  • We’ve also experimented with “operationalization” workshops, where the main goal is to take an unformed intent and turn it into a well-operationalized question, and “speed-forecasting” workshops, where we take several questions and rapidly share models and forecasts.

  • The IDEA method was also an inspiration for our forecasting sessions.
