The Optimizer’s Curse and How to Beat It

The best laid schemes of mice and men
Go of­ten askew,
And leave us noth­ing but grief and pain,
For promised joy!

- Robert Burns (trans­lated)

Con­sider the fol­low­ing ques­tion:

A team of de­ci­sion an­a­lysts has just pre­sented the re­sults of a com­plex anal­y­sis to the ex­ec­u­tive re­spon­si­ble for mak­ing the de­ci­sion. The an­a­lysts recom­mend mak­ing an in­no­va­tive in­vest­ment and claim that, al­though the in­vest­ment is not with­out risks, it has a large pos­i­tive ex­pected net pre­sent value… While the anal­y­sis seems fair and un­bi­ased, she can’t help but feel a bit skep­ti­cal. Is her skep­ti­cism jus­tified?1

Or, sup­pose Holden Karnofsky of char­ity-eval­u­a­tor GiveWell has been pre­sented with a com­plex anal­y­sis of why an in­ter­ven­tion that re­duces ex­is­ten­tial risks from ar­tifi­cial in­tel­li­gence has as­tro­nom­i­cal ex­pected value and is there­fore the type of in­ter­ven­tion that should re­ceive marginal philan­thropic dol­lars. Holden feels skep­ti­cal about this ‘ex­plicit es­ti­mated ex­pected value’ ap­proach; is his skep­ti­cism jus­tified?

Suppose you’re a business executive considering n alternatives whose ‘true’ expected values are μ_1, …, μ_n. By ‘true’ expected value I mean the expected value you would calculate if you could devote unlimited time, money, and computational resources to making the expected value calculation.2 But you only have three months and $50,000 with which to produce the estimate, and this limited study produces estimated expected values V_1, …, V_n for the alternatives.

Of course, you choose the alternative i* that has the highest estimated expected value V_{i*}. You implement the chosen alternative, and get the realized value x_{i*}.

Let’s call the difference x_{i*} - V_{i*} the ‘postdecision surprise’.3 A positive surprise means your option brought about more value than your analysis predicted; a negative surprise means you were disappointed.

Assume, too kindly, that your estimates are unbiased, and suppose you use this decision procedure many times, for many different decisions. It seems reasonable to expect that on average you will receive the estimated expected value of each decision you make in this way: sometimes you’ll be positively surprised, sometimes negatively surprised, but the surprises should average out to zero.

Alas, this is not so; on average your outcome will be worse than what you predicted, even if each of your estimates was unbiased!

Why?

...con­sider a de­ci­sion prob­lem in which there are k choices, each of which has true es­ti­mated [ex­pected value] of 0. Sup­pose that the er­ror in each [ex­pected value] es­ti­mate has zero mean and stan­dard de­vi­a­tion of 1, shown as the bold curve [be­low]. Now, as we ac­tu­ally start to gen­er­ate the es­ti­mates, some of the er­rors will be nega­tive (pes­simistic) and some will be pos­i­tive (op­ti­mistic). Be­cause we se­lect the ac­tion with the high­est [ex­pected value] es­ti­mate, we are ob­vi­ously fa­vor­ing overly op­ti­mistic es­ti­mates, and that is the source of the bias… The curve in [the figure be­low] for k = 3 has a mean around 0.85, so the av­er­age dis­ap­point­ment will be about 85% of the stan­dard de­vi­a­tion in [ex­pected value] es­ti­mates. With more choices, ex­tremely op­ti­mistic es­ti­mates are more likely to arise: for k = 30, the dis­ap­point­ment will be around twice the stan­dard de­vi­a­tion in the es­ti­mates.4

This is “the op­ti­mizer’s curse.” See Smith & Win­kler (2006) for the proof.
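To see the effect for yourself, here is a minimal simulation sketch of the setup in the quote above. It assumes Gaussian estimation errors (the quote only specifies zero mean and a standard deviation of 1), and the function name and parameters are hypothetical, chosen for illustration.

```python
import random

def mean_disappointment(k, trials=20000, sigma=1.0, seed=0):
    """Average amount by which the chosen option's estimate exceeds its true value.

    Assumptions: every option has a true expected value of 0, and each
    estimate is the true value plus zero-mean Gaussian noise with standard
    deviation sigma.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        estimates = [rng.gauss(0.0, sigma) for _ in range(k)]
        # We pick the option with the highest estimate; its true value is 0,
        # so the disappointment is simply the winning estimate itself.
        total += max(estimates) - 0.0
    return total / trials

for k in (1, 3, 30):
    print(k, round(mean_disappointment(k), 2))
```

With a single option the average disappointment should come out near zero, since nothing is being selected. With k = 3 and k = 30 it should land close to the figures in the quote, roughly 0.85 and 2 standard deviations.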

The Solution

The solu­tion to the op­ti­mizer’s curse is rather straight­for­ward.

...[we] model the uncertainty in the value estimates explicitly and use Bayesian methods to interpret these value estimates. Specifically, we assign a prior distribution on the vector of true values μ = (μ_1, …, μ_n) and describe the accuracy of the value estimates V = (V_1, …, V_n) by a conditional distribution V|μ. Then, rather than ranking alternatives based on the value estimates, after we have done the decision analysis and observed the value estimates V, we use Bayes’ rule to determine the posterior distribution for μ|V and rank and choose among alternatives based on the posterior means...

The key to overcoming the optimizer’s curse is conceptually very simple: treat the results of the analysis as uncertain and combine these results with prior estimates of value using Bayes’ rule before choosing an alternative. This process formally recognizes the uncertainty in value estimates and corrects for the bias that is built into the optimization process by adjusting high estimated values downward. To adjust values properly, we need to understand the degree of uncertainty in these estimates and in the true values.5
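As a rough illustration of that adjustment, here is a sketch for one simple special case. It assumes each true value has an independent normal prior and each estimate has normally distributed error with a known standard deviation. Under those assumptions the posterior mean is the standard conjugate-normal result: the estimate is pulled toward the prior mean, and noisier estimates are pulled harder. The function name, the prior, and the two alternatives are hypothetical, not taken from Smith & Winkler.

```python
def posterior_mean(v, sigma, prior_mean=0.0, prior_sd=1.0):
    """Posterior mean of a true value mu given an estimate v.

    Assumes mu ~ Normal(prior_mean, prior_sd^2) and
    v | mu ~ Normal(mu, sigma^2), with all alternatives independent.
    """
    # Weight on the estimate: near 1 for precise estimates, near 0 for noisy ones.
    weight = prior_sd**2 / (prior_sd**2 + sigma**2)
    return prior_mean + weight * (v - prior_mean)

# Hypothetical alternatives: (name, estimate, standard deviation of estimate error)
options = [("A", 3.0, 2.0), ("B", 2.0, 0.5)]

naive = max(options, key=lambda o: o[1])                        # rank by raw estimate
bayes = max(options, key=lambda o: posterior_mean(o[1], o[2]))  # rank by posterior mean

print(naive[0], bayes[0])  # prints "A B"
```

Here A has the higher raw estimate, but its estimate is so noisy that its posterior mean (0.6) falls below B's (1.6), so the Bayesian ranking prefers B. Real analyses would need a defensible prior and an honest assessment of each estimate's error, which is the hard part.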

To re­turn to our origi­nal ques­tion: Yes, some skep­ti­cism is jus­tified when con­sid­er­ing the op­tion be­fore you with the high­est ex­pected value. To min­i­mize your pre­dic­tion er­ror, treat the re­sults of your de­ci­sion anal­y­sis as un­cer­tain and use Bayes’ The­o­rem to com­bine its re­sults with an ap­pro­pri­ate prior.

Notes

1 Smith & Win­kler (2006).

2 Lindley et al. (1979) and Lindley (1986) talk about ‘true’ ex­pected val­ues in this way.

3 Fol­low­ing Har­ri­son & March (1984).

4 Quote and (adapted) image from Rus­sell & Norvig (2009), pp. 618-619.

5 Smith & Win­kler (2006).

References

Har­ri­son & March (1984). De­ci­sion mak­ing and post­de­ci­sion sur­prises. Ad­minis­tra­tive Science Quar­terly, 29: 26–42.

Lindley, Tversky, & Brown (1979). On the reconciliation of probability assessments. Journal of the Royal Statistical Society, Series A, 142: 146–180.

Lindley (1986). The rec­on­cili­a­tion of de­ci­sion analy­ses. Oper­a­tions Re­search, 34: 289–295.

Rus­sell & Norvig (2009). Ar­tifi­cial In­tel­li­gence: A Modern Ap­proach, Third Edi­tion. Pren­tice Hall.

Smith & Winkler (2006). The optimizer’s curse: Skepticism and postdecision surprise in decision analysis. Management Science, 52: 311–322.