I imagine a grounded career would be one in which you can see a direct link between your actions and their results, e.g., careers in medicine. Strangely enough, Earning to Give seems to fit this criterion (provided you give to a grounded charity like GiveDirectly or something). I’m not necessarily endorsing this philosophy though—I think there are a lot of important and impactful careers that have really loose feedback loops and uncertain outputs, like journalism or science communication.
I’m also not sure that the math works out the same if you change the question from something like “how to optimize your individual impact” to “what algorithm would produce the greatest value if followed by many actors”. Most startups fail, and most startup founders also overestimate their chances and go for speculative shots. But some startups do make it really big, and it seems better for society that lots of people try and fail (but some succeed) than if all founders followed a more “grounded” strategy.
Claude thought that this does change things, and that the article is premised on assuming that there are no outlier interventions with massively beneficial effects to be found. If there is a possible speculative approach that could be enormously beneficial, then it can be valuable for society that a lot of people go for risky shots in order to find it, even if each individual should expect to fail.
Claude Opus 4.6
Prompt
does the math in https://forum.effectivealtruism.org/posts/q2TfTirvspCTH2vbZ/the-best-cause-will-disappoint-you-an-intro-to-the work out the same if you change the question from something like “optimizing your individual impact” to “what algorithm would produce the greatest value if followed by many actors”?
more specifically, most startups fail, and most startup founders also overestimate their chances and go for speculative shots. but some startups do make it really big, and it seems better for society that lots of people try and fail than if all founders followed a more “grounded” strategy.
Claude
I think the answer is that the math changes substantially — the article’s conclusions are correct for its framing but don’t generalize to the collective case, for several interconnected reasons.
The core asymmetry: selection vs. exploration
The optimizer’s curse is fundamentally about selection bias in estimation. When you pick the top-ranked option from noisy estimates, you’re systematically selecting for positive noise. The article proves this cleanly. But the question “did I accurately estimate the value of my chosen option?” is a different question from “did the population of actors generate more total value?”
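A short simulation makes this concrete. This is my own sketch, loosely following the article’s setup (true effects drawn from a normal with mean 0 and SD 100, with estimation noise on the same scale); the specific constants are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n_options, n_trials = 20, 100_000

# True effects ~ N(0, 100); noisy estimates add independent N(0, 100) error.
true = rng.normal(0, 100, size=(n_trials, n_options))
estimates = true + rng.normal(0, 100, size=(n_trials, n_options))

# In each trial, select the option whose *estimate* is highest.
chosen = estimates.argmax(axis=1)
rows = np.arange(n_trials)

print("mean estimate of chosen option:  ", estimates[rows, chosen].mean())
print("mean true value of chosen option:", true[rows, chosen].mean())
```

With equal prior and noise variance, the chosen option’s true value averages half its estimate, which the printout confirms up to Monte Carlo error: picking the maximum reliably picks positive noise along with genuine quality.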
When many actors each independently pick what looks best to them — each individually subject to the optimizer’s curse — the collective outcome has a different structure. You’re no longer making one bet on one noisy estimate. You’re running a portfolio of bets that covers more of the space. Each founder’s overconfidence about their startup is individually irrational in the article’s sense, but the population of overconfident founders collectively explores far more of the opportunity landscape than a population of well-calibrated founders would.
The distributional assumption does a lot of work
The article models true effects as normally distributed (mean 0, SD 100). This is doing enormous work for the conclusion that speculative interventions are overrated. With a normal distribution, there are no outsized winners to discover — the expected true value of any randomly drawn intervention is the same (the mean), regardless of how many you draw.
Startup outcomes follow something much closer to a power law. The distribution of true outcomes has a heavy right tail: most startups are worth approximately nothing, but a few are worth billions. Under heavy-tailed distributions, the expected value of the maximum of N independent draws grows substantially with N — and the aggregate value across all draws is dominated by the winners in the tail. The math of the optimizer’s curse still applies to each individual’s estimate (they’re still overestimating their particular selection), but the cost of this overestimation is bounded (you lose your investment), while the benefit of having many people explore is that you’re much more likely to find the genuinely extreme positive outliers that actually exist in the distribution.
This is a critical distinction. The article’s setup has no extreme positive outliers to find, so exploration has no value. Change the generating distribution to a power law and the entire calculus flips.
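To see how much the tail shape matters, here is a quick comparison of the expected maximum of N draws under the two regimes. This is my own illustration, not the article’s: the Pareto shape (alpha = 1.5, i.e. finite mean but infinite variance) and the scale factor of 100 are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
trials = 5_000

for n in (10, 100, 1000):
    # Thin tail: N(0, 100). The max of n draws grows only like sqrt(log n).
    normal_max = rng.normal(0, 100, size=(trials, n)).max(axis=1).mean()
    # Heavy tail: numpy's pareto() samples the Lomax (Pareto II) distribution
    # with shape alpha; the max of n draws grows like n**(1/alpha).
    pareto_max = (100 * rng.pareto(1.5, size=(trials, n))).max(axis=1).mean()
    print(f"n={n:5d}   E[max] normal ~ {normal_max:7.0f}   heavy-tailed ~ {pareto_max:9.0f}")
```

Multiplying n by 100 barely doubles the normal maximum, while the heavy-tailed maximum keeps climbing (and its Monte Carlo estimate is itself noisy, fittingly). Under the normal assumption there is nothing to find; under the power law, more searchers means dramatically bigger finds.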
The Bayesian correction is individually rational but collectively harmful
The article’s recommended correction — shrink estimates toward the prior mean, with more shrinkage for more uncertain interventions — is the right thing for any single actor to do if they want to maximize their own expected outcome. But if every actor follows this algorithm, the population shifts heavily toward the “grounded” interventions. You get less variance in what people attempt, less exploration of the space, and you’re much less likely to find the rare interventions that are genuinely 100x better than anything in the “grounded” set.
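For reference, in the normal-normal model the article works with, this correction has a simple closed form. A minimal sketch, assuming a prior mean of 0 and a prior SD of 100 to match the article’s parameters:

```python
def shrunk_estimate(estimate, noise_sd, prior_mean=0.0, prior_sd=100.0):
    """Posterior mean under a normal prior and normal estimation noise.

    The weight prior_sd**2 / (prior_sd**2 + noise_sd**2) pulls noisier
    estimates harder toward the prior mean.
    """
    w = prior_sd**2 / (prior_sd**2 + noise_sd**2)
    return prior_mean + w * (estimate - prior_mean)

# A "grounded" intervention (low noise) keeps most of its estimate;
# a "speculative" one (high noise) is shrunk far more aggressively.
print(shrunk_estimate(200, noise_sd=50))   # ~160: keeps 80% of the estimate
print(shrunk_estimate(200, noise_sd=300))  # ~20:  keeps 10% of the estimate
```

The weight depends only on the noise-to-prior variance ratio, so when every actor applies it, the high-noise “speculative” options are uniformly pushed to the bottom of everyone’s rankings, which is exactly the population-level homogenization described above.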
This is essentially a positive externality problem. Each person who takes the “speculative” bet and fails has generated information (this approach doesn’t work) and explored territory (leaving more of the remaining space for others), but they don’t capture these benefits. The individually rational Bayesian correction understates the social value of speculative bets because it doesn’t account for these externalities.
Feedback mechanisms change the one-shot framing
The article models a one-shot game: estimate, select, done. In practice, startups (and many real-world interventions) operate in an environment with ongoing feedback. You start a company, the market tells you whether it’s working, you either grow or shut down. This feedback mechanism means the cost of initially overestimating is time-bounded — failed ventures get pruned, and resources eventually flow toward the things that actually work. The optimizer’s curse in the one-shot model assumes you’re stuck with your noisy selection forever. In a dynamic setting with feedback, the noise in initial selection gets corrected by subsequent evidence, and the main thing that matters is whether someone, somewhere, started the thing that turned out to genuinely work.
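A toy two-period version of this (my own construction; the launch threshold and the single feedback round are arbitrary simplifications) shows how even one round of feedback concentrates resources on the bets that genuinely work:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Period 1: ventures launch only if their noisy initial estimate looks great.
true = rng.normal(0, 100, size=n)
initial_estimate = true + rng.normal(0, 100, size=n)
started = initial_estimate > 100

# Period 2: the market supplies a second, independent signal;
# ventures whose updated evidence looks bad shut down.
market_signal = true + rng.normal(0, 100, size=n)
survives = started & (market_signal > 0)

print("mean true value of launched ventures: ", true[started].mean())
print("mean true value of surviving ventures:", true[survives].mean())
```

The launched cohort is still optimistically selected (the optimizer’s curse is intact), but the surviving cohort is markedly better, and further feedback rounds would sharpen this: the cost of initial overestimation is paid once, while the discovered winners keep producing value.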
Where the analogy has limits
The startup case is the strongest version of your argument because it combines heavy tails, bounded downside (limited liability), market feedback, and huge exploration value. The argument weakens when:
The true distribution really is closer to normal (e.g., comparing similar medical interventions where effects are moderate and similar in magnitude). If there genuinely aren’t extreme positive outliers to find, then the article’s math applies more directly to the collective case too.
The downside isn’t bounded. If each speculative attempt has catastrophic failure modes that impose costs on others (not just the actor), then population-level exploration can be net negative. This matters for, say, biosecurity interventions or AI policy experiments where the “failure” case isn’t just “nothing happened.”
There’s no feedback mechanism to prune failures. If you’re allocating a fixed charitable fund once and walking away, you’re closer to the one-shot game, and the optimizer’s curse correction is more appropriate even at the population level.
The “speculative” interventions really do have the same true effect distribution as the “grounded” ones. The article explicitly sets these as equal. Your argument implicitly assumes that the speculative category has a different true effect distribution (heavier-tailed, with more variance in actual value). If speculative just means “harder to estimate” without also meaning “potentially much better,” the article’s conclusion holds even at the population level.
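That last condition is easy to check in simulation. A minimal sketch, assuming every option shares one true distribution and “speculative” means only a noisier estimate (the noise levels are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
trials, n_opts = 50_000, 10

# All options share the same true distribution, N(0, 100); half are
# "grounded" (noise SD 50) and half "speculative" (noise SD 300).
true = rng.normal(0, 100, size=(trials, n_opts))
noise_sd = np.array([50.0] * 5 + [300.0] * 5)
est = true + rng.normal(0, 1, size=(trials, n_opts)) * noise_sd

# Normal-normal posterior mean: shrink each estimate by its own noise level.
shrunk = est * (100**2 / (100**2 + noise_sd**2))

rows = np.arange(trials)
print("mean true value, naive selection: ", true[rows, est.argmax(axis=1)].mean())
print("mean true value, shrunk selection:", true[rows, shrunk.argmax(axis=1)].mean())
```

Shrinkage wins here even when the result is summed across many actors, because there is no hidden tail for the naive strategy to discover: the article’s conclusion survives the collective reframing exactly when this condition holds.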
The synthesis
The article proves something real about selection bias, but it’s answering a narrower question than it appears to. The question “what should one actor do to maximize their expected impact, given noisy estimates?” has a different answer than “what allocation algorithm across many actors maximizes total value?” The latter needs to account for exploration value, portfolio diversification, distributional shape (especially tail behavior), and dynamic feedback — none of which appear in the toy model. Your startup intuition is pointing at this gap correctly, and the economics literature on optimal experimentation, portfolio theory, and the social value of risk-taking broadly supports the view that the collectively optimal level of “speculative” investment is substantially higher than what individually rational Bayesian agents would choose.