I think these are fair criticisms of the “defer to superforecasters” view (which I share), and you helped me clarify some of my views here (thanks!), but I feel like the critique is missing a few things. The best case for deference, as I see it, goes something like this:
The world is very hard to predict, and expertise is often overrated in complex domains.
Your first argument—that superforecasters lack domain-specific track records—doesn’t carry much weight if the relevant forecasting questions require broad expertise across regulation, diffusion dynamics, and technical capabilities simultaneously. No one has a verified track record across all of these, and “domain expert” here often just means “has strong inside views in a complicated area.”
On the selection effect: I don’t buy the strong version of your second claim. The fact that there’s low or no signal in non-verifiable domains (e.g. philosophy) doesn’t really vindicate inside views—it weakens both. Any somewhat independent signal aggregated across multiple actors is probably better than a single inside view, even an expert one.
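To gesture at why: here is a toy simulation (everything about it is made up, the noise model especially) in which several forecasters each see the same quantity through independent errors, and the simple average beats any single one of them. The caveat is that this only works to the extent the errors really are independent rather than shared.

```python
import random

random.seed(0)

TRUE_VALUE = 0.3      # the quantity being forecast (made up)
NOISE = 0.2           # per-forecaster error scale (made up)
N_FORECASTERS = 10
TRIALS = 10_000

single_err = pooled_err = 0.0
for _ in range(TRIALS):
    # each forecaster sees the truth plus independent Gaussian noise
    estimates = [TRUE_VALUE + random.gauss(0, NOISE) for _ in range(N_FORECASTERS)]
    single_err += abs(estimates[0] - TRUE_VALUE)                    # one inside view
    pooled_err += abs(sum(estimates) / N_FORECASTERS - TRUE_VALUE)  # simple average

print(f"single forecaster mean abs error: {single_err / TRIALS:.3f}")
print(f"pooled mean abs error:            {pooled_err / TRIALS:.3f}")
```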
On track record: The benchmark underperformance is a real update against superforecasters in the AI case, but the question is how large that update should be. My framing: superforecasters are the prior; evidence of their underperformance updates you toward domain experts with better track records—and yes, I do think it updates toward people like yourself, Ryan, Eli, Daniel, and Peter Wildeford, who have been more right. But by how much? That’s the crux, and I’d genuinely like to see someone work through the math. A few people being more accurate than superforecasters on a hard problem doesn’t automatically license large updates toward their broader worldviews—we should be asking what reference class of questions they outperformed on, and whether that tracks the specific claims we care about. I’d also note that knowing how benchmarks saturate is less relevant to AI risk than you seem to think—the revenue point is stronger.
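To show the kind of math I mean (every number below is a made-up placeholder, not an estimate of anyone’s actual record): treat it as a Bayes update on the hypothesis that the domain experts are the better guide, where each resolved question they called better than the superforecasters counts as a win.

```python
# Toy Bayes update: how much should a handful of "wins" over superforecasters
# shift you toward the domain experts? All numbers are illustrative assumptions.

PRIOR_EXPERTS_BETTER = 0.3  # prior that the experts are the better guide (assumed)
P_WIN_IF_BETTER = 0.7       # chance they beat the superforecaster consensus on a question, if they are (assumed)
P_WIN_IF_NOT = 0.4          # chance they beat it anyway, if they are not (assumed)

def posterior(wins: int, losses: int) -> float:
    """Posterior that the experts are the better guide, given their record."""
    like_better = P_WIN_IF_BETTER ** wins * (1 - P_WIN_IF_BETTER) ** losses
    like_not = P_WIN_IF_NOT ** wins * (1 - P_WIN_IF_NOT) ** losses
    num = PRIOR_EXPERTS_BETTER * like_better
    return num / (num + (1 - PRIOR_EXPERTS_BETTER) * like_not)

for wins, losses in [(3, 0), (5, 1), (10, 2)]:
    print(f"{wins}-{losses} record -> P(experts better) = {posterior(wins, losses):.2f}")
```

The point of the toy version is that the size of the update is extremely sensitive to the assumed per-question likelihood ratio and to whether the resolved questions come from the reference class we actually care about.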
I’d also push back on a common error I see: people often conclude that “no clear expert → my inside view gets more weight.” This is probably true at the margin, but massively overrated in practice.
On your argument that object-level reasoning obsoletes base rates: This is somewhat circular. You have inside views about what it means to reason well about AI progress, and superforecasters disagree. You’re partially bootstrapping from your own beliefs to dismiss theirs.
On inside views and group epistemics: I agree that deference cascades are bad, but the fix isn’t “everyone uses their inside view”—it’s that people should be clearer about when they’re reasoning from the inside view vs. the outside view (I agree this is complicated and maybe idealistic, but I don’t think the default for rationalists here should be to take the inside view of the community). I’m also skeptical that inside-view reasoning escapes the groupthink problem. Epistemic bubbles shape which counterarguments you seek out, what your priors are, and which information you weight. The AI safety/rationalist community isn’t immune to this.
I do think people should build inside views on AI—and the move of not doing so because it’s “not relevant to my field” is more often cope than a principled stance. But I’m genuinely uncertain about what the right policy is after you’ve built one. Surely the answer isn’t just “act on it fully”—the outside view still has to do some work. One practical resolution: argue on inside view, but take actions that at least partially reflect outside-view uncertainty.
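As one sketch of what “partially reflect outside-view uncertainty” could look like in practice (the weights here are pure illustration, not a recommendation): pool the two probabilities in log-odds space before acting on them.

```python
import math

def logit(p: float) -> float:
    return math.log(p / (1 - p))

def pool(p_inside: float, p_outside: float, w_inside: float = 0.5) -> float:
    """Weighted log-odds pool of an inside-view and an outside-view probability."""
    z = w_inside * logit(p_inside) + (1 - w_inside) * logit(p_outside)
    return 1 / (1 + math.exp(-z))

# Illustrative numbers only: a confident inside view, a more sceptical outside view.
print(pool(0.80, 0.20, w_inside=0.6))  # ≈ 0.57: act as if genuinely uncertain
```

A weight of 1.0 recovers “act on it fully”; anything lower keeps the outside view doing some of the work.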
A real remaining question: in non-verifiable domains, who counts as an expert? This is, I think, just an open and hard problem.

Happy to hear counters.
> On your argument that object-level reasoning obsoletes base rates: This is somewhat circular. You have inside views about what it means to reason well about AI progress, and superforecasters disagree. You’re partially bootstrapping from your own beliefs to dismiss theirs.
Oops, “object-level reasoning obsoletes base rates” is not what I was trying to argue… my view is that the action is mostly in selecting the right base rate, i.e. that AI is more analogous to a new species than to a normal technology.
Also, I don’t agree that it’s circular. I think one of the correct reasons to defer to someone is that they make arguments I find correct (as evaluated by my inside view), and that doesn’t apply in this case. I definitely agree that I’m bootstrapping from my views to dismiss theirs. Now, there might be other reasons to defer to someone (for example, the other reasons I gave above), but here I was arguing specifically against reason #3.