The Epistemology of AI Risk

Link post

(Disclaimer: Philip Trammell is planning to rewrite this blog post to make it clearer and more precise, but I think there’s something good in this direction. I’m sharing it to see what the rest of the community thinks. The author is Philip Trammell, not me.)

“Some smart people, including some of my friends, believe that advanced AI poses a serious threat to human civilization in the near future, and that AI safety research is therefore one of the most valuable uses, if not the very most valuable use, of philanthropic talent and money. But most smart people, as far as I can judge their behavior—including some, like Mark Zuckerberg and Robin Hanson, who have expressed their thoughts on this explicitly—do not believe this. (I, for whatever it’s worth, am agnostic.) In my experience, when someone points out the existence of smart skeptics like these, believers often respond: “Sure, those people dismiss AI risk. But have they engaged with the arguments?”

If the answer is no, it seems obvious that those who have engaged with the arguments have nothing to learn from these skeptics’ judgment. If you aren’t worried about rain because you saw a weather report that predicts sun, and I also saw that but also saw an updated weather report that now predicts rain, I should predict rain—not update on your rain skepticism, however smart you may be. Likewise, if Mark Zuckerberg dismisses AI risk because his one exposure to the idea was a Paul Christiano blog post from 2015 with a mistake in it, which a 2016 blog post corrects, then it seems that we who have read both should not update our beliefs at all in light of Zuckerberg’s opinion. And when we look at the distribution of opinion among those who have really “engaged with the arguments”, we are left with a substantial majority—maybe everyone but Hanson, depending on how stringent our standards are here!—who do believe that, one way or another, AI development poses a serious existential risk.

But something must be wrong with this inference, since it works for all kinds of mutually contradictory positions. The majority of scholars of every religion are presumably members of that religion. The majority of those who best know the arguments for and against thinking that a given social movement is the world’s most important cause, from pro-life-ism to environmentalism to campaign finance reform, are presumably members of that social movement. The majority of people who have seriously engaged with the arguments for flat-earthism are presumably flat-earthers. I don’t even know what those arguments are.

What’s going wrong, I think, is something like this. People encounter uncommonly-believed propositions now and then, like “AI safety research is the most valuable use of philanthropic money and talent in the world” or “Sikhism is true”, and decide whether or not to investigate them further. If they decide to hear out a first round of arguments but don’t find them compelling enough, they drop out of the process. (Let’s say that how compelling an argument seems is its “true strength” plus some random, mean-zero error.) If they do find the arguments compelling enough, they consider further investigation worth their time. They then tell the evangelist (or search engine or whatever) why they still object to the claim, and the evangelist (or whatever) brings a second round of arguments in reply. The process repeats.

As should be clear, this process can, after a few iterations, produce a situation in which most of those who have engaged with the arguments for a claim beyond some depth believe in it. But this is just the filtering mechanism at work: the deeper arguments were only ever heard by people who already, by chance, found the initial arguments compelling. If people were chosen at random and forced to hear out all the arguments, most would not be persuaded.
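The filtering mechanism above can be made concrete with a small simulation. This is my own illustrative sketch, not the author’s model: the parameter values (three rounds, a negative true strength, unit-variance Gaussian noise) are arbitrary assumptions chosen only to show the selection effect.

```python
import random

def simulate(n_people=100_000, n_rounds=3, true_strength=-0.5, seed=0):
    """Filtering model: each round, a person hears an argument whose
    perceived strength is its true strength plus mean-zero noise.
    They only keep engaging if the current round seemed compelling
    (perceived strength > 0).

    Returns (belief rate among those who engaged with every round,
             belief rate if everyone were forced through every round).
    """
    rng = random.Random(seed)
    engaged_believers = n_engaged = forced_believers = 0
    for _ in range(n_people):
        perceived = [true_strength + rng.gauss(0, 1) for _ in range(n_rounds)]
        # Self-selected path: drop out at the first uncompelling round.
        if all(p > 0 for p in perceived):
            n_engaged += 1
            # A survivor's overall verdict: average impression is positive.
            if sum(perceived) / n_rounds > 0:
                engaged_believers += 1
        # Forced path: everyone hears all rounds, then judges on average.
        if sum(perceived) / n_rounds > 0:
            forced_believers += 1
    return engaged_believers / n_engaged, forced_believers / n_people
```

Even though the arguments are stipulated to be weak on net (negative true strength), essentially everyone who “engaged with all the arguments” ends up a believer, because surviving each round already required perceiving it as compelling, while only a minority of the forced, unfiltered population is persuaded.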

Perhaps more disturbingly, if the case for the claim in question is presented as a long fuzzy inference, with each step seeming plausible on its own, individuals will drop out of the process by rejecting the argument at random steps, each of which most observers would accept. Believers will then be in the extremely secure-feeling position of knowing not only that most people who engage with the arguments are believers, but even that, for any particular skeptic, her particular reason for skepticism seems false to almost everyone who knows its counterargument.

The upshot here seems to be that when a lot of people disagree with the experts on some issue, one should often give a lot of weight to the popular disagreement, even when one is among the experts and the people’s objections sound insane. Epistemic humility can demand more than deference in the face of peer disagreement: it can demand deference in the face of disagreement from one’s epistemic inferiors, as long as they’re numerous. They haven’t engaged with the arguments, but there is information to be extracted from the very fact that they haven’t bothered engaging with them.”