Has this worked so far? How many cases can we point to where a person who was publicly skeptical of AI danger changed their public stance, as a result of seeing experimental evidence?
What’s the theory of change for “generating evidence of danger”? The people who are most grim about AI would probably tell you that there is already plenty of evidence. How will adding more evidence to the pile help? Who will learn about this new evidence, and how will it cause them to behave differently?
Here’s a theory of change I came up with. (You might wish to spend some time brainstorming your own theory of change before reading my idea, in case you’re able to independently generate something different and better.)
Phil Tetlock found: “Many experts claimed that they assigned higher probabilities to outcomes that materialised than they did. As Tetlock notes, it is hard to say someone got it wrong if they think they got it right.” Based on this result, it seems plausible that whatever experimental evidence comes in, various AI experts who appear to disagree with one another will all claim that the experimental evidence bolsters their position, and shows that they were right all along.
To address this issue, I would suggest an adversarial collaboration structure. Get two experts with different views—Eliezer Yudkowsky and Yann LeCun, say—to identify a concrete experiment for which they predict different results. Design the experiment so the outcome is relatively unambiguous. Have the experiment implemented and run by an ecumenical team. Place a public bet on the outcome, with the loser agreeing in advance to concede the bet publicly. I think this has a genuine shot at changing the public conversation in a way that the current slow drip of evidence has not.
It seems somewhat urgent to set up this adversarial collaboration structure. Every experimental result which comes in without advance expert pre-registration is another chance for experts to come up with rationalizations for why the new evidence doesn’t show that they were wrong.
Relatedly, I highly recommend Book Review: How Minds Change.