“Can we know what to do about AI?”: An Introduction

I’m currently working on a research project for MIRI, and I would welcome feedback as I proceed. In this post, I describe the project.

As part of an effort to steel-man objections to MIRI’s mission, MIRI Executive Director Luke Muehlhauser has asked me to develop the following objection:

“Even if AI is somewhat likely to arrive during the latter half of this century, how on earth can we know what to do about it now, so far in advance?”

In Luke’s initial email to me, he wrote:

I think there are plausibly many weak arguments and historical examples suggesting that P: “it’s very hard to nudge specific distant events in a positive direction through highly targeted actions or policies undertaken today.” Targeted actions might have no lasting effect, or they might completely miss their mark, or they might backfire.

If P is true, this would weigh against the view that a highly targeted intervention today (e.g. Yudkowsky’s Friendly AI math research) is likely to positively affect the future creation of AI, and might instead weigh in favor of the view that all we can do about AGI from this distance is to engage in broad interventions likely to improve our odds of wisely handling future crises in general — e.g. improving decision-making institutions, spreading rationality, etc.

I’m interested in abstract arguments for P, but I’m even more interested in historical data. What can we learn from seemingly analogous cases, and are those cases analogous in the relevant ways? What sorts of counterfactual history can we do to clarify our picture?

Luke and I brainstormed a list of potential historical examples of people predicting the future 10+ years out and using those predictions to inform their actions. We came up with the following candidates, listed in chronological order by approximate year:

In addition, we selected

as background reading.

I would greatly appreciate any ideas from the Less Wrong community concerning potential historical examples and relevant background reading.

Over the coming weeks, I’ll be making a series of discussion posts on Less Wrong reporting my findings, and linking to those posts here.