Modeling Transformative AI Risk (MTAIR)

This is a sequence outlining work by the Modeling Transformative AI Risk (MTAIR) project. The project attempts to map out the relationships between key hypotheses and cruxes involved in debates about catastrophic risks from advanced AI, and to convert that hypothesis/crux map into a software-based model that can incorporate probability estimates and other quantitative factors in ways that may be useful for exploration, planning, and decision support.

This series of posts presents a preliminary version of the model, along with a discussion of some of our plans going forward. Its primary purpose is to inform the community about our progress and, we hope, to contribute meaningfully to the ongoing discussion.

A critical secondary goal is to gather feedback from the community on this model, as a form of expert engagement and elicitation. Although the project is still very much a work in progress, we believe we have reached a stage where we can productively engage the community to solicit feedback, critiques, and suggestions. We also welcome ideas for further collaboration from interested readers and from those working on related projects.

Modelling Transformative AI Risks (MTAIR) Project: Introduction

Analogies and General Priors on Intelligence

Paths To High-Level Machine Intelligence

Takeoff Speeds and Discontinuities

Modeling Risks From Learned Optimization

Modeling the impact of safety agendas

Modeling Failure Modes of High-Level Machine Intelligence

Elicitation for Modeling Transformative AI Risks