My Elevator Pitch for FAI

This is a short introduction to the idea of Friendly AI (FAI) and existential risk from technology that I’ve used with decent success among my social circles, which consist mostly of mathematicians, or at least people who have taken an introductory CS class.

I’ll do my best to dissect what I think is effective about it, mostly as an exercise for myself. I encourage people to adapt this to their own purposes.

So, technology is getting more powerful over time, right? That is, as time goes on, it gets easier and easier to do more and more. If we extrapolate that to its logical extreme (and obviously there are some issues there, but let’s just pretend), eventually we should be able to press a button and recreate the entire world however we want.

But computers don’t do exactly what we want; they do exactly what we say. So if we ever get to the point of having that button, it’s very important that we know exactly what we want. And not at the level of “I want a sandwich right now,” but at the level of actually programming it into a computer.

(This is 90% of the inferential gap; also usually the above fits into a literal elevator ride.)

Again, we probably won’t have a literal button that remakes the entire universe. But we will probably have smarter-than-human AI at some point.

Imagine putting humans into a world of chimpanzees. You don’t have to imagine that hard; humans evolved in a world that already had chimpanzees. And now humans are everywhere, there are tons of us, and if we all decided that chimpanzees should die, then all chimpanzees would die. They wouldn’t even know what was going on. A few humans died at first, and we still die for dumb reasons, but humans overall have a lot of power over everything else.

Now imagine putting AI into a world of humans. And if you don’t want the world to be a Luddite dictatorship, you have to imagine that people will keep creating AIs, even if the first few don’t take off. The same thing is likely to happen. AIs will take over, and we’ll live or die at their whim.

Fortunately for chimps, humans feel pretty friendly toward chimps most of the time. So we really want AIs to be friendly toward us. Which means we need to figure out what it actually means to be friendly, at a level that we can program into a computer.

Smarter-than-human AI is probably a fair distance away. But if you look at how fast AI research progresses and compare it to how fast philosophy research progresses, I don’t think AI is further away than philosophers actually agreeing on what people want out of life. They can’t even agree on whether God exists.

Guesses as to why this is effective (i.e. applications of Dark Arts 101):

Open with a rhetorical question that your audience will likely agree with. If necessary, talk about some examples, like computers. Reduce it to a nice soundbite. Also: stay really informal.

Ask them to extrapolate, and guide them to something to extrapolate. Oversimplify a lot so that you can talk about something simple, but acknowledge that you’re oversimplifying. Especially among mathematicians, the audience should fill in some gaps on their own; it makes them feel a bit more ownership over the idea, and allows them to start agreeing with you before things get too intense.

Recall a fact that they agree with and can sympathize with: computers doing what you say, not what you want. If they don’t have this background, it will be much harder to bridge the gap.
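If you want a concrete prop for this step, here’s a toy Python sketch (my own illustration, not part of the pitch itself) of the say/want gap: the intent is “find the largest number,” but what we actually said compares values as strings, so the computer cheerfully gives the wrong answer.

    # Toy illustration of "does what you say, not what you want":
    # we *want* the largest number, but we *said* to compare values
    # as strings, so lexicographic order wins.
    def largest(values):
        return max(values, key=str)  # "9" > "42" > "100" as strings

    print(largest([100, 9, 42]))  # prints 9, not 100

Anyone who has taken that introductory CS class has been bitten by something like this, which is exactly the sympathy you’re leaning on.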

The next step is now just putting two and two together; hopefully they’re doing this in their head and can almost complete the sentence:

we need to know what we want if we get that magic button. The magic button is a good thing, too, so it’s not scary to agree!

And codify it into something more precise: programming the answer to a philosophical question (vague, difficult to answer) into a computer (extremely precise and picky). They should be able to register this as something very difficult, but possibly solvable.

After this, pointing at the intelligence difference between chimps and humans and projecting it to humans vs. AI usually works OK, but you could easily switch to talking about nanotech or whatnot.

For nanotech or biotech I recommend the line: “Any improvement is a change, but most changes just give you cancer and kill you.” This goes over well with biochemists.