Simpler explanations of AGI risk

We’re getting a shot at presenting our concerns about AI X-risk to the general public. It would be useful to have a brief presentation that plays well with less-technical people, or technical people who don’t want to listen to a half hour of explanation just at that moment. The other goal here is to avoid polarization with a gentle approach. We don’t want AGI risk to become polarized like the climate change “debate” did.

This is my suggestion for a conversation template, based on personal success. I’m hoping others chip in ideas and say what’s worked for them.

  • If we make something smarter than us, why wouldn’t it become our overlord?

  • We’re going to make AI smarter than us,

    • and by default, it will treat us like we treat all of the species we’ve accidentally eliminated.

  • (You might be able to stop here. For some people, the above is totally intuitive.)

  • We’re not talking about tools here, like all current and previous AI. We’re talking about something more like a new species, with its own goals and the intelligence to figure out how to accomplish them.

  • We don’t know how long this will take,

    • Or how soon it might happen.

      • GPT-4 is acing some exams and failing only the toughest tests of logic

      • and it’s getting smarter with code that prompts it to do things like

        • “check your reasoning”

        • And break problems into pieces

        • And call on other AI tools like WolframAlpha

    • This will almost certainly replace a bunch of jobs,

      • And it’s definitely going to get smarter

  • Something smarter than us will wind up outsmarting us,

    • and doing whatever it wants.

    • (We’re unlikely to put even the first one in a box, given how we’re treating AI now

      • And if we do, it will probably outsmart us and get out

        • And we’ll keep making more until we screw one up)

  • There’s no good reason to think it’s going to be nice

    • Unless we get a hell of a lot better at building it so it’s nice.

    • Nobody knows how.

    • Including the people saying “Oh we’ll figure it out.”

    • Not one of them has a plan that sounds worth betting on,

    • Let alone betting the future of the species on it.

  • But we’re not doomed.

    • We just need to pull together and figure this out

    • But quickly.

  • “Can’t we just...”

    • Maybe. But probably not.

      • Tons of smart people have offered their “can’t we just” suggestions.

        • Not one of them stands up to sober, close inspection.

        • Some of them offer ways to approach the problem, but they don’t make it easy.

    • Making a new being that actually loves us is not easy

    • (Even humans aren’t all that safe for other humans, and we have no idea how to reproduce what makes humans nice).

This approach has worked for me in conversation, but only when I also get the emotional tone right. Logic is emotional for everyone. People without strong rationalist ambitions are even more prone to think with their feelings. So:

  • Don’t argue. Arguing makes people want to prove you wrong. It engages their motivated reasoning and confirmation bias to find counterarguments, and to avoid thinking about your arguments.

    • It’s key to not get dragged into details.

      • You want to keep it brief, and you’ll never get there if you go into a discussion of a point that doesn’t really matter for the main argument.

      • For instance, “What do you mean by smarter?”

  • Don’t sound condescending.

    • Sounding condescending will make them want to prove you wrong, as above

      • This could color their whole take on the topic

        • possibly for years that we don’t have

      • You’ve had this conversation a million times. They haven’t.

        • This all sounds weird and new

        • And the new logic is likely to trip them up.

        • So you’ll need to be patient. If you’re as impatient as I am, this is the hard part.

  • Don’t try to get them to agree with you on the spot.

    • It’s challenging to move on without a conclusion, but it’s important.

    • You can’t change someone’s mind.

    • You can only offer arguments that will cause them to change their own minds,

      • over time,

      • IF they’re thinking about them without wanting to prove you wrong.

This approach is intended for casual conversations, or for times when you’ve got the floor, but you don’t want to overstay your welcome on that floor.

When the conversation gets sidetracked into details, steering it back to the top level, with epistemic modesty, seems useful. You might ask something like “Can you really be sure that something smarter than us won’t outsmart us somehow? I wish I could be sure, but I’m not.” Or you might say something like “It just seems like we shouldn’t trust something that thinks differently than us, if its goals were programmed or trained in without anyone really knowing how to do it.” This may present you as being on the same team and at the same level as the person you’re talking to.

This set of suggestions is offered with low certainty. I’m no expert at persuasion, but I have researched it a bit, and researched cognitive biases a lot.

I also tried to make a similar set of simple presentations as an accordion-style FAQ, to provide as a link instead of in conversation.

So, how could the above be better? Or is my premise mistaken?