Formalizing the “AI x-risk is unlikely because it is ridiculous” argument

There is a lot of good writing on technical arguments against AI x-risk, such as Where I agree and disagree with Eliezer (which mostly argues for more uncertainty), among others. In the wider world, however, the most popular argument is more of the form “it is ridiculous” or “it is sci-fi” or some other gut feeling. In particular, I think this is the only way people reach extremely low credence in AI doom (low enough that they worry about other disasters instead).

Although this seems like a fallacy, in this post I will attempt to formalize the argument. Not only is it a legitimate argument, I think it turns out to be extremely strong!

In my judgement, the arguments for x-risk are still stronger than, or at least balanced against, the “it is ridiculous” argument, but it nonetheless deserves serious study. In particular, I think analyzing and critiquing it should become part of the public discourse on AI. For example, I suspect there are flaws in the argument that, once revealed, would cause people to become more worried about AI x-risk. I do not quite know what these flaws are yet. In any case, I hope this post will allow us to start studying the argument.

The argument is actually a bunch of separate arguments that tend to be lumped together into one “it is ridiculous” argument.

For the purposes of this post, Bob is skeptical of AI x-risk and Alice argues that the risk is real.

Existential risk would stop a 12,000-year trend

Setting aside all of human history for a moment, Bob first assumes that our priors for the long-term future are essentially human-agnostic. The vast majority of possible outcomes contain no humans at all; they are just arbitrary arrangements of matter (paperclips, diamonds, completely random configurations, and so on). So the argument will need to build a case that the long-term future will actually be good for humans, despite this prior.

Next, Bob takes human history into account. The total value of the stuff humans consume tends to go up. In particular, it seems to follow a power law, which shows up as a straight line on the log-log graph below.

Graph from Modeling the Human Trajectory

This means Bob has the gods of straight lines on his side!

This should result in a massive update of Bob’s prior towards “the future will have lots of things that humans like”.

Most people, of course, don’t track economic data or think about power laws, but they have an intuitive sense of human progress. That progress has been pretty robust to a wide variety of disasters, but it would not survive an existential catastrophe, so the trend itself is evidence that x-risk simply won’t occur.
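
To make the “gods of straight lines” heuristic concrete, here is a minimal sketch of what Bob is implicitly doing: fitting a straight line in log-log space and extrapolating it. The data points and constants are made up for illustration; they are not the actual GWP series.

```python
import numpy as np

# Illustrative, made-up data: x = a time index, y = "total value humans consume".
# A power law y = c * x**k is a straight line in log-log space: log y = log c + k * log x.
x = np.array([1.0, 3.0, 10.0, 30.0, 100.0, 300.0])
y = np.array([1.2, 4.1, 15.0, 48.0, 160.0, 510.0])

# Fit the straight line in log-log space.
k, log_c = np.polyfit(np.log(x), np.log(y), deg=1)

# The "gods of straight lines" step: extrapolate the same line far beyond the data.
x_future = 1000.0
y_future = np.exp(log_c) * x_future ** k
print(f"fitted exponent k ~ {k:.2f}; extrapolated y({x_future:.0f}) ~ {y_future:.0f}")
```

The heuristic step is the extrapolation at the end: nothing in the fit itself justifies extending the line, but the trend’s 12,000-year track record is what Bob treats as permission to do so.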

Clever arguments fail in unexpected ways

However, trends do break sometimes, and AI seems pretty dangerous. In fact, Alice has very good technical theories of why it is dangerous.

But if you go through history, you find that even very good theories are hit and miss. A good theory is enough to locate the hypothesis, but it still has a decent chance of being wrong.

Alice might say, “but if the theory fails, that might just mean AI is bad for humans in a way I didn’t expect, not that AI is safe”. But Bob’s prior does say AI is safe, thanks to the gods of straight lines. And Alice does not have a god of straight lines for AI doom; if anything, AI has tended to get more useful to humans over time, not less.
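
To see how this cashes out numerically, here is a minimal sketch of the kind of update Bob might be doing. The prior and reliability numbers are made-up assumptions, purely for illustration.

```python
# Toy Bayes update: how much should a good-but-fallible doom theory move Bob?
# Every number here is an illustrative assumption, not a claim about the real probabilities.

prior_doom = 0.05            # Bob's trend-based prior: AI is probably safe
p_theory_if_doom = 0.8       # a convincing doom theory is likely to exist if doom is real
p_theory_if_safe = 0.3       # but convincing-looking theories also show up when they are wrong

posterior_doom = (p_theory_if_doom * prior_doom) / (
    p_theory_if_doom * prior_doom + p_theory_if_safe * (1 - prior_doom)
)
print(f"posterior P(doom) ~ {posterior_doom:.2f}")  # ~ 0.12: a real update, but no flip
```

The point of the sketch is only that a theory which is merely “good enough to locate the hypothesis” raises Bob’s credence without overturning the trend-based prior.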

Among the ideas Bob has heard, sci-fi-ness and badness are correlated

Science fiction is a pretty good predictor of the future, in that future progress has often been anticipated by some earlier sci-fi story. However, if Bob discovers that an idea he has just heard already appeared in sci-fi, on average this provides evidence against the idea.

That’s because if an idea is both bad and did not come from sci-fi, Bob is unlikely to hear about it at all. Thus, conditioned on Bob having heard of an idea, being sci-fi and being bad become correlated (a selection effect along the lines of Berkson’s paradox).
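
This selection effect is easy to simulate. In the minimal sketch below (all parameters are made up), sci-fi-ness and badness are generated independently, yet they become positively correlated once we condition on Bob having heard the idea.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Sci-fi-ness and badness are independent by construction.
is_scifi = rng.random(n) < 0.3
is_bad = rng.random(n) < 0.5

# Bob hears an idea if it is good (good ideas spread), or if sci-fi made it memorable anyway.
heard = ~is_bad | (is_scifi & (rng.random(n) < 0.8))

corr_all = np.corrcoef(is_scifi, is_bad)[0, 1]
corr_heard = np.corrcoef(is_scifi[heard], is_bad[heard])[0, 1]
print(f"correlation over all ideas:   {corr_all:+.2f}")   # ~ 0
print(f"correlation over heard ideas: {corr_heard:+.2f}")  # clearly positive
```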

Bob should partially discount Alice’s arguments that would benefit Alice if Bob believed them

Alice has a lot to gain from Bob believing in AI x-risk. If AI x-risk isn’t actually that bad, she still gains social clout and/or donations in the short term. If AI x-risk is a serious concern, Bob believing in it might decrease the chance of everyone, including Alice, dying. This is amplified by the fact that accepting Alice’s argument would demand a much larger change in Bob’s priorities than most other arguments do.

In other words, Bob has reason to believe that Alice might be a clever arguer. Depending on the details, Alice’s argument might even be evidence against x-risk if Bob isn’t sufficiently satisfied with it. (I have also seen the reverse: convincing people that AI progress is bad by pointing out how insane the e/acc people are.)
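
Here is a minimal sketch of why the incentives matter, again with made-up numbers: if Alice would argue for x-risk roughly regardless of whether it is real, the likelihood ratio of her argument is close to 1, and Bob’s update is small.

```python
# Toy model of discounting a motivated arguer. All probabilities are illustrative assumptions.

def posterior(p_argues_if_risk, p_argues_if_safe, prior):
    """Bayes update on the observation 'Alice argues for x-risk'."""
    return p_argues_if_risk * prior / (
        p_argues_if_risk * prior + p_argues_if_safe * (1 - prior)
    )

prior_risk = 0.05

# A disinterested expert mostly argues for x-risk only when it is real: strong evidence.
print(posterior(0.9, 0.1, prior_risk))  # ~ 0.32

# Alice gains clout/donations either way, so she argues either way:
# the likelihood ratio shrinks toward 1 and Bob barely moves.
print(posterior(0.9, 0.7, prior_risk))  # ~ 0.06
```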

Interestingly, Timnit Gebru has theorized that belief in AI x-risk also massively benefits AI companies (since it encourages investment to keep ahead of the competition). She seems to imply that discussion promoting belief in AI x-risk is therefore being astroturfed, for example through donations to the effective altruist movement.

In general, this heuristic is reinforced by the many causes and disasters people hear about from all sorts of persuasive people.

Alice’s lack of grass touching

Most AI x-risk discussion occurs online, where there is a large negativity bias.

Conclusion

I think this is a decent formalization of the various gut reactions skeptical of AI x-risk.

However, all of these arguments are flawed, of course. Sci-fi does come true sometimes, lines do sometimes stop[1], and so on.

I think that, in the AI discourse, we should try to find historical examples of how well these gut reactions, and these formalizations of them, have performed. I suspect that (1) we will find the argument did pretty well, but (2) the cases where it failed will convince many people that looking more deeply into AI x-risk is worth it.


What do you think? Did I miss any arguments that should also be counted as part of the “it’s ridiculous” argument?

  1. ^

    Another interesting point is that the “12,000-year trend” argument has a serious limitation. This is significant because it’s the only thing that allowed us to overcome our apocalyptic priors. Without it, Bob has no reason to believe things will go well for humanity even if Alice’s arguments are wrong.

    The limitation is that the model does not actually make any predictions for the year 2047 or beyond, because the x-axis only includes times before 2047, no matter how far you extend it to the right. So the gods of straight lines will abandon Bob, and he will need to come up with a new model. (A small sketch of this is included at the end of this footnote.)

    However, like I said in that section, most people are not thinking about mathematical models. The heuristic is just “human civilization gets bigger”.
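
    To make the limitation concrete, here is a minimal sketch. It assumes the graph’s x-axis is the (log-scaled) time remaining until 2047, so the straight line corresponds to value ∝ (2047 − t)^(−k); the constants are made up for illustration.

    ```python
    # A straight line on log-log axes, where x is the time remaining until 2047, means
    #     log(value) = log(c) - k * log(2047 - t),  i.e.  value = c * (2047 - t)**(-k).
    # The constants below are made up for illustration.
    c, k = 10.0, 1.5

    def trend_value(year):
        years_left = 2047 - year
        if years_left <= 0:
            raise ValueError("the model's x-axis only covers times before 2047")
        return c * years_left ** (-k)

    print(trend_value(2024))  # fine: the line still says something here
    print(trend_value(2047))  # raises: the gods of straight lines abandon Bob
    ```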