P(doom) is a Dumb Meme

Look, I’m as much of a Rationalist with a special interest in AI x-risk as anyone. But oh my god do I hate talking about “P(doom)”. When it gained widespread usage in the wake of ChatGPT, I assumed that it was floating around variously adjacent circles of faux-intellectuals, but surely everyone in my circles could see how braindead it was… right?


(This post was partially inspired by a recent conversation with Liron about Doom Debates.[1])

I guess it’s time for me to focus on a place where I’m shocked that everyone else is dropping the ball.[2]

P(doom) is Hopelessly Vague

Let’s start with the ambiguity. Does “doom” mean… extinction? A lot of people think so! I have personally encountered people who think catastrophic harms from AI are likely, but the risks of all humans dying are low. They’re like “Sure, 99.999% of humans might die from AI, but the AI will obviously want to keep thousands of humans alive for science and potential trade with aliens and stuff, so my P(doom) is approximately 0%.”

That might sound crazy. Surely you, dear reader, know exactly what “doom” means. You know, for example, which of these count as doom and which don’t:

  • A young ASI tries to use its first-mover advantage to take over the world and prevent other ASI competitors from emerging. In doing so it sparks a war against humanity where it eventually loses,[3] but it kills 10% of all humans in the process.

  • ASI empowers a single person or small group of humans to become tyrants and lock in a permanent authoritarian regime where almost all other humans are subject to the whims of the tyrant(s).

  • A state actor or terrorist group uses narrow AI to build a bioweapon (eg AlphaFold but for plagues), and that weapon gets out and kills literally everyone.[4]

  • A great power conflict starts for reasons mostly unrelated to AI, but advanced AI systems are deployed on the various battlefields, and eventually (in part due to the speed and ruthlessness of AI) it escalates to a thermonuclear war that kills 99% of humanity.

  • Humans “merge with the machines” in a way that, from the outside, looks an awful lot like all of humanity being subsumed by inhuman machinery.

  • Humans become incapable of competing with machines in almost all sectors of the economy. While the rule of law persists, the political landscape is also dominated by a variety of ASIs with various goals, and they don’t redistribute significant wealth into human hands. Only a small number of humans survive in the long-run, relegated to being a historical curiosity.

  • ASIs are developed in a broadly corrigible way, leading to extreme abundance, and the offense-defense balance means that the world is basically safe. But everyone (even the Amish) eventually stops having human kids because other things (including AI “kids”) are way more fun/​satisfying. Even though lifespans are really long, people still gradually die off, and eventually only the machines remain.

Implicit in all of this are questions of timeframe. Many people that I’ve talked to feel that gradual disempowerment probably doesn’t count as AI doom because it’s too slow, and they’re assuming a short timeframe for “P(doom)” like 5 or 10 years. Others assume that P(doom) means “this century.” (But does that mean the next 74 years or 100 years?) Does the heat death of the universe count as doom?

Even if you have a way that you like to think about the subject, are you sure that the person you’re talking with is using the same frame? If I’ve learned one thing from making lots of bets and engaging with prediction markets nearly every day for years, it’s that the devil is in the details, and that finding the right way to operationalize a bet is often the hardest part.

(Edit: To be clear, I am not saying that vague statements are always bad. There’s only so much communication bandwidth, after all. What I am against is vagueness under the guise of precision, where people don’t realize that they have different interpretations, and thus miscommunicate.)

Inside Views, Outside Views, and Likelihood Ratios

“I actually think the risk [from AGI] is more than 50% … but I don’t say that because there’s other people think it’s less, and I think a sort of plausible thing that takes into account the opinions of everybody I know is sort of 10 to 20%.”
Geoffrey Hinton, 2024

Let’s say that you’re trying to estimate the number of beans in a jar. A good way to do that is to use the wisdom of crowds. Go around to a bunch of people, ask them to guess, and then take the average of their guesses. Some people will be too high, others will be too low, but in theory their guesses will be correlated with the truth and their errors will be independent, so when you average them, the errors cancel out.

But now suppose that you go up to someone and ask them to write their guess on a piece of paper. They write down 1000. Then, you go to the Nobel Prize-winning “godfather” of bean-counting. He’s about to write 5000, but then spots the “1000” at the top of the page. “I’m probably too high,” he says to himself, and writes 3000 instead, thinking himself very clever. And indeed, he is clever! If all that mattered was the accuracy of his guess, then updating on the evidence is smart.

But if you then repeat this process and get a bunch of guesses, then average them, you’ll do way worse. You’re double-counting evidence! The errors in the first guess get compounded in the second guess, and at that point who wants to go against the expert consensus? People look around them and assume that a fire alarm can’t possibly be going off, because not enough people are acting, not realizing that most other people aren’t acting for a similar reason.
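A minimal simulation sketch of the bean-jar story, under assumptions that are not in the text above: guessing errors are Gaussian, the jar holds 3000 beans, and each "anchored" guesser blends a private guess 50/50 with the average of what is already on the page (the same split that takes the expert from 5000 to 3000). The function names and parameters are hypothetical, chosen only for illustration.

```python
import random
import statistics

def independent_guesses(truth, n, noise, rng):
    # Everyone guesses privately: the truth plus their own independent error.
    return [rng.gauss(truth, noise) for _ in range(n)]

def anchored_guesses(truth, n, noise, rng, weight=0.5):
    # Hypothetical anchoring rule: each person blends a private guess with
    # the average of the guesses already written on the page.
    guesses = []
    for _ in range(n):
        private = rng.gauss(truth, noise)
        if guesses:
            private = (1 - weight) * private + weight * statistics.fmean(guesses)
        guesses.append(private)
    return guesses

def crowd_error(make_guesses, truth=3000, n=50, noise=1500, trials=2000, seed=0):
    # Mean absolute error of the crowd average over many simulated jars.
    rng = random.Random(seed)
    errors = [
        abs(statistics.fmean(make_guesses(truth, n, noise, rng)) - truth)
        for _ in range(trials)
    ]
    return statistics.fmean(errors)

if __name__ == "__main__":
    print("independent:", round(crowd_error(independent_guesses)))
    print("anchored:   ", round(crowd_error(anchored_guesses)))
```

In this toy model each individual anchored guess is less noisy than a private one, yet the crowd average is less accurate, because the earliest guesses get counted again inside every later guess.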

These two forms of probability — gearsy inside-views and all-things-considered outside-views — can disagree wildly! When I try to tell a concrete story of the development of superintelligence going well, not fudging or assuming some mysterious breakthrough, I simply fail. In that sense, my P(doom) is 100%. That’s the number that I claim you should take from me if you’re calculating an average. But I am also a Bayesian, and thus entirely aware that 100% isn’t a valid probability. When adopting a stance of humility that keeps my ignorance in mind, my sense of being doomed goes down a lot.[5]

But both of these numbers are usually a distraction!

When I’m talking to someone about the risk from AI, I don’t really want to average my worldview with theirs. What I want to do is learn their insights. Insight can come in the form of evidence, experienced out in the world, or it can come from reasoning and studying the ideas in question. Insight gives an update in the form of a likelihood ratio.

Imagine a person named Gloomer who has a naturally sad disposition, perhaps because of his genes, or maybe he had a rotten childhood. Gloomer starts with a prior of 99:1 that humanity will go extinct in his lifetime (inside view). Again, this isn’t really based on anything. He’s just a pessimist. Now suppose that Gloomer learns of an alignment technique that has promise, according to his understanding of the science, but doesn’t solve the whole problem. Being the pessimist that he is, he suspects that the rest of the problem probably won’t get solved in time. And even if it did, something else would probably get us.

If you ask Gloomer for his P(doom), even if you’re very clear to operationalize it in the right way, defining your terms and asking for his inside view, he might say “9:1.” Yikes! That’s pretty bleak! But now imagine asking “in what ways have you updated about doom?” Gloomer might tell you about becoming less pessimistic, sharing his 11:1 evidence that things are going to be okay!
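Gloomer’s arithmetic can be checked with Bayes’ rule in odds form (posterior odds = prior odds × likelihood ratio). This is a small sketch; the helper names are hypothetical, and the 11:1 evidence for “okay” is written as a 1:11 likelihood ratio for doom.

```python
from fractions import Fraction

def update_odds(prior_odds, likelihood_ratio):
    # Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio.
    return prior_odds * likelihood_ratio

def odds_to_probability(odds):
    # Odds of a:b in favor correspond to probability a / (a + b).
    return odds / (1 + odds)

prior = Fraction(99, 1)      # Gloomer's prior: 99:1 in favor of doom
evidence = Fraction(1, 11)   # 11:1 evidence against doom, i.e. 1:11 for it
posterior = update_odds(prior, evidence)

print(posterior)                              # 9, i.e. odds of 9:1
print(float(odds_to_probability(prior)))      # 0.99
print(float(odds_to_probability(posterior)))  # 0.9
```

The bottom line only moves from 99% to 90%, which hides the fact that the update itself was a factor of eleven. The likelihood ratio is the part that someone with a different prior can actually reuse.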

In my general experience, conversations about reason and evidence (“Why do you believe what you believe?”) rather than bottom-line conclusions (“What do you believe?”) get into productive territory much faster, and devolve less often into performative bafflement around the other person’s bottom-line.


P(doom) is Fatalistic

What is the probability that you will say “Zimbabwe,” out loud to yourself, in the next 60 seconds?

Probabilistic reasoning is a (mental) tool. And like all tools, it has places where it fits nicely and helps you do work, and has places where, if you insist on applying it, you’ll make a mess. In particular, it carries a flavor of looking at things from the outside, like a detached spectator betting on the way a sports match will go. But as the saying goes: There are two ways to be right. You can be right because you’re a wizard, and have thus found the secret truths of reality, or you can be right because you’re a king, and have decided that things will be the way you say they are.

Some secret observer might do well to speculate about whether you’ll say “Zimbabwe” in terms of probability, but you should instead ask yourself “What do I want to say?”

This is especially important in the context of coordination problems and multi-party interactions. For example, if I said to my wife “I think there’s an 8% chance we’ll get divorced in the next five years,” the very act of sharing that prediction might cause that number to go up! She might reasonably interpret it as a sign that I don’t believe in the partnership. Then, she might respond with her own prediction of 14%, leading me to update in the same way, in a back-and-forth pattern that ends up in divorce. This cascade of updates is only irrational in that speaking purely in terms of high-level probabilities is the wrong tool for the job of coordination. We could dodge it by getting specific, asking to talk about likelihood ratios and what sorts of things might doom the marriage. Or we could dodge it by asking “Do you also want to be married?”, taking the other person’s “Yes” as good enough, and moving on with our lives, having jointly made that decision.

When leaders of companies or nations talk about how they don’t expect other companies or nations to slow down in the race towards the AGI cliff, consider that this expression of pessimism might be a self-fulfilling prophecy. Those words are a signal to those who are listening that the speaker doesn’t expect (and therefore doesn’t intend!) to cooperate.

Perversely, the inverse dynamic can also happen! I’ve met people who say they have a low P(doom) because “Obviously it would be extremely dangerous to charge ahead, even as superintelligence gets closer. Humanity isn’t that stupid. We’ll slow down and figure things out, and thereby avoid doom.”

But, uhh… do you realize that giving a low P(doom) is communicating the exact opposite of your worldview?! The situation resembles Murphy’s Law: if you believe that things will go fine, your carelessness may cause things to fail; if you believe that things will go wrong, you will prepare, and by preparing, things will go fine.[6] The situation isn’t paradoxical. You just have to believe that conditional on being cautious, things will turn out okay. Giving a non-conditional probability hides our power and our ability to act with caution, implying that things will either be fine (in which case why bother taking action) or they won’t (in which case why bother taking action) or they are up to chance (in which case why bother taking action).
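One way to see what the unconditional number hides is to split it with the law of total probability: P(doom) = P(cautious) · P(doom | cautious) + P(reckless) · P(doom | reckless). The sketch below uses entirely hypothetical numbers, and treating the future as a binary choice between “cautious” and “reckless” is a simplifying assumption, not a claim from the text above.

```python
def p_doom(p_cautious, p_doom_if_cautious, p_doom_if_reckless):
    # Law of total probability over two hypothetical courses of action.
    return (p_cautious * p_doom_if_cautious
            + (1 - p_cautious) * p_doom_if_reckless)

# Hypothetical numbers: both speakers agree on the conditionals...
P_DOOM_IF_CAUTIOUS = 0.05
P_DOOM_IF_RECKLESS = 0.90

# ...and differ only in how likely they think humanity is to be cautious.
optimist = p_doom(0.95, P_DOOM_IF_CAUTIOUS, P_DOOM_IF_RECKLESS)
pessimist = p_doom(0.10, P_DOOM_IF_CAUTIOUS, P_DOOM_IF_RECKLESS)

print(round(optimist, 4))   # 0.0925
print(round(pessimist, 4))  # 0.815
```

With these made-up inputs, two people who fully agree that recklessness is extremely dangerous report roughly 9% and 82%. The shared, action-relevant belief lives in the conditionals, which the single number never shows.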

Some people feel powerless. I get that. They don’t feel like they get to decide whether humanity is cautious or reckless. They don’t feel like, by vocalizing hope or fear, they are changing the world. And sometimes that’s right. It can be useful to quietly ask yourself whether cooperation is likely, or whether humanity will decide to slow down. Not everyone is in a position of power.

But our decisions are usually entangled in more ways than we appreciate. You are part of a culture, a community, and a country. It matters, in aggregate, what people in those groups say and believe and try to accomplish. This aggregate will isn’t some mysterious thing, beyond your power. It comes from you, and people like you. Every time you act, you are acting collectively.

At the level of humanity, at the very least, we are in control of Earth. If we decide to work together, we can work together. If we decide not to build machines that make us obsolete, we can (for now) just stop, and choose not to go down that road. Perhaps there are too many fools for things to work out, but assuming that to be the case (or saying things in public that imply that you think it) is itself a kind of foolishness.

Counterarguments

There are people I respect who disagree with me about P(doom) being a bad meme. Zvi Mowshowitz, for example, argued with me at LessOnline that while I’m not wrong about all the ways that it’s vague and unsophisticated, it has the virtue of being memetically catchy enough to get people thinking about existential risk at all, whereas the default is that it gets ignored. He followed up by asking whether political prediction markets (ie “P(Trump)”) are also bad.

These are strong counterpoints, and I take them seriously, but I’m ultimately convinced by neither.

On the question of memetic fitness, I agree that some people who might otherwise be oblivious, or swept up in some other memetic fad, probably spend more time thinking about the possibility of AI causing a disaster as a result of the meme. This is good. We need more people thinking about and talking about risks.

But it’s a false dichotomy to choose between P(doom) and nothing. Even before P(doom), there was the obnoxious “what are your timelines?” meme[7] and many others before it. After, we had AI 2027, the METR plot, and of course:

image.png

Each of these has pros and cons, being variously sophisticated, memetically fit, and useful for having productive conversations. I’m not really saying that any of them is necessarily better than P(doom) (even though they are), but rather that a dumb meme is still a dumb meme even if it’s better than nothing. We can, and should, do better!

One example of doing better is prediction markets, which I am broadly in favor of. Talking about probabilities and making bets seems to me to be a vastly superior way to do political forecasting than the raw punditry I grew up with. So what about “P(Trump)” and other such things? Why don’t I have the same animosity for prediction markets as I do for P(doom)?

  • Election markets aren’t vague. Most are extremely well operationalized, with clear criteria and timelines. If someone made a market that was similarly vague, and people kept talking about it ad nauseam, I would have similar complaints.

  • In my view, people who are savvy enough to be thinking about prediction markets (outside of sports betting) are usually savvy enough to understand that something like “P(Trump)” exists in the context of a marketplace, and that knowing how someone bet on that market gives only a very limited handle on their beliefs. If people started implying that people with a high P(Trump) should be labeled as “Trumpers,” I would also yell at them.

  • When people cite prediction markets (or more typically, polling) as a reason not to vote for the best candidate,[8] I think this is a dumb and bad mistake that is directly analogous to the fatalism surrounding P(doom)! Thankfully, it seems to me that most people understand that their vote matters for deciding elections, even though the market isn’t conditional on one course of action or another.

In short, P(Trump) isn’t a meme that’s rolling around the discourse, causing miscommunication and sloppy thinking. If it was, I’d probably be against it!

A Sense That More Is (Memetically) Possible

P(doom) probably isn’t going away, this side of the singularity. It’s reached critical mass, and will get parroted by midwits, if nothing else. But you don’t have to be a midwit! You can recognize that P(doom) has the flavor of a scissor statement, rather than a nuanced perspective on reality. If someone brings it up in conversation, use it as an opportunity to get into the details of what sorts of futures they’re concretely imagining, what they’re basing that perspective on, and what sorts of things can be done to steer our fate. Refuse to just give simple numbers until things have been sufficiently operationalized. Take the road of rationality, and choose your memes wisely.

  1. I gave a mini-version of this rant when I went on his show late last year. While the P(doom) emphasis rubs me the wrong way, I like Doom Debates overall.

  2. To be fair, some others, including Eliezer, have also complained about it publicly.

  3. In the hypothetical, we can imagine the ASI is the weakest/​stupidest possible AI that has a real shot at takeover. But thanks to it being right on the cusp, its success isn’t guaranteed, and we happen to get lucky.

  4. Perhaps by causing society to collapse to the point where the few stragglers in bunkers can’t recover.

  5. How much? It depends on what “doom” means, of course! 😛

  6. I picked up this idea, along with the difference between wizard correctness and kingly correctness, from CFAR, with particular thanks to Duncan Sabien. I’m not sure who first developed these ideas.

  7. “What are your timelines?” is bad for similar reasons. It’s vague about what we’re talking about. It encourages focusing on bottom-lines, rather than reason or evidence. It denies agency and our power to choose. And, in practice, it often results in point estimates, rather than distributions. Like with P(doom), there’s a sophisticated way to do forecasting, which the meme version turns into a cartoon.

  8. To be clear, I think that some amount of strategic voting is wise. In particular, as long as one is tracking their decision-theoretic reference class and the higher-order effects, I think it’s good to try to estimate how the other voters are leaning and occasionally make compromises that look like voting for the lesser evil. I am railing against the braindead “What can one vote do?” and “I live in a state where my vote doesn’t matter!” narratives.