Yudkowsky on “Don’t use p(doom)”

For a while, I kinda assumed Eliezer had basically coined the concept of p(Doom). Then I was surprised one day to hear him complaining about it being an antipattern he specifically thought was unhelpful and wished people would stop using.
He noted: “If you want to trade statements that will actually be informative about how you think things work, I’d suggest, ‘What is the minimum necessary and sufficient policy that you think would prevent extinction?’”

Complete text of the corresponding X Thread:
I spent two decades yelling at nearby people to stop trading their insane made-up “AI timelines” at parties. Just as it seemed like I’d finally gotten them to listen, people invented “p(doom)” to trade around instead. I think it fills the same psychological role.
If you want to trade statements that will actually be informative about how you think things work, I’d suggest, “What is the minimum necessary and sufficient policy that you think would prevent extinction?”
The idea of a “p(doom)” isn’t quite as facially insane as “AGI timelines” as a marker of personal identity, but
(1) you want action-conditional doom,
(2) people with the same numbers may have wildly different models,
(3) these are pretty rough log-odds and it may do violence to your own mind to force itself to express its internal intuitions in those terms which is why I don’t go around forcing my mind to think in those terms myself,
(4) most people haven’t had the elementary training in calibration and prediction markets that would be required for them to express this number meaningfully and you’re demanding them to do it anyways,
(5) the actual social role being played by this number is as some sort of weird astrological sign and that’s not going to help people think in an unpressured way about the various underlying factual questions that ought finally and at the very end to sum to a guess about how reality goes.
Orthonormal responds:
IMO “p(doom)” was a predictable outgrowth of the discussion kicked off by Death with Dignity, specifically saying you thought of success in remote log-odds. (I did find it a fair price to pay for the way the post made the discourse more serious, ironically for April Fools.)
Eliezer responds:
the post was about fighting for changes in log-odds and at no point tried to give an absolute number
(There is some further back-and-forth about this)
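(For readers who haven’t internalized the log-odds framing that Eliezer’s point (3) and Orthonormal’s reply both lean on, here is a minimal sketch using nothing but the standard definitions. The thing to notice is that the same shift in log-odds corresponds to very different shifts in probability depending on where you start, which is part of why Death with Dignity could talk about fighting for changes in log-odds without committing to an absolute number.)

```python
import math

def log_odds_bits(p: float) -> float:
    """Probability -> log-odds, measured in bits."""
    return math.log2(p / (1 - p))

def probability(bits: float) -> float:
    """Log-odds in bits -> probability."""
    return 1 / (1 + 2 ** -bits)

# One extra bit of survival odds moves a 1% chance to ~2%, a 10% chance to ~18%,
# and a 50% chance to ~67% -- the same "change in log-odds", very different
# absolute numbers.
for p in (0.01, 0.10, 0.50):
    shifted = probability(log_odds_bits(p) + 1)
    print(f"p = {p:.2f}  ->  +1 bit  ->  p = {shifted:.2f}")
```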
I kinda agree with Orthonormal (on this being a fairly natural outgrowth of Death with Dignity), although I think it’s more generally downstream of “be a culture that puts probabilities on things.”
I’m posting this partly because I had vaguely been attributing “p(Doom)” discourse to Eliezer, and it seemed like maybe other people were too, and it seemed good to correct the record for anyone else who also thought that.
I know at least a few other prominent x-risk thinkers who also think p(Doom) is kind of a bad way to compress worldviews. I’m posting this today because Neel Nanda recently tweeted about generally hating being asked for his p(Doom), noting that when an interviewer recently asked him about it, he replied:
I decline to answer that question because I am not particularly confident in my answer, and I think that the social norm of asking for off-the-cuff numbers falsely implies that they matter more than they do.
Quick Takes
Originally, I wanted to flesh out this post with some thinking about what people are trying to get out of “saying your p(Doom)”, and how best to achieve that. Spelling that out nicely turned out to be a lot of work, but it still seemed worth crossposting this mini-essay.
I guess I will include some takes without justifying them yet, and hash things out in the comments:
0. It’s nonzero useful to discuss p(Doom); the thing that is weird/bad is how much people fixate on it relative to other things.
1. It’s not useful to articulate a more precise p(Doom) than you have a credibly calibrated belief about.
i.e. most people probably should be saying things more like “idk somewhere between X and Y%?” than “Z%.”
I think it’s better to give ranges than to just use words like “probably?”, “probably not”, or “very unlikely”, because not everyone means the same thing by those words, and having common units is helpful for avoiding misunderstandings. (A rough sketch of what “calibrated” cashes out to is below.)
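Here’s a minimal sketch of what “credibly calibrated” means operationally. The track record below is made-up placeholder data, purely for illustration:

```python
from collections import defaultdict

# Made-up track record: (stated probability, whether the event actually happened).
past_predictions = [
    (0.2, False), (0.2, False), (0.2, True), (0.2, False), (0.2, False),
    (0.6, True), (0.6, False), (0.6, True), (0.6, True), (0.6, False),
    (0.9, True), (0.9, True), (0.9, True), (0.9, True), (0.9, False),
]

buckets = defaultdict(list)
for stated_p, happened in past_predictions:
    buckets[stated_p].append(happened)

# You're calibrated to the extent that things you call "60%" happen about 60% of
# the time. With only a handful of predictions per bucket you can't distinguish
# 55% from 65%, which is the sense in which a range is more honest than a point.
for stated_p, outcomes in sorted(buckets.items()):
    observed = sum(outcomes) / len(outcomes)
    print(f"stated {stated_p:.0%}: happened {observed:.0%} of the time ({len(outcomes)} predictions)")
```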
2. You should at least factor out “p(Doom) if we build superintelligence soon under present conditions using present techniques” vs “p(Doom) all-things-considered.”
“How hard is navigating superintelligence” and “how competently will humanity navigate superintelligence” seem like fairly different questions.
In particular, this helps prevent p(Doom) from being more a measure of mood-affiliation or vibes than an actual prediction. (A toy decomposition of the two questions is sketched below.)
I like Eliezer’s alternate question of “What is the minimum necessary and sufficient policy that you think would prevent extinction?” (with “nothing” being an acceptable answer), but it does seem noticeably harder to answer, and at least one of the nice things about p(Doom) is that you probably already have an implicit belief about it. (It’s also a much more opinionated framing.)
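To make the split concrete, here’s a toy decomposition; every number is a placeholder, not anyone’s actual estimate. Note that two people could share the same overall 40% while disagreeing sharply about both conditionals, which is Eliezer’s points (1) and (2) in miniature:

```python
# All numbers are placeholders for illustration, not anyone's actual estimates.
p_doom_if_rushed  = 0.70   # doom, conditional on building superintelligence soon with present techniques
p_doom_if_careful = 0.10   # doom, conditional on a substantially more cautious path
p_world_rushes    = 0.50   # chance humanity takes the rushed path at all

# The "all-things-considered" number mixes the two questions together:
p_doom_overall = (p_doom_if_rushed * p_world_rushes
                  + p_doom_if_careful * (1 - p_world_rushes))
print(f"p(doom) all-things-considered: {p_doom_overall:.0%}")  # 40%
```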
3. It’s useful to track some kind of consensus about p(Doom)-ish questions, but not for the reasons most people think.
It’s not good for leaning into tribal camps about how doomy to be.
It’s also not good for figuring out what sort of ideas or questions are acceptable to talk about.
I think it’s kinda reasonable for people who are mostly not going to think about x-risk anyway, and who don’t trust themselves or don’t want to put in the time to evaluate the arguments themselves, to defer to some vague consensus of people they trust.
(I do think it’s pretty silly to do that re: mainstream AI experts, because so many of them are clearly not paying much attention to the arguments at all. But if you don’t trust anyone in the x-risk community, idk, I don’t have a better suggestion. But you are here reading LessWrong, so probably you aren’t doing that.)
It does seem kinda useful to track what mainstream experts believe, for purposes of modeling mainstream society so you can then make predictions about interventions on mainstream society. But a) it still seems better to separate “doom-if-prosaic-AI-under-present-conditions” from “overall doom”, and b) I think it’s easy to fall into some tribal dynamics here. Please don’t.
It also seems kinda useful to track what the x-risk thinkers’ consensus is, for purposes of modeling the x-risk conversation and how to make intellectual progress on it. But again, don’t fall into the attractor of overdoing it or doing it in a tribal way, and don’t overindex on p(Doom) as opposed to all the other questions that are worth asking.
Appendix: The Rob Bensinger Files
Some alternate ways of breaking down this question:

Rob Bensinger has made some previous attempts at more nuanced things. In 2021 he sent this 2-question survey to ~117 people working on long-term AI risk:
1. How likely do you think it is that the overall value of the future will be drastically less than it could have been, as a result of humanity not doing enough technical AI safety research?
2. How likely do you think it is that the overall value of the future will be drastically less than it could have been, as a result of AI systems not doing/optimizing what the people deploying them wanted/intended?
And in 2023 he made this much more multifaceted view-snapshot chart.

...but you only get ~~five words~~ a couple questions.
So insofar as you wanted something like “a barometer of what some people think”, I think this last one is too complex to be useful except as a one-time highish-effort survey.