Stop talking about p(doom)
Epistemic status: Complete speculation, somewhat informed by copious arguing about the subject on Twitter.
As AI risk has moved into the mainstream over the past few years, I’ve come to believe that “p(doom)” is an actively harmful term for X-risk discourse, and people trying to mitigate X-risk should stop using it entirely.
Ambiguity
The first problem is that it’s unclear what is actually being discussed. “p(doom)” can refer to many different things:
p(AI kills us within 5-10 years)
p(AI kills us within 80-200 years)
p(conditional on AGI, we die shortly afterwards)
p(conditional on superintelligence, we die shortly afterwards)
Like 10 other things.[1]
These could have wildly different probabilities, and come along with different cruxes for disagreement. Depending on what specific “doom” is being discussed, the relevant point could be any of:
Whether LLMs are capable of AGI at all.
Whether AGI will quickly turn into superintelligence.
Whether aligning superintelligence will be hard.
These are completely different questions, and people who are not explicit about which one they’re discussing can end up talking past each other.
There are also many other potential miscommunications regarding exactly what “doom” refers to, the difference between one’s inside-view probability and one’s ultimate probability, and more.
Distilling complex concepts down to single terms is good, but only when everyone is on the same page about what the term actually means.
Rhetoric
People concerned about X-risk tend to avoid “dark arts” rhetorical tactics, and justifiably so. Unfortunately, current society does not allow agents acting entirely in good faith to do very well. Being fully honest about everything will turn you into a pariah, most people will judge you more on charisma than on factual accuracy, and you need to use the right tribal signals before people will listen to you on a controversial topic at all. Using at least some light greyish arts in day-to-day life is necessary in order to succeed.
“p(doom)” is an extremely ineffective rhetorical tactic.
Motivated innumeracy
One of the most common responses from the e/acc crowd to discussions of p(doom) is to say that it’s a made up, meaningless number, ungrounded in reality and therefore easily dismissed. Attempts to explain probability theory to them often end up with them denying the validity of probability theory entirely.
These sorts of motivated misunderstandings are extremely common, coming even from top physicists who suddenly lose their ability to understand high-school-level physics. Pointing out the isolated demand for rigor involved in their presumable acceptance of more pedestrian probabilistic statements doesn’t work either; 60% of the time they ignore you entirely, and the other 40% they retreat to extremely selective implementations of frequentism, where they’re coincidentally able to define a base rate for any event they have a probabilistic intuition for and reject all other base rates as too speculative.
I think the fundamental issue here is that explicit probabilities are just weird to most people, and when they’re being used to push a claim that is also weird, it’s easy to see them as linked and reject everything coming from those people.
Framing AI risk in terms of Bayesian probability seems like a strategic error. People managed to convince the world of the dangers of climate change, nuclear war, asteroid impacts, and many other not-yet-clearly-demonstrated risks, all without dying on the hill of Bayesian probability. They did of course make many probabilistic estimates, but they restricted them to academic settings and didn’t frame the public discussion largely in terms of specific numbers.
Normalizing the use of explicit probabilities is good, but AI risk and other things that people aren’t used to thinking about are precisely the worst possible context in which to try it. The combination of two different unintuitive positions will backfire and inoculate the listener against both forever.
Get people used to thinking in probabilistic terms in non-controversial situations first. Instead of “I’ll probably go to the party tonight”, “60% I’ll go to the party tonight”. If people object to this usage, it will be much easier to get them on the same page about the validity of this number in a non-adversarial context.
When discussing AI risk with a general audience, stick to traditional methods to get your concerns across. “It’s risky.” “We’re gambling with our lives on an unproven technology.” Don’t get bogged down in irrelevant philosophical debates.
Tribalism
“p(doom)” has become a shibboleth for the X-risk subculture, and an easy target of derision for anyone outside it. Those concerned about X-risk celebrate when someone in power uses their tribal signal, and those opposed to considering the risks come up with pithy derogatory terms like “doomer” that will handily win the memetic war when pitted against complicated philosophical arguments.
Many have also started using a high p(doom) as an ingroup signal and/or conversation-starter, which further corrupts truthseeking discussion about actual probabilities.
None of this is reducing risk from AI. All it does is contribute to polarization and culture war dynamics[2], and reify “doomers” as a single group that can be dismissed as soon as one of them makes a bad argument.
Ingroup signals can be a positive force when used to make people feel at home in a community, but existential risk discussion is the worst possible place to co-opt terms into such signals.
I avoid the term entirely, and suggest others do the same. When discussing AI risk with someone who understands probability, use a more specific description that defines exactly what class of potential future events you’re talking about. And when discussing it with someone who does not understand probability, speak in normal language that they won’t find off-putting.
[1] Lex Fridman apparently uses it to refer to the probability that AI kills us ever, at any point in the indefinite future. And many people use it to refer to other forms of potential X-risk, such as nuclear war and climate change.
[2] I recently saw a Twitter post that said something like “apparently if you give a p(doom) that’s too low you can now get denied access to the EA cuddle pile”. Can’t find it again unfortunately.
I don’t think you’re using the correct model of public discourse. Debates with e/acc on Twitter aren’t aimed at persuading the e/acc participants; the point is to give the public live evidence that your position is well-founded, coherent, and likely correct, and that theirs is less so.
When you publicly debate someone (outside nice rationalist circles), it’s very hard to get them to acknowledge being wrong, because that’s low-status. But people outside the debate can quietly change their minds and make it look like that was their position all along, avoiding any blow to their status.
It’s literally an invitation to irrelevant philosophical debates about how all technologies are risky and we’re still alive, and I don’t know how to get out of that without reference to probabilities and expected values.
This kinda misses the bigger picture? “Belief that there is a substantial probability of AI killing everyone” is a 1000x stronger shibboleth and a much easier target for derision.
Oh I agree the main goal is to convince onlookers, and I think the same ideas apply there. If you use language that’s easily mapped to concepts like “unearned confidence”, the onlooker is more likely to dismiss whatever you’re saying.
If that comes up, yes. But then it’s them who have brought up the fact that probability is relevant, so you’re not the one first framing it like that.
Hmm. I disagree, not sure exactly why. I think it’s something like: people focus on short phrases and commonly-used terms more than they focus on ideas. Like how the SSC post I linked gives the example of Republicans being just fine with drug legalization as long as it’s framed in right-wing terms. Or how talking positively about eugenics will get you hated, but talking positively about embryo selection and laws against incest will be taken seriously. I suspect that most people don’t actually take positions on ideas at all; they take positions on specific tribal signals that happen to be associated with ideas.
Consider all the people who reject the label of “effective altruist”, but try to donate to effective charity anyway. That seems like a good thing to me; some people don’t want to be associated with the tribe for some political reason, and if they’re still trying to make the world a better place, great! We want something similar to be the case with AI risk; people may reject the labels of “doomer” or “rationalist”, but still think AI is risky, and using more complicated and varied phrases to describe that outcome will make people more open to it.
I am one of those people; I don’t consider myself EA due to its strong association with atheism, but am nonetheless very much in favor of slowing down AGI before it kills us all.
Existential risk comprises both extinction and disempowerment without extinction. Existential risk is often used interchangeably with doom. So a salient sense of P(doom) is P(disempowerment).
This enables false disagreement through equivocation that masks remaining true disagreements:
Alice: My P(doom) is 90%! [thinking that disempowerment is very likely, but not particularly being concerned that AIs killeveryone]
Bob: But why would AIs killeveryone??? [also thinking that disempowerment is very likely, and not particularly being concerned that AIs killeveryone]
Alice: What’s your P(doom) then?
Bob: It’s about 5%! Building AI is worth the risk. [thinking that disempowerment is 95% likely, but accepting it as the price of doing business]
Alice: … [thinking that Bob believes that disempowerment is only 5% likely]
Why? The second isn’t what “existential” means.
I think the term “existential risk” comes from here, where it is defined as:
(I think on a plain english reading “existential risk” doesn’t have any clear precise meaning. I would intuitively have included e.g. social collapse, but probably wouldn’t have included an outcome where humanity can never expand beyond the solar system, but I think Bostrom’s definition is also consistent with the vague plain meaning.)
In general I don’t think using “existential risk” with this precise meaning is very helpful in broader discourse and will tend to confuse more than it clarifies. It’s also a very gnarly concept. In most cases it seems better to talk directly about human extinction, AI takeover, or whatever other concrete negative outcome is on the table.
Note that “existential” is a term of art distinct from “extinction”.
The Precipice cites Bostrom and defines it thus:
“An existential catastrophe is the destruction of humanity’s longterm potential.
An existential risk is a risk that threatens the destruction of humanity’s longterm potential.”
Disempowerment is generally considered an existential risk in the literature.
… Did X-risk from non-AI sources just vanish somehow, such that it’s mentioned nowhere in anything anyone’s saying on this “stop talking about it” post that supposedly lists many possibilities? Should I not be including the probability of existential catastrophe / survivable apocalypse / extinction event from non-AI sources in any p(doom) number I come up with? Where do they go instead, if so?
Great point! I focused on AI risk since that’s what most people I’m familiar with are talking about right now, but there are indeed other risks, and that’s yet another potential source of miscommunication. One person could report a high p(doom) due to their concerns about bioterrorism, and another could interpret that as concern about AI.
Most people putting lots of effort into calculating AI-X-Risk seem to subsume the other kinds of risk (call them human x-risk) into AI, believing that AI will either solve them or accelerate them, but either way be pivotal such that AI is all that matters.
Personally, I tend to think we were already on the path to disaster (perhaps not permanent, but maybe; at least a few thousand years of civilizational regression), and AI is not likely to either solve it or go so far awry as to be the proximal cause of doom. It WILL be an accelerator, just as it’s turning into an accelerator for everything. But the seeds were planted long ago, and purely human decisions are at fault.
So if I had a model, like (toy numbers follow):
0.5% chance nature directly causes doom
5% chance AI directly causes an avoidable doom
15% chance AI directly solves an avoidable (human or natural) doom
50% chance humans cause doom with or without AI
99% chance AI accelerates both problems and solutions for doom
… what single number does that actually subsume into? (I could most likely calculate at least one literal answer to this, but you can see the communication problems from the example, I hope.)
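For concreteness, here’s a minimal sketch of one such literal answer, using the toy numbers above and my own made-up assumptions (in particular, treating the sources as independent, wiring the 15% “AI solves it” figure in one arbitrary way, and ignoring acceleration entirely):
```python
# Toy illustration only: one of many ways to collapse the numbers above into a
# single p(doom). Every structural choice here (independence, how "avoidable"
# and the rescue chance interact, ignoring acceleration) is a made-up assumption.

p_nature   = 0.005  # nature directly causes doom
p_ai_doom  = 0.05   # AI directly causes an avoidable doom
p_ai_saves = 0.15   # AI directly solves an avoidable (human or natural) doom
p_human    = 0.50   # humans cause doom with or without AI

# Assumption A: treat the doom sources as independent, ignore the rescue term.
p_doom_a = 1 - (1 - p_nature) * (1 - p_ai_doom) * (1 - p_human)

# Assumption B: additionally let AI avert that fraction of human/natural dooms.
p_doom_b = 1 - (1 - p_nature) * (1 - p_ai_doom) * (1 - p_human * (1 - p_ai_saves))

print(f"p(doom) under A: {p_doom_a:.2f}")  # ~0.53
print(f"p(doom) under B: {p_doom_b:.2f}")  # ~0.46
```
Which single number falls out depends entirely on those structural choices, which is exactly the communication problem I mean.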
I think one of the points of this post is that you shouldn’t have or communicate one single number. These are different things, and you at the very least need to quantify “avoidable”, and figure out the correlations between them (like “human-caused degradation would reduce the world population by 90% if AI didn’t extend our ability to cope by 30 years, but then another part of fragility of human expectations makes civilization collapse anyway”).
At some point (which we’re well past in most discussions around here), it becomes too complex, and FAR too dependent on assumptions with very large error bars (and very large conditionals on surprising levels of human coordination), for there to be any way to predict the outcomes. About the best you can do is very large buckets of “somewhat likely”, “rather unlikely”, and “not gonna worry about it”, with another dimension of “how much, if any, will my actions change things”, also focused on paths wide enough that you’re not basing it on insane multiplication of very small/large made-up numbers.
‘Avoidable’ in the above toy numbers means purely that:
1 - avoidable doom directly caused by AI is in fact avoided if we destroy all (relevantly capable?) AI when testing for doom.
2 - avoidable doom directly caused by humans or nature is in fact avoided by AI technology we possess when testing for doom.
Still not sure I follow. “testing for doom” is done by experiencing the doom or non-doom-yet future at some point in time, right? And we can’t test under conditions that don’t actually obtain. Or do you have some other test that works on counterfactual (or future-unknown-maybe-factual) worlds?
Yeah, the test is just if doom is experienced (and I have no counterfactual world testing, useful as that would be).
Agreed, p(doom) is a useless metric for advancing our understanding or our policy decisions. It has become a social phenomenon where people pressure others to calculate a value and then assert it can be used as an accurate measurement. It is absurd. We should stop accepting mere guesses and define criteria that are measurable. FYI, my own recent rant on the topic …
https://www.mindprison.cc/p/pdoom-the-useless-ai-predictor
I don’t think you understand how probability works.
https://outsidetheasylum.blog/understanding-subjective-probabilities/
I’m still unconvinced that p(doom) numbers aren’t largely pulled out of thin air. It’s at best a Fermi estimate, so we can safely assume the real p(doom) numbers are within an order of magnitude of the estimate. That doesn’t bode well for most estimates, as it means basically any number implies a 0-100% chance of doom.
Probability is a geometric scale (think odds), not an additive one. An order of magnitude centered on 10% covers roughly 1% to 50%.
https://www.lesswrong.com/posts/QGkYCwyC7wTDyt3yT/0-and-1-are-not-probabilities
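To make the “geometric scale” point concrete, here’s a minimal sketch (my own framing, in the odds form the linked post uses) of what an order of magnitude around 10% looks like:
```python
# Order-of-magnitude uncertainty on a geometric (odds) scale, rather than
# naively dividing/multiplying the probability itself by 10.
def odds(p: float) -> float:
    return p / (1 - p)

def prob(o: float) -> float:
    return o / (1 + o)

center = 0.10
low  = prob(odds(center) / 10)  # one order of magnitude less likely, in odds
high = prob(odds(center) * 10)  # one order of magnitude more likely, in odds

print(f"~{low:.1%} to ~{high:.1%}")  # ~1.1% to ~52.6%
```
So a 10% estimate trusted only to within an order of magnitude still rules out a lot; it is neither ~0% nor ~100%.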
Exactly, not very informative. For pretty much any p(doom) the range is anywhere from non-existent to very likely. When someone gives a Fermi estimate of a p(doom) between 0% and 100% they may as well be saying the p(doom) is between ~0% and ~100%. Divide any number by 10 and multiply by 10 to see this.
That contains a number of mistakes, but doesn’t prove anything about e/accs.
Feel free to elaborate on the mistakes and I’ll fix them.
That article isn’t about e/acc people and doesn’t mention them anywhere, so I’m not sure why you think it’s intended to be. The probability theory denial I’m referencing is mostly on Twitter.