
Epistemic status: Complete speculation, somewhat informed by copious arguing about the subject on Twitter.

As AI risk has moved into the mainstream over the past few years, I've come to believe that "p(doom)" is an actively harmful term for X-risk discourse, and people trying to mitigate X-risk should stop using it entirely.

Ambiguity

The first problem is that it's unclear what is actually being discussed. "p(doom)" can refer to many different things:

  • p(AI kills us within 5-10 years)
  • p(AI kills us within 80-200 years)
  • p(conditional on AGI, we die shortly afterwards)
  • p(conditional on superintelligence, we die shortly afterwards)
  • Like 10 other things.[1]

These could have wildly different probabilities, and come along with different cruxes for disagreement. Depending on what specific "doom" is being discussed, the relevant point could be any of:

  • Whether LLMs are capable of AGI at all.
  • Whether AGI will quickly turn into superintelligence.
  • Whether aligning superintelligence will be hard.

These are completely different questions, and people who are not explicit about which one they're discussing can end up talking past each other.

There are also many other potential miscommunications regarding exactly what "doom" refers to, the difference between one's inside view probability vs. ultimate probability, and more.

Distilling complex concepts down to single terms is good, but only when everyone is on the same page about what the term actually means.

Rhetoric

People concerned about X-risk tend to avoid "dark arts" rhetorical tactics, and justifiably so. Unfortunately, current society does not allow completely good-faith agents to do very well. Being fully honest about everything will turn you into a pariah, most people will judge you more on charisma than on factual accuracy, and you need to use the right tribal signals before people will listen to you on a controversial topic at all. Using at least some light greyish arts in day-to-day life is necessary in order to succeed.

"p(doom)" is an extremely ineffective rhetorical tactic.

Motivated innumeracy

One of the most common responses from the e/acc crowd to discussions of p(doom) is to say that it's a made up, meaningless number, ungrounded in reality and therefore easily dismissed. Attempts to explain probability theory to them often end up with them denying the validity of probability theory entirely.

These sorts of motivated misunderstandings are extremely common, coming even from top physicists who suddenly lose their ability to understand high school level physics. Pointing out the isolated demand for rigor involved in their presumable acceptance of more pedestrian probabilistic statements also doesn't work; 60% of the time they ignore you entirely, and the other 40% they retreat to extremely selective implementations of frequentism where they're coincidentally able to define a base rate for any event that they have a probabilistic intuition for, and reject all other base rates as too speculative.

I think the fundamental issue here is that explicit probabilities are just weird to most people, and when they're being used to push a claim that is also weird, it's easy to see them as linked and reject everything coming from those people.

Framing AI risk in terms of Bayesian probability seems like a strategic error. People managed to convince the world of the dangers of climate change, nuclear war, asteroid impacts, and many other not-yet-clearly-demonstrated risks, all without dying on the hill of Bayesian probability. They did of course make many probabilistic estimates, but restricted them to academic settings, and didn't frame the discussion largely in terms of specific numbers.

Normalizing the use of explicit probabilities is good, but AI risk and other things that people aren't used to thinking about are precisely the worst possible context in which to try it. The combination of two different unintuitive positions will backfire and inoculate the listener against both forever.

Get people used to thinking in probabilistic terms in non-controversial situations first. Instead of "I'll probably go to the party tonight", "60% I'll go to the party tonight". If people object to this usage, it will be much easier to get them on the same page about the validity of this number in a non-adversarial context.

When discussing AI risk with a general audience, stick to traditional methods to get your concerns across. "It's risky." "We're gambling with our lives on an unproven technology." Don't get bogged down in irrelevant philosophical debates.

Tribalism

"p(doom)" has become a shibboleth for the X-risk subculture, and an easy target of derision for anyone outside it. Those concerned about X-risk celebrate when someone in power uses their tribal signal, and those opposed to considering the risks come up with pithy derogatory terms like "doomer" that will handily win the memetic war when pitted against complicated philosophical arguments.

Many have also started using a high p(doom) as an ingroup signal and/or conversation-starter, which further corrupts truthseeking discussion about actual probabilities.

None of this is reducing risk from AI. All it does is contribute to polarization and culture war dynamics[2], and reify "doomers" as a single group that can be dismissed as soon as one of them makes a bad argument.

Ingroup signals can be a positive force when used to make people feel at home in a community, but co-opting existential risk discussions in order to turn their terms into such signals is the worst possible place to do this.

 

I avoid the term entirely, and suggest others do the same. When discussing AI risk with someone who understands probability, use a more specific description that defines exactly what class of potential future events you're talking about. And when discussing it with someone who does not understand probability, speak in normal language that they won't find off-putting.

  1. ^

    Lex Fridman apparently uses it to refer to the probability that AI kills us ever, at any point in the indefinite future. And many people use it to refer to other forms of potential X-risk, such as nuclear war and climate change.

  2. ^

    I recently saw a Twitter post that said something like "apparently if you give a p(doom) that's too low you can now get denied access to the EA cuddle pile". Can't find it again unfortunately.


Comments (12)

Nice points, Isaac!

I would personally go a little further. I think the concept of existential risk is sufficiently vague for it to be better to mostly focus on clearer metrics (e.g. a suffering-free collapse of all value would be maximally good for negative utilitarians, but would be an existential risk for most people). For example, extinction risk, probability of a given drop in global population / GDP / democracy index, or probability of global population / GDP / democracy index remaining smaller than the previous maximum for a certain time.

There's a new chart template that is better than "P(doom)" for most people.

Counterpoint: many people dismissed longtermism as a kind of mathematical blackmail because even minuscule-probability events could justify devoting infinite resources to them. The biggest change in moving from longtermism to "holy shit x risk" was emphasizing that the probability is not minuscule.

But I agree with your first point, and think that "p(doom)" should be expanded into "p(doom | agi)" and "p(agi)".
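To make that decomposition concrete, here is a minimal sketch using the law of total probability. The function name and every number in it are made up purely for illustration (nobody's actual estimates), and it assumes by default that there is no AI doom in worlds without AGI:

```python
import math


def p_doom(p_agi: float, p_doom_given_agi: float, p_doom_given_no_agi: float = 0.0) -> float:
    """Combine the pieces with the law of total probability:
    p(doom) = p(doom | agi) * p(agi) + p(doom | no agi) * p(no agi).
    """
    for p in (p_agi, p_doom_given_agi, p_doom_given_no_agi):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"not a probability: {p}")
    return p_doom_given_agi * p_agi + p_doom_given_no_agi * (1.0 - p_agi)


# Two hypothetical people with opposite views (illustrative numbers only).
# "Alice" thinks AGI is very likely but probably goes fine.
alice = p_doom(p_agi=0.9, p_doom_given_agi=0.1)
# "Bob" thinks AGI is unlikely but would almost certainly be fatal.
bob = p_doom(p_agi=0.1, p_doom_given_agi=0.9)

# Both land on roughly the same headline number despite disagreeing on every crux.
same_headline = math.isclose(alice, bob)
```

Under these assumptions both people would report a "p(doom)" of about 9%, even though one disagrees with the other about whether AGI arrives and about whether it can be aligned. Reporting the two factors separately shows where the disagreement actually is.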

Or to put more bluntly, the p(doom) estimates tended to rise after the "10^35 future humans so even if p(doom) is really low..." arguments were widely dismissed.

Obviously other stuff happened in the world of AI, and AGI researchers are justified in arguing they simply updated their priors in light of the emergent behaviour exhibited by LLMs[1] (others obviously always had high and near-term expectations of doom anyway). But Bayes' Theorem also justifies sceptics updating their p(blackmail) estimates.[2]

Priors for regular events like sports, market movements and insurable events are easily converted into money so bold predictions are easily put to the test whether they claim to be based on a robust frequentist model of similar events or inside information or pure powers of observation. But doom in most of the outlined scenarios is a one-off and the incentive structure actually works the opposite way round: people arguing we're seriously underestimating p(doom) don't expect to be around if they're right and are asking for resources now to reduce it. I don't think it's an isolated demand for rigour to suggest that a probabilistic claim of this nature bears very little resemblance to a probabilistic claim made where there's some evidence of some base rate and strong incentive not to be overconfident.

So yeah, I agree, p(doom) isn't persuasive and I'm not sure decomposing it into p(doom | agi) and p(agi) or equivalents for other x-risk fields puts it on a stronger footing. Understanding how researchers believe a development increases or reduces a source of x-risk is a much more convincing argument about its value than incrementing or decrementing an arbitrary-seeming doom number. The "doomsday clock" was an effective rhetorical tool because everyone understood it as asking politicians to reverse course, not because it was accepted as a valid representation of an underlying probability distribution.

[1] Though they could also have been justified in updating the other way; [notionally] safety-conscious organisations getting commercially valuable, near-human level outputs from a text transformation matrix arguably gives less reason to believe anyone would deem giving machines agency worth the effort.

[2] Also in either direction.

Oh, I agree. Arguments of the form "bad things are theoretically possible, therefore we should worry" are bad and shouldn't be used. But "bad things are likely" is fine, and seems more likely to reach an average person than "bad things are 50% likely".

Toby Ord's existential risk estimates in The Precipice were for risk this century (by 2100) IIRC. That book was very influential in x-risk circles around the time it came out, so I have a vague sense that people were accepting his framing and giving their own numbers, though I'm not sure quite how common that was. But these days most people talking about p(doom) probably haven't read The Precipice, given how mainstream that phrase has become.

Also, in some classic hard-takeoff + decisive-strategic-advantage scenarios, p(doom) in the few years after AGI would be close to p(doom) in general, so these distinctions don't matter that much. But nowadays I think people are worried about a much greater diversity of threat models.

Yeah, most of the p(doom) discussions I see taking place seem to be focusing on the nearer term of 10 years or less. I believe there are quite a few people (e.g. Gary Marcus, maybe?) who operate under a framework like "current LLMs will not get to AGI, but actual AGI will probably be hard to align", so they may give a high p(doom before 2100) and a low p(doom before 2030).

Isaac -- good, persuasive post. 

I agree that p(doom) is rhetorically ineffective -- to normal people, it just looks weird, off-putting, pretentious, and depressing. Most folks out there have never taken a probability and statistics course, and don't know what p(X) means in general, much less p(doom). 

I also agree that p(doom) is way too ambiguous, in all the ways you mentioned, plus another crucial way: it isn't conditioned on anything we actually do about AI risk. Our p(doom) given an effective global AI regulation regime might be a lot lower than p(doom) if we do nothing. And the fact that p(doom) isn't conditioned on our response to p(doom) creates a sense of fatalistic futility, as if p(doom) is a quantitative fact of nature, like the Planck constant or the Coulomb constant, rather than a variable that reflects our collective response to AI risks, and that could go up or down quite dramatically given human behavior.

thanks for the writeup! I had a ton of similar feelings for a while, mixing between finding people who say "it's not worth defending it's just a meme" and "actually I'll defend using something like this". 

At one point I was discussing this issue with Rob Miles at Manifest, who told me something like "the default is a bool (some two-valued variable)", the idea being that if people are arguing over an interval then we could've done way worse. 

Executive summary: The term "p(doom)" is ambiguous, rhetorically ineffective for communicating AI risks, and has become a polarizing ingroup signal that impedes thoughtful discussion on mitigating existential threats from advanced AI.

Key points:

  1. "p(doom)" conflates multiple distinct probabilities like short-term AI catastrophe vs long-term, conditional on AGI vs conditional on superintelligence. This ambiguity fosters miscommunication.
  2. Explicit probabilities meet motivated skepticism and innumeracy. Framing AI risk discussion around numbers backfires rhetorically.
  3. "p(doom)" has become an ingroup shibboleth that outsiders easily ridicule. This entrenches polarization around AI risk.
  4. People should stop using this term and instead discuss specific risks and probabilities when warranted, but focus rhetoric on normal language.

 

This comment was auto-generated by the EA Forum Team. Feel free to point out issues with this summary by replying to the comment, and contact us if you have feedback.

Very well put, thanks. 
I feel that starting with "epistemic status" has [some!] similar aspects to p(doom). It's a lot of fun for us but beginning an argument in real life with "Epistemic Status" loses in a split second. 

Yeah, I don't do it on any non-LW/EAF post.
