Reducing the probability of human extinction is a highly popular cause area among longtermist EAs. Unfortunately, this sometimes seems to go as far as conflating longtermism with this specific cause, which can contribute to the neglect of other causes.[1] Here, I will evaluate Brauner and Grosse-Holz’s argument for the positive expected value (EV) of extinction risk reduction from a longtermist perspective. Upon consideration of counterarguments to Brauner and Grosse-Holz’s ethical premises and their predictions about the nature of future civilizations, I argue that the EV of extinction risk reduction is not robustly positive,[2] and that other longtermist interventions, such as s-risk reduction and trajectory changes, are therefore more promising. I abbreviate “extinction risk reduction” as “ERR,” and the thesis that ERR is positive in expectation as “+ERR.”

Brauner and Grosse-Holz support their conclusion with arguments that I would summarize as follows: If humans do not go extinct, our descendants (“posthumans”) will tend to have values aligned with the considered preferences of current humans in expectation, in particular values that promote the welfare of sentient beings, as well as the technological capacity to optimize the future for those values. Our moral views upon reflection, and most welfarist views, would thus consider a future populated by posthumans better than one in which we go extinct. Even assuming a suffering-focused value system, the expected value of extinction risk reduction is increased by (1) the possibility of space colonization by beings other than posthumans, which would be significantly worse than posthuman colonization, and (2) posthumans’ reduction of existing disvalue in the cosmos. Also, in practice, interventions to reduce extinction risk tend to decrease other forms of global catastrophic risk and promote broadly positive social norms, such as global coordination, increasing their expected value.

In response, first, I provide a defense of types of welfarist moral views—in particular, suffering-focused axiologies—for which a future with posthuman space civilization is much less valuable, even if posthumans’ preferences are aligned with those of most current humans. Although +ERR appears likely conditional on value-monist[3] utilitarian views that do not put much more moral weight on suffering than happiness,[4] views on which sufficiently i...

Comments (10)



Interesting! Thank you for writing this up. :)

It does seem plausible that, by evolutionary forces, biological nonhumans would care about the proliferation of sentient life about as much as humans do, with all the risks of great suffering that entails.

What about the grabby aliens, more specifically? Do they not, in expectation, care about proliferation (even) more than humans do?

All else being equal, it seems -- at least to me -- that civilizations with very strong pro-life values (i.e., that think perpetuating life is good and necessary, regardless of its quality) colonize, in expectation, more space than compassionate civilizations willing to do the same only under certain conditions regarding others' subjective experiences.

Then, unless we believe that the emergence of dominant pro-life values in any random civilization is significantly unlikely in the first place (I see a priori more reasons to assume the exact opposite), shouldn't we assume that space is mainly being colonized by "life-maximizing aliens" who care about nothing but perpetuating life (including sentient life)  as much as possible?

Since I've never read such an argument anywhere else (and am far from being an expert in this field), I guess that it has a problem that I don't see.

EDIT: Just to be clear, I'm just trying to understand what the grabby aliens are doing, not to come to any conclusion about what we should do vis-à-vis the possibility of human-driven space colonization. :) 

That sounds reasonable to me, and I'm also surprised I haven't seen that argument elsewhere. The most plausible counterarguments off the top of my head are: 1) Maybe evolution just can't produce beings with that strong of a proximal objective of life-maximization, so the emergence of values that aren't proximally about life-maximization (as with humans) is convergent. 2) Singletons about non-life-maximizing values are also convergent, perhaps because intelligence produces optimization power so it's easier for such values to gain sway even though they aren't life-maximizing. 3) Even if your conclusion is correct, this might not speak in favor of human space colonization anyway for the reason Michael St. Jules mentions in another comment, that more suffering would result from fighting those aliens.

I completely agree with 3 and it's indeed worth clarifying. Even ignoring this, the possibility of humans being more compassionate than pro-life grabby aliens might actually be an argument against human-driven space colonization, since compassion -- especially when combined with scope sensitivity -- might increase agential s-risks related to potential catastrophic cooperation failure between AIs (see e.g., Baumann and Harris 2021, 46:24), which are the most worrying s-risks according to Jesse Clifton's preface of CLR's agenda. A space filled with life-maximizing aliens who don't give a crap about welfare might be better than one filled with compassionate humans who create AGIs that might do the exact opposite of what they want (because of escalating conflicts and stuff). Obviously, uncertainty remains huge here.

Besides, 1 and 2 seem to be good counter-considerations, thanks! :)

I'm not sure I get why "Singletons about non-life-maximizing values are also convergent", though. Can you -- or anyone else reading this -- point to any reference that would help me understand this?

I'm not sure I get why "Singletons about non-life-maximizing values are also convergent", though.

Sorry, I wrote that point lazily because that whole list was supposed to be rather speculative. It should be "Singletons about non-life-maximizing values could also be convergent." I think that if some technologically advanced species doesn't go extinct, the same sorts of forces that allow some human institutions to persist for millennia (religions are the best example, I guess) combined with goal-preserving AIs would make the emergence of a singleton fairly likely - not very confident in this, though, and I think #2 is the weakest argument. Bostrom's "The Future of Human Evolution" touches on similar points.

Thank you for the great post! I think my post might be relevant to 2.1.1. Animals [1.1]. 

(my post discusses factory-farmed animals in the long-term future, but that doesn't mean I see them as the only source of animal suffering in the long term)

Thanks for the kind feedback. :) I appreciated your post as well—I worry that many longtermists are too complacent about the inevitability of the end of animal farming (or its analogues for digital minds).

Each of the five mutually inconsistent principles in the Third Impossibility Theorem of Arrhenius (2000) is, in isolation, very hard to deny.


This post/paper points out that lexical total utilitarianism already satisfies all of Arrhenius's principles in his impossibility theorems (there are other background assumptions):

However, it’s recently been pointed out that each of Arrhenius’s theorems depends on a dubious assumption: Finite Fine-Grainedness. This assumption states, roughly, that you can get from a very positive welfare level to a very negative welfare level via a finite number of slight decreases in welfare. Lexical population axiologies deny Finite Fine-Grainedness, and so can satisfy all of Arrhenius’s plausible adequacy conditions. These lexical views have other advantages as well. They cohere nicely with most people’s intuitions in cases like Haydn and the Oyster, and they offer a neat way of avoiding the Repugnant Conclusion.


Also, for what it's worth, the conditions in these theorems often require a kind of uniformity that may only be intuitive if you're already assuming separability/additivity/totalism in the first place, e.g. (a) there exists some subpopulation A that satisfies a given condition for any possible disjoint unaffected common subpopulation C (i.e. the subpopulation C exists in both worlds, and the welfares in C are the same across the two worlds), rather than (b) for each possible disjoint unaffected common subpopulation C, there exists a subpopulation A that satisfies the condition (possibly a different A for a different C). The definition of separability is just that a disjoint unaffected common subpopulation C doesn't make a difference to any comparisons.
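Schematically, the contrast between the uniform and non-uniform readings is just the order of the quantifiers (here Cond is my placeholder for whatever condition the principle imposes on A given C, not notation from Arrhenius):

```latex
% (a) uniform version: a single subpopulation A works for every
%     disjoint unaffected common subpopulation C
\exists A\ \forall C:\ \mathrm{Cond}(A, C)

% (b) weaker, non-uniform version: A is allowed to vary with C
\forall C\ \exists A:\ \mathrm{Cond}(A, C)
```

Since (a) implies (b) but not conversely, accepting only the non-uniform versions is strictly weaker, which is why this move is available to those who reject separability.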

So, if you reject separability/additivity/totalism or are at least sympathetic to the possibility that it's wrong, then it is feasible to deny the uniformity requirements in the principles and accept weaker non-uniform versions instead. Of course, rejecting separability/additivity/totalism has other costs, though.

I might have missed it in your post, but descendants of humans encountering a grabby alien civilization is itself an (agential) s-risk. If they are optimizing for spread and unaligned ethically with us, then we will be in the way, and they will have no moral qualms with using morally atrocious tactics, including spreading torture on an astronomical scale to threaten our values to get access to more space and resources, or we may be at war with them. If our descendants are also motivated to expand, and we encounter grabby aliens, how long would conflict between us go on for?

Perfection Dominance Principle. Any world A in which no sentient beings experience disvalue, and all sentient beings experience arbitrarily great value, is no worse than any world B containing arbitrarily many sentient beings experiencing only arbitrarily great disvalue (possibly among other beings).[15]

I'm confused by the use of quantifiers here. Which of the following is what's intended?

  1. If A has only beings experiencing positive value and B has beings experiencing disvalue, then A is no worse than B? (I'm guessing not; that's basically just the procreation asymmetry.)
  2. For some level of value v, some level of disvalue d, and some positive integer N, if A has only beings experiencing value at least v, and B has at least N beings experiencing disvalue d or worse (and possibly other beings), then A is no worse than B.
  3. Something else similar to 2? Can v and/or d depend on A?
  4. Something else entirely?

What I mean is closest to #1, except that B has some beings who only experience disvalue and that disvalue is arbitrarily large. Their lives are pure suffering. This is in a sense weaker than the procreation asymmetry, because someone could agree with the PDP but still think it's okay to create beings whose lives have a lot of disvalue as long as their lives also have a greater amount of value. Does that clarify? Maybe I should add rectangle diagrams. :)
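In symbols, writing v(x) and d(x) for the value and disvalue a being x experiences (my notation here, just for illustration), the intended reading is roughly:

```latex
% A is no worse than B (i.e., not strictly worse) whenever every being
% in A experiences only value and B contains beings who experience only
% disvalue -- with both the value and the disvalue arbitrarily great.
\forall A,\, B:\;
  \bigl(\forall x \in A:\ v(x) > 0 \,\wedge\, d(x) = 0\bigr)
  \,\wedge\,
  \bigl(\exists x \in B:\ v(x) = 0 \,\wedge\, d(x) > 0\bigr)
  \;\Longrightarrow\; A \not\prec B
```

Note the conclusion is only "A is not worse than B," not "A is better than B," which is why this is weaker than the procreation asymmetry.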
