Why are longtermists so much less focused on human extinction these days?

In most worlds where AI disempowers humanity, the human species continues

Do you mean most likely worlds? The difference seems incredibly important - there are, in my view, quite compelling arguments that the most likely outcome given disempowerment is human extinction, but of course I can imagine worlds in which that doesn't happen.

Linch

I mostly agree with this though I think there's more extremization[1]: https://www.lesswrong.com/posts/4fqwBmmqi2ZGn9o7j/notes-on-fatalities-from-ai-takeover

[1] Anything like a 35% death rate seems implausible to me if I think through the mechanics of a takeover; both <5% and >95% seem more plausible to me, including in very violent takeovers.

William_MacAskill

I mean, conditional on human disempowerment, >50% that the human species continues till after 2100. Maybe I'm at 80% or more on this.

Jari

"So, the probability of "an event occurs by 2100 that prevents Earth-originating life from ever spreading to the stars" is really low, I'd say <1%."

Does this take into account mirror life before AGI is built?
Also there could be technologies invented in the next 75 years that make destroying the world much easier

Niel_Bowerman

In most worlds where AI disempowers humanity, the human species continues.

@William_MacAskill can you say more about this claim?

William_MacAskill

This is probably the best thing written on expected fatalities.

But the main point is:
- The resources needed to sustain the human species are tiny compared even to the resources just in the solar system (1 part in 10 trillion for all of current civilisation, and the human species could be sustained with a tiny fraction of that)
- If misaligned ASI wants power, it doesn't need to kill everybody in order to do so (and deliberately killing everybody would actively be wasteful).
- So in order to keep some humans around, it only needs to be the case that a tiny fraction of AIs care a tiny amount about keeping some humans around. Could be for intrinsic concern, nostalgia, fulfilling commitments they made (in order to get some humans on-side), acausal reasons (trade with human-like creatures elsewhere in the universe or multiverse), reasoning with potential human simulators, or instrumental reasons (they want to do experiments on humans for science). But the main point is just any tiny motivation is enough. Yes, we're atoms that could be used for something else, but we're really not many atoms at all.

(I also think most human disempowerment scenarios are ones where the humans in general feel pretty fine with it, but I think the above holds even putting that to the side.)

Lucas Tucker

AI safety is about who we want to guide the future, not about whether there's a long-term future or not.

Maybe this is obvious (I'm pretty new here) but if that's the case then is AI safety more about empowering the right people than it is about aligning any individual technology? When (if at all) do these priorities flip?

Charlie_Guthmann

There’s reasonably little written about why longtermists should change their prioritisation in this direction. The notable exception is this paper by Will MacAskill.

Thanks for the links!

Bella

Interesting post, thanks!! Excited to see discussion of it.

One thing I wonder about is whether the sort of "mixed" pictures are relevant? Like, maybe you think totalitarian control of AI would be really bad because it increases the risk of human extinction from misalignment, by increasing the success rate of any attempted AI takeover / making it happen sooner in time (bc a totalitarian might e.g. give more control over to an AI they were using to stay in power).

Or, more vaguely, I sorta feel like if world governments / anyone with ambitions to power think that advanced AI will help them, that's bad news for alignment efforts.

Anyway, that could be something going on for some people I think?

Jeanne Marie Jacqueline (JMJ/Evana)

Good point, thanks!

Hi Bella, yes it is a fascinating read, and thank you for your moderate comment. Although, my first thought was slightly different, I went slightly further than "one could become the consequence of the other"...

Wouldn't extinction, in itself, be an event where suffering would be short and then end? Rather than, for instance, enduring extreme concentration of power (building loyalty, distrusting people around oneself, etc.) or living under a superior intelligence (emulating the harmful dynamics humans had with animals).

I could see it as a hierarchy of anxieties

Tobias Häberli

My sense is that there’s been a significant shift in how much longtermists prioritise non-extinction risks over the last few years. A decade ago people who were trying to ensure the flourishing of the universe trillions of years from now were very often focused on avoiding events that would kill all humans.

I agree that this shift has happened over the past 2 years. But I think 2 to 3 years ago EAs were unusually focused on extinction compared to a decade ago. I remember more discussions back then around positive visions for the longterm.

For what it’s worth, we just announced our first Frontier Biodefense Fellowship at Pivotal, which is more singularly focused on avoiding extinction than most projects within AI safety (including our AI safety fellowship). Obviously the team has a range of motivations to work on Biodefense, but for me weak longtermist arguments are quite central.

Benevolent_Rain

I am really happy you are focusing squarely on existential risk from bio. I think there is a tendency in EA-adjacent biosec work to lose a bit of focus on how extremely bad such scenarios are. I also think it is great you raised this Michelle - I also feel like not enough EAs have contemplated the importance of 2 further assumptions needed to work on longtermism:

1 - Massive increase in value in the future (re: Arepo's billions of star civilization), and
2 - Very few or no other periods of existential risk for the rest of the infinite future

Interesting, I don't think I noticed that trend between 10 years ago and 2 years ago.

Cool!

OscarD🔸

A key consideration for me is that earth-originating civilisation first spreading to other galaxies seems likely to be a lock-in, where if the values/organising structures of those early space missions are bad, the future seems quite bad in expectation to me.

And it seems likely that such space colonisation will become possible soon.

That does seem like a more plausible lock in event than many. But I'd have tentatively expected that we'd send off space probes reasonably early, such that as technology gets better we can overtake the original ones. I guess a significant speed up of technological development would make a difference here though. Interested in how this affects your thinking?

Arepo

I don't encounter many people who still identify as longtermist, but as someone who does, I recently wrote these arguments for why longtermists should be less extinction-focused.

The tl;dr is that I think that other than extinction there are predictable patterns, with perhaps the most prominent related to entropy, and that those patterns provide more nuanced ways to estimate the cost of lesser catastrophes - and that while assessing the costs of lesser catastrophes precisely is infeasible, that's not a basis for thinking they would be negligible compared to extinction.

Interesting that you don't come across many people these days who still identify as longtermist, that's pretty different from my experience. I think it feels more intuitive to me to identify as 'longtermist' than 'effective altruist'. The former is a claim about my values (people in the future matter morally) whereas the latter is behavioural and feels presumptuous (how altruistic really am I? Am I effective at it even when I try?). But I guess I'm in the minority on that!

Ben Millwood🔸

These might be kinda similar to things that others have already said, but:

My personal journey was encountering extinction risks first, worrying about those, and then over time thinking in more detail about threat models and consequently broadening the list of things I worried about. I've been assuming that community discourse evolved in the same way: initially based on relatively simple ideas (e.g. omnipotent superintelligence, everyone dies) and then adding more detail and precision and subtlety as people developed those more, which naturally increases the number of possible pathways and scenarios. But it's possible that all the new pathways I discovered were only new to me, and therefore my path doesn't track the community path. I don't know.

On the object level, I don't super buy the "extinction is positive lock-in but not much else is". Similar to what Vasco said, you only believe that extinction has a massive impact on the long-term future if you believe that extinction risk is high now but will drop to being permanently extremely low later on. Otherwise you're just delaying extinction rather than preventing it. This isn't an obvious belief! Most who believe it appeal to something like space colonisation, where we spread in such a way that it's no longer easy for effects of any kind to spread across all sentient life rapidly. But if space colonisation works to prevent extinction (which itself is not obvious!) then maybe it also locks in other things in the same way. You can think of not only the human race as having their survival at stake here, but also our ideas, social structures, etc. -- arguably, space colonisation gives these things the same shot at immortality it does us.

Having written this out, I think it's fair enough if you still think that "locking in non-extinction" is likely to happen, but locking in anything else isn't. It's reasonable to believe that extinction is special. But I hope that gives some intuition for why someone might think that extinction and lock-in are comparable risks.
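As a rough illustration of the "delaying rather than preventing" point above, here is a minimal sketch with made-up numbers. It assumes a simple model in which each century carries an independent extinction probability, and it compares the gain from shaving one percentage point off this century's risk in two worlds: one where risk stays high forever, and one where it later drops to a very low level. The function name and all figures are hypothetical.

```python
def expected_future_centuries(hazards, tail_hazard):
    """Expected number of further centuries survived.

    hazards: per-century extinction probabilities for the first few centuries.
    tail_hazard: constant per-century extinction probability afterwards (> 0).
    """
    survival = 1.0  # probability of having survived every century so far
    total = 0.0
    for h in hazards:
        survival *= 1.0 - h
        total += survival  # a century only counts if it is survived
    # With a constant hazard t afterwards, the remaining expectation is the
    # geometric series survival * (1 - t) / t.
    total += survival * (1.0 - tail_hazard) / tail_hazard
    return total


# Hypothetical intervention: cut this century's risk from 10% to 9%.
# World A: risk stays at 10% per century forever.
gain_if_risk_stays_high = (
    expected_future_centuries([0.09], 0.10)
    - expected_future_centuries([0.10], 0.10)
)
# World B: after this century, risk falls to one in a million per century.
gain_if_risk_drops = (
    expected_future_centuries([0.09], 1e-6)
    - expected_future_centuries([0.10], 1e-6)
)

print(f"gain if risk stays high: {gain_if_risk_stays_high:.2f} centuries")
print(f"gain if risk later drops: {gain_if_risk_drops:,.0f} centuries")
```

Under these assumptions the same intervention buys roughly a tenth of a century in the first world and roughly ten thousand centuries in the second, which is why the argument leans so heavily on risk eventually becoming permanently low.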

jackva

This could be entirely explainable by what is most resonant with a broader public and the fact that many of the non-extinction risks have much higher societal buy-in / are much more legible to many more people.

lroberts

I think there is a spelling mistake

(making it more likely than it might have been that I company execs could amass unusual levels of power)

I'm guessing you meant "AI company execs".

Otherwise great post, thanks for writing it up!

You're quite right, thank you!

Toby Tremlett🔹

This is also a change I'd noticed and been a little confused by. I really appreciate Michelle raising it, and the comments so far have been great. I'm curating this post.

Rafael Ruiz

One reason is that longtermists are largely philosophers, who have no particular expertise on the details of aligning AI.

Another reason worth taking into consideration is if the true moral view is "fussy", rather than "easygoing". If you're "easygoing" in what you consider utopia, then, conditional on survival, most achievable value gets realized by default (we get great human lives), and extinction is the one really action-relevant lock-in event. But if you're fussy about realizing the best possible utopia, then, conditional on survival, we're still likely to miss most achievable value across a huge swathe of futures (we don't tile the universe with happy digital minds, say, or whatever crazy future might be the best utopia). The space of "didn't go extinct but missed most of the value" turns out to be enormous, and some of the features determining where in that space we land (early decisions about digital minds, population-ethics, allocation of resources during space settlement, which value-systems get amplified during the AI transition) are themselves plausibly locked-in, even if they don't feel as salient as extinction.

(But then again, maybe I'm recency biased because I just re-read Better Futures for the discussion week here on the Forum)

That doesn't settle the prioritization, and, like, the people at Forethought and 80,000 Hours are directly and explicitly working on the AI transition? So it's not like x-risk is off the table. My vibe for highly-engaged EAs is that perhaps it just feels that the main arguments about x-risk have already been made.

I agree that in deciding how much to prioritise averting extinction vs improving worlds in which we persist, it's important to think about the difference in value between (non-existence), (default survival), and (actual utopia). But that argument has been around a long while. I think Ben Garfinkel was advancing the idea that (actual utopia) - (default survival) might be much larger than (default survival) - (non-existence) in the late 2010s. I'm interested in what's changed that's affected discourse. It's possible the answer is 'more people have read arguments of this form'. But in that case people who had already read those arguments should update less than if the change is e.g. us getting more info about how difficult alignment is.
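To make the comparison of value gaps concrete, here is a small sketch with purely hypothetical values for the three outcomes. It only shows how the same shift in probability is worth more or less depending on which gap it closes; the numbers are placeholders, not anyone's actual estimates.

```python
def value_of_probability_shift(delta, value_from, value_to):
    """Expected value gained by moving probability mass `delta`
    from a worse outcome to a better one."""
    return delta * (value_to - value_from)


# Hypothetical values on an arbitrary scale (a "fussy" view of utopia).
NON_EXISTENCE = 0.0
DEFAULT_SURVIVAL = 1.0
ACTUAL_UTOPIA = 100.0

DELTA = 0.01  # the same one-percentage-point shift in both cases

gain_from_averting_extinction = value_of_probability_shift(
    DELTA, NON_EXISTENCE, DEFAULT_SURVIVAL
)
gain_from_improving_trajectory = value_of_probability_shift(
    DELTA, DEFAULT_SURVIVAL, ACTUAL_UTOPIA
)

print(gain_from_averting_extinction, gain_from_improving_trajectory)
```

With these placeholder values the trajectory shift is worth far more than the extinction shift; on an "easygoing" view, where default survival is valued close to utopia, the ordering reverses.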

Tristan Katz

Is this post conflating EAs or AI-safety researchers/advocates with longtermists? My impression is that actually rather few EAs are strong longtermists, and AI safety researchers/advocates maybe even less - they're just united by their wish to avoid catastrophe, but differ in their understandings of the kind of catastrophe we should expect.

I think when AI safety was young and, er, weird, a lot of those talking about it were longtermists. That makes sense. But now it's become a mainstream EA concern, and even a not-weird concern in society generally, so it's unsurprising that it's become less longtermist, from a sociological perspective.

I was intending to pick out a group of people who have for years identified as EAs and longtermists but have changed what they've worked on. I was thinking it was clear in talking about EAs deprioritising a thing that I meant the ones who prioritised that highly initially, but I see how that's confusing - I'll edit to clarify.

Tristan Katz

Right, what seemed to be missing to me was evidence that longtermists specifically had stopped working on extinction risk. But I see there are a good amount of posts others have linked in the comments that would count as evidence for this.

Jordan Arel

I think it’s essentially a matter of ITN, AI timelines, and post-AGI dynamics. And perhaps a little sociological.

Compared to long-term trajectory change, AI safety is now much less neglected, I would say by a couple orders of magnitude, which is not enough to compensate for the lower tractability of trajectory change.

AI timelines seem quite short and AGI & ASI seem like they may be quite likely to be structurally power concentrating (even if we avoid the most extreme scenarios, although of course those are one of the main concerns) and likely to make most humans largely irrelevant.

Hence the time we have left to influence the trajectory of the long-term future may be quite limited. Even if nothing is locked in immediately, the ability of most humans to meaningfully influence the long-term trajectory of the future may be strongly curtailed as AI (and who controls AI, and the overall shape of civilization at the time the AI transition occurs) becomes the dominant force deciding what happens in the future.

As noted in the MacAskill essay you mentioned, AGI/ASI may also accelerate other factors like grabbing solar space resources & potentially defense dominant deep space settlement, rapidly moving toward technological maturity, and various other forces increasing the likelihood of local and global lock-in, enabled by advanced AI.

So overall I think the concern is that AI may cause a dramatic relatively near-term acceleration of path dependence and lock-in dynamics, and that this is very severely neglected compared to AI-related extinction risk, which seems like the biggest extinction risk.

I think trajectory change is in fact so neglected that it’s hard to even say how tractable it is because barely anyone has looked at it, and it seems worth having at least a few people looking at it.

I think there’s also a sociological factor that MacAskill and Forethought, which have something not too far off from the above view, have been really trying to raise the profile of these ideas since at least “What We Owe The Future” (although to be clear I know there are several longtermists who helped innovate these views and quite a lot of longtermists who are sympathetic to this kind of stuff, I just think MacAskill/Forethought have been most effective.)

This is a bit more speculative, but I also think that perhaps when the whole shift from “longtermist” framing to “x-risk” framing occurred, a lot of the people who cared most about/were most comparatively advantaged at work on extinction just stopped calling themselves longtermists, or at least stopped emphasizing this, as they saw this as a liability, whereas people who cared about/were comparatively advantaged at work on the trajectory of the future were much more likely to continue calling themselves longtermists; so part of it might be that, not only did longtermists start focusing more on this stuff, but that people who care about extinction stopped calling themselves longtermists, making it seem like a higher proportion of longtermists now care more about non-extinction issues.

While I don’t know if this was counterfactual, I requested the “Existential Choices Debate” on this topic and spent most of last year researching and writing a yet-to-be-published essay containing a lot of my own views essentially comparing extinction vs. trajectory change as cause areas, which you may find relevant.

Dave Cortright 🔸

I know this is likely an unpopular view, but what makes the human species so precious? Going back in evolutionary time, what if Neanderthals or T. rex or even jellyfish decided they were the pinnacle of evolution and closed the door behind them?

I'm not saying I want humans to go extinct, but it is speciesist to say we are as good as it gets and halt the process of natural selection.

Ben Millwood🔸

I think of this situation as analogous to waking up in the cabin of a truck that's careening down a major road. Of course you're going to grab the wheel, just because no-one else has done so yet.

Vasco Grilo🔸

Hi Michelle.

Preventing humans from going extinct in the next decade continues to affect the future indefinitely - human extinction seems like a clear ‘lock-in’ event.

Why so? Something being locked in forever is not sufficient for longterm benefits?

Suppose the European Union's (EU's) ban on housing chickens in battery cages is locked in forever, in the sense the vast majority of farmed chickens in the countries which currently belong to the EU will forever be outside cages. This does not mean the marginal advocacy for the ban had longterm benefits. I think the number of chickens in cages in 1 M years would have been the same in expectation if the spending advocating for the ban had been e.g. 10 k$ smaller. I would estimate the benefits from "increase in the probability of the ban"*"acceleration in the transition to cage-free conditional on the ban being passed". I believe this last factor is like 3 to 10 years, thus preventing astronomical benefits.
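The estimate described above can be written out as a short sketch. The product of the two quoted factors is taken from the comment; multiplying by the number of chickens affected per year is an added assumption to give the result units, and the probability shift and flock size are hypothetical placeholders. Only the 3 to 10 year range comes from the comment itself.

```python
def expected_benefit_chicken_years(delta_p_ban, acceleration_years,
                                   chickens_affected_per_year):
    """Expected cage-free chicken-years attributable to marginal advocacy.

    (increase in the probability of the ban)
      * (years by which the ban accelerates the cage-free transition)
      * (chickens affected per year)  # scaling factor, an assumption
    """
    return delta_p_ban * acceleration_years * chickens_affected_per_year


# Hypothetical placeholders, not estimates.
DELTA_P_BAN = 1e-4        # extra probability of the ban from marginal spending
CHICKENS_PER_YEAR = 1e8   # chickens affected per year

low = expected_benefit_chicken_years(DELTA_P_BAN, 3, CHICKENS_PER_YEAR)
high = expected_benefit_chicken_years(DELTA_P_BAN, 10, CHICKENS_PER_YEAR)
print(f"{low:,.0f} to {high:,.0f} chicken-years")
```

Whatever placeholder values are used, the result scales with the acceleration window, so a window of a few years caps the benefit at a finite amount even if the ban itself lasts forever.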

Why depart from the logic above for "longtermist interventions"? One could define these interventions as ones which depart from the logic above, but the question then is whether such interventions exist.

Matthew Rendall

One possibility is that longtermists have become less confident that the survival of human beings and/or sentient life is desirable. Two trends might explain this:

(a) Increased attention to and prioritisation of s-risks. If you give greater weight to avoiding suffering than promoting happiness, then you'll worry less about extinction and more about avoiding dystopias.

(b) Increased attention to wild animal suffering. It's only since about 2015 that the view has become widespread that most wild animal lives are not worth living. That doesn't make human extinction less undesirable per se. Some existential catastrophes, however, such as misaligned AI, would have a good chance of eradicating not only humans but sentient life. If you suspect that sentient life is on balance bad, then you’ll accord less weight to preventing them.

At the same time, human intervention might offer the best chance of mitigating wild animal suffering. So if suffering predominates in non-human lives, the worst possible scenario could be one in which only humans go extinct. If you think wild animal lives are bad on net (I’m far from confident that this is true myself) that might provide some basis for prioritising human extinction scenarios--such as pandemics--and de-prioritising threats like AI.