Tldr;
- (Longtermist) EAs seem more focused on non-extinction risks from AI than they used to be
- I haven’t seen much discussion of why, making reference to the very longrun
- Changes in the world this might be responding to are: alignment seeming more promising than it used to, the speed of AI progress, and concerns around US governance.
- There are also plausible sociological reasons for the shift, so I feel unsure how far to update
What’s the change and why care about it?
My sense is that there’s been a significant shift in how much longtermists prioritise non-extinction risks over the last few years. A decade ago people who were trying to ensure the flourishing of the universe trillions of years from now were very often focused on avoiding events that would kill all humans. My sense is that now they’re much more often focused on situations which they don’t think pose a risk of human extinction, such as extreme power concentration.
This change has felt confusing to me for a couple of reasons:
- There’s reasonably little written about why longtermists should change their prioritisation in this direction. The notable exception is this paper by Will MacAskill.
- To my mind, there’s a clear reason for longtermists to focus on extinction risks:
- It seems extremely hard for any of our actions now to predictably affect what the world looks like in trillions of years time. (Events which predictably affect the very longrun future are often called ‘lock-in’ events.)
- Preventing humans from going extinct in the next decade continues to affect the future indefinitely - human extinction seems like a clear ‘lock-in’ event.
You might think that our actual actions won’t be affected much by whether the risks we should prioritise most are totalitarianism or human extinction. There are indeed many actions which are useful for both of these. Pausing AI development will give us more time to allow the world to prepare itself for all sorts of risks that transformative AI might bring. Frontier AI companies having good whistleblower provisions gives us a better hope of finding out about many of the largest of those risks before they come about.
But there are ways in which you could disproportionately affect extinction risks compared to other risks. One is working directly to mitigate the ways humans could all plausibly be killed, such as AI enabled biorisks. Another is to focus more on the risk of misaligned AI risk than the risk of AI misuse by humans. The reason is that AI may be inherently alien to humans and therefore likely to have goals which are far reaching and extremely different to ours, and therefore cause a future which doesn’t include us. "The AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else". By comparison, humans tend to have goals for which it’s useful to have other humans surviving.
I wanted to get a sense of why there’s been a shift in favour of longtermists prioritising extreme power concentration amongst humans compared to AI takeover, in order to understand better what’s highest priority for me to focus on. Below are the reasons I came across when talking to people about what has caused this trend.
Differences in the world that motivate the change
Alignment is more promising than some people expected.
For many longtermists, by far the most likely cause of human extinction is a misaligned AI. The reason is that humanity is actually reasonably resilient to large shocks, even ones that kill the vast majority of the population. That means natural disasters are reasonably unlikely to wipe everyone out. Whereas a misaligned AI for whom humans were in the way might take a more intentional approach to wiping us out.
Ten years ago, it seemed plausible that AI wouldn’t be able to understand human goals at all, so the idea of AI pursuing totally alien goals seemed more likely. We now have empirical evidence of AI understanding humans reasonably well - being able to get a fairly brief and vague prompt and accurately determine what we’d like it to do and how. To the extent their goals diverge from ours, they’re often very intelligible do us (such as wanting to complete the task and therefore cheating on it). There are some structural reasons for that - using large language models extensively imports a lot of human concepts into models.
Even aside from alignment being less challenging than some expected, the fact that there are now more people working on it provides some reason for thinking that marginal work on it might be less useful than it was in the past.
AI progress has been continuous, and medium-fast
In a world where AI progressed from largely useless to superintelligent in the course of days, there’s little opportunity for humans to figure out how to control it or wield it to their ends. In a world where AI progress is very slow, we have time to adjust our governance mechanisms to the existence of ASI. But where progress is continuous and medium-fast, some humans have enough time to notice it would likely be useful to control super-intelligence and make plans to do so, without there being time for things like governments
US governance is in a worse state than we might have expected for the point at which we hit transformative AI.
The federal government seems unusually opposed to safety-focused AI legislation (making it more likely than it might have been that I company execs could amass unusual levels of power), and less respectful of separation of powers. It’s possible that the people running AI companies also seem more power-seeking / less safety conscious than we would have expected (though that seems less compelling to me).
How much of an update should we make?
The above all seem like solid reasons to shift some prioritisation from avoiding AI takeover to avoiding extreme human power concentration.
I’m still not sure if I buy that these risks are in the same ballpark, even if they’re now closer. My hesitation stems from a strong intuition that it’s extremely hard to predictably affect anything over the long-term. Our actions have a very strong tendency to ‘wash-out’ - often over mere years, but particularly after centuries, millennia and onwards.
There are reasons to think that AGI will make it less likely that actions wash out - for example, it will plausibly enable immortality (perhaps by uploading rather than biologically). Dictatorships have often historically been brought down by the natural death of the dictator, so the impossibility of that could really increase the chance of lock in over the very long run. (For more on how AGI might enable authoritarian lock in, see the MacAskill and Finnveden et al papers linked about.) On the other hand, technological breakthroughs have historically swung the balances of power and otherwise washed out previous actions, and transformative AI will precipitate a time of unusually rapid technological progress.
There are also various things which might have causally led to a shift in views without epistemically justifying them:
- When OpenAI and DeepMind were set up, their comms about their aims were heavily focused on keeping the world safe. Things like the OpenAI board turning over has made salient how much some of the key players seem squarely focused on amassing power, and how scary that is. Being viscerally aware of that (even if the extent to which it’s the case is within the bounds of what you would have theoretically expected) makes it harder to look away from risks from human power concentration.
- The US has become more divided and tribal, making it feel more urgent and important to work on preserving democracy
- As AI progress is discussed more in the mainstream, there’s a larger coalition working on ensuring that the transition to superintelligent AI proceeds safely, many of whose values don’t extend to the extremely longrun future. Preventing extreme concentration of power is a unifying issue in the sense that lots of people find it more plausible than extinction, so this provides a (perhaps unconscious) incentive to emphasise it more.
The existence of causal reasons like those above make me wary of over updating in towards prioritising extreme power concentration. On the other hand, I think it’s plausible that I previously focused too hard on human extinction compared to other plausible lock-in events. A couple of the reasons I might have done that:
- I have a strong intuition about it being very hard to affect things over the longrun. And there is plenty of evidence about people trying to cause lasting change and totally failing. But I think it might be a mistake to trust my intuitions about this question at all. We should expect the world to be unrecognisable after a transition to ASI, so why would we expect it it to be similarly easy/hard to lock things in for the future?
- It does seem right to me that there is a chance of us being able to lock particular values or governance structures for the extremely long run, and also a chance that preventing human extinction does not lock in a change in value of the universe (for example, if humans went extinct decades later it would wash out the effect of saving them this decade). I think it’s easy to round the former off to ‘basically not lock in’ and the latter to ‘basically lock in’ in a way I don’t endorse.
All this mostly leaves me still feeling pretty unclear about how to prioritise between risks that might and might not plausibly cause human extinction. I’m inclined to prioritise those that won’t somewhat higher than I have historically. I still feel nervous that I’m partly doing that for bad reasons. Those reasons are less the explicit causal ones I listed, and more deferring to the ‘general zeitgeist’ without fully understanding its reasoning. I’d be very keen to hear other people’s views on this question.
Thanks to Arden Koehler for making this post (and so many of the things I write) much better, and to various people for the ideas behind the post, including Alex Lawsen and Nick Beckstead.

A couple of quick thoughts:



1/
- A big thing, in my view, is that AI safety isn't about preventing "extinction" in the relevant sense. In most worlds where AI disempowers humanity, the human species continues. And in essentially all worlds where AI disempowers humanity, AI still takes to the stars. So, AI safety is about who we want to guide the future, not about whether there's a long-term future or not.
- And, even if humanity does go extinct in (say) a bio-catastrophe, probably technologically-capable life evolves in the remaining time that Earth remains habitable.
- So, the probability of "an event occurs by 2100 that prevents Earth-originating life from ever spreading to the stars" is really low, I'd say <1%.
- Which is a lot less, in my view, than "an event occurs by 2100 that meaningfully affects the long-term value of Earth-originating civilisation" where I'm >50%. Cluelessness pushes this down, so "an event occurs that meaningfully and predictably in expectation" is lower, but not by enough.
2/
- There's still WAY more quality-adjusted $$ and labour going to AI safety than there is to AI-enabled extreme human power concentration, from the EA / AI safety communities. I'd say 1-2 OOMs more? So, I think one thing that's going on is a correction because that ratio is out of whack.
3/
Some evidence you are right, though, about shifting priorities, comes from the February Existential Security Summit. I ran a survey there (massive caveats: tiny sample size of 16, and probably with selection bias from who filed out the form).
Here are the results (note the shades of colours are a little confusing):
(Should say: ""Though biorisk and AI takeover risk are high-priority, there are other issues arising from AI that, at least in the aggregate, are comparable in (importance * tractability * neglectedness) to biorisk and/or AI takeover risk.")
Then, here is an aggregate ranking and Borda scores, in response to:
"How would you rank each of these cause areas, in terms of priority?
Imagine you are allocating a highly capable person who could productively work on any area they put their mind to.
(Rank 1 is highest, and put each rank only once.)"

1 AI-enabled concentration of human power
2 Risks from misaligned powerseeking AI
3 AI’s impact on epistemics, coordination and decision-making
3 Post-AGI governance
5 Biorisk
6 AI character
7 AI wellbeing and rights
8 Acausal trade
9 Space governance
Do you mean most likely worlds? The difference seems incredibly important - there are, in my view, quite compelling arguments that the most likely outcome given disempowerment is human extinction, but of course I can imagine worlds in which that doens't happen.
Well let me stop you right there.
https://forum.effectivealtruism.org/s/8ooZtgeWbsAxxP9bd
https://forum.effectivealtruism.org/s/wmqLbtMMraAv5Gyqn
https://forum.effectivealtruism.org/posts/CxMusuX8E5hiTXEWX/fruit-picking-as-an-existential-risk
https://forum.effectivealtruism.org/posts/zuQeTaqrjveSiSMYo/a-proposed-hierarchy-of-longtermist-concepts
https://forum.effectivealtruism.org/posts/wqmY98m3yNs6TiKeL/parfit-singer-aliens
https://forum.effectivealtruism.org/posts/zLi3MbMCTtCv9ttyz/formalizing-extinction-risk-reduction-vs-longtermism
https://forum.effectivealtruism.org/posts/WebLP36BYDbMAKoa5/the-future-might-not-be-so-great
https://forum.effectivealtruism.org/posts/WebLP36BYDbMAKoa5/the-future-might-not-be-so-great?commentId=cJdqyAAzwrL74x2mG
https://forum.effectivealtruism.org/posts/GsjmufaebreiaivF7/what-is-the-likelihood-that-civilizational-collapse-would
This isn't a comprehensive list, just some stuff off the top of my head.
Personally, I've spent a long time thinking about this and as far as I can tell, it's incredibly uncertain if (e)x(tinction)-risk reduction is positive EV (x-risk reduction is almost tautologically positive to the point of uselessness). In general I think the community got weirdly one shotted by astronomical waste-like arguments which are interesting thought experiments but don't really come close to solving cluelessness.
Thanks for the links!
Interesting post, thanks!! Excited to see discussion of it.
One thing I wonder if it's relevant are the sort of "mixed" pictures? Like, maybe you think totalitarian control of AI would be really bad because it increases the risk of human extinction from misalignment, by increasing the success rate of any attempted AI takeover / making it happen sooner in time (bc a totalitarian might e.g. give more control over to an AI they were using to stay in power).
Or, more vaguely, I sorta feel like if world governments / anyone with ambitions to power think that advanced AI will help them, that's bad news for alignment efforts.
Anyway, that could be something going on for some people I think?
Good point, thanks!
I don't encounter many people who still identify as longtermist, but as someone who does, I recently wrote these arguments for why longtermists should be less extinction-focused.
The tl;dr is that I think that other than extinction there are predictable patterns, with perhaps the most prominent related to entropy, and that those patterns provide more nuanced ways to estimate the cost of lesser catastrophes - and that while assessing the costs of lesser catastrophes precisely is infeasible, that's not a basis for thinking they would be negligible compared to extinction.
Interesting that you don't come across many people these days who still identify as longtermist, that's pretty different from my experience. I think it feels more intuitive to me to identify as 'longtermist' than 'effective altruist'. The former is a claim about my values (people in the future matter morally) whereas the latter is behavioural and feels presumptuous (how altruistic really am I? Am I effective at it even when I try?). But I guess I'm in the minority on that!
Hi Michelle.
Why so? Something being locked in forever is not sufficient for longterm benefits?
Suppose the European Union's (EU's) ban on housing chickens in battery cages is locked in forever, in the sense the vast majority of farmed chickens in the countries which currently belong to the EU will forever be outside cages. This does not mean the marginal advocacy for the ban had longterm benefits. I think the number of chickens in cages in 1 M years would have been the same in expectation if the spending advocating for the ban had been e.g. 10 k$ smaller. I would estimate the benefits from "increase in the probability of the ban"*"acceleration in the transition to cage-free conditional on the ban being passed". I believe this last factor is like 3 to 10 years, thus preventing astronomical benefits.
Why departing from the logic above for "longtermist interventions"? One could define these interventions as ones which depart from the logic above, but the question then is whether such interventions exist.
I agree that this shift has happened over the past 2 years. But I think 2 to 3 years ago EAs were unusually focused on extinction compared to a decade ago. I remember more discussions back then around positive visions for the longterm.
For what it’s worth, we just announced our first Frontier Biodefense Fellowship at Pivotal, which is more singularly focused on avoiding extinction than most projects within AI safety (including our AI safety fellowship). Obviously the team has a range of motivation to work on Biodefense, but for me weak longtermist arguments are quite central.
I am really happy you are focusing squarely on existential risk from bio. I think there is a tendency in EA-adjacent biosec work to lose a bit of focus on how extremely bad such scenarios are. I also think it is great you raised this Michelle - I also feel like not enough EAs have contemplated the importance of 2 further assumptions needed to work on longtermism:
1 - Massive increase in value in the future (re: Arepo's billions of star civilization), and
2 - Very few or not other periods of existential risk for the rest of the infinite future
Interesting, I don't think I noticed that trend between 10 years ago and 2 years ago.
Cool!
This could be entirely explainable by what is most resonant with a broader public and the fact that many of the non-extinction risks have much higher societal buy-in / are much more legible to many more people.
One reason is that longtermists are largely philosophers, who have no particular expertise on the details of aligning AI.
Another reason worth taking into consideration is if the true moral view is "fussy", rather than "easygoing". If you're "easygoing" in what you consider utopia, then, conditional on survival, most achievable value gets realized by default (we get great human lives), and extinction is the one really action-relevant lock-in event. But if you're fussy about realizing the best possible utopia, then, conditional on survival, we're still likely to miss most achievable value across a huge swathe of futures (we don't tile the universe with happy digital minds, say, or whatever crazy future might be the best utopia). The space of "didn't go extinct but missed most of the value" turns out to be enormous, and some of the features determining where in that space we land (early decisions about digital minds, population-ethics, allocation of resources during space settlement, which value-systems get amplified during the AI transition) are themselves plausibly locked-in, even if they don't feel as salient as extinction.
(But then again, maybe I'm recency biased because I just re-read Better Futures for the discussion week here on the Forum)
That doesn't settle the prioritization, and, like, the people are Forethought and 80.000 Hours are directly and explicitly working on the AI transition? So it's not like x-risk is off the table. My vibe for highly-engaged EAs is that perhaps it just feels that the main arguments about x-risk have already been made.
I agree that in deciding how much to prioritise averting extinction vs improving worlds in which we persist, it's important to think about the difference in value between (non-existence)(default survival)(actual utopia). But that argument has been around a long while. I think Ben Garfinkel was advancing the idea that (actual utopia) - (default survival) might be much larger than (default survival) - (non-existence) in the late 2010s. I'm interested in what's changed that's affected discourse. It's possible the answer is 'more people have read arguments of this form'. But in that case people who had already read those arguments should update less than if the change is eg us getting more info about how difficult alignment is.
Is this post conflating EAs or AI-safety researchers/advocates with longtermists? My impression is that actually rather few EAs are strong longtermists, and AI safety researchers/advocates maybe even less - they're just united by their wish to avoid catastrophe, but differ in their understandings of the kind of catastrophe we should expect.
I think when AI safety was young and, er, weird, a lot of those talking about it were longtermists. That makes sense. But now it's become a mainstream EA concern, and even a not-weird concern in society generally, so it's unsurprising that it's become less longtermist, from a sociological perspective.
I was intending to pick out a group of people who have for years identified as EAs and longtermists but have changed what they've worked on. I was thinking it was clear in talking about EAs deprioritising a thing that I meant the ones who prioritised that highly initially, but I see how that's confusing - I'll edit to clarify.
Right, what seemed to be missing to me was evidence that longtermists specifically had stopped working on extinction risk. But I see there are a good amount of posts others have linked in the comments that would count as evidence for this.
I know this is likely an unpopular view, but what makes the human species so precious? Going back in evolutionary time, what if Neanderthals or T. rex or even jellyfish decided they were the pinnacle of evolution and closed the door behind them?
I'm not saying I want humans to go extinct, but it is speciesist to say we are as good as it gets and halt the process of natural selection.