I think there is an issue with the community-wide allocation of effort. A very large proportion of our effort goes into preparation work, setting the community up for future successes; and very little goes into external-focused actions which would be good even if the community disappeared. I'll talk about why I think this is a problem (even though I love preparation work), and what types of things I hope to see more of.
Phase 1 and Phase 2
A general strategy for doing good things:
- Phase 1: acquire resources and work out what to do
- Phase 2: spend down the resources to do things

Note that Phase 2 is the bit where things actually happen. Phase 1 is a necessary step, but on its own it has no impact: it is just preparation for Phase 2.
To understand if something is Phase 2 for the longtermist EA community, we could ask “if the entire community disappeared, would the effects still be good for the world?”. For things which are about acquiring resources — raising money, recruiting people, or gaining influence — the answer is no. For much of the research that the community does, the path to impact is either by using the research to gain more influence, or having the research inform future longtermist EA work — so the answer is again no. However, writing an AI alignment textbook would be useful to the world even absent our communities, so would be Phase 2. (Some activities live in a grey area: for example, increasing scope sensitivity or concern for existential risk across broad parts of society.)
It makes sense to frontload our Phase 1 activities, but we do want to also do Phase 2 in parallel for several reasons:
- Doing enough Phase 2 work helps to ground our Phase 1 work by ensuring that it’s targeted at making the Phase 2 stuff go well
- Moreover we can benefit from the better feedback loops Phase 2 usually has
- We can’t pivot (orgs, careers) instantly between different activities
- A certain amount of Phase 2 helps bring in people who are attracted by demonstrated wins
- We don’t know when the deadline for crucial work is (and some of the best opportunities may only be available early) so we want a portfolio across time
So the picture should look something like this:

I’m worried that it does not. We aren’t actually doing much Phase 2 work, and we also aren’t spending that much time thinking about what Phase 2 work to be doing. (Although I think we’re slowly improving on these dimensions.)
Problem: We are not doing Phase 2 Work
When we look at the current longtermist portfolio, there’s very little Phase 2 work[1]. A large majority of our effort is going into acquiring more resources (e.g. campus outreach, or writing books), or into working out what the-type-of-people-who-listen-to-us should do (e.g. global priorities research).
This is what we could call an inaction trap. As a community we’re preparing but not acting. (This is a relative of the meta-trap, but we have a whole bunch of object-level work e.g. on AI.)
How does AI alignment work fit in?
Almost all AI alignment research is Phase 1 — on reasonable timescales it’s aiming to produce insights about what future alignment researchers should investigate (or to gain influence for the researchers), rather than producing things that would leave the world in a better position even if the community walked away.
But if AI alignment is the crucial work we need to be doing, and it's almost all Phase 1, could this undermine the claim that we should increase our focus on Phase 2 work?
I think not, for two reasons:
- Lots of people are not well suited to AI alignment work, but there are lots of things that they could productively be working on, even if AI is the major determinant of the future (see below)
- Even within AI alignment, I think an increased focus on "how does this end up helping?" could make Phase 1 work more grounded and less likely to accidentally be useless
Problem: We don’t really know what Phase 2 work to do
This may be a surprising statement. Culturally, EA encourages a lot of attention on what actions are good to take, and individuals talk about this all the time. But I think a large majority of the discussion is about relatively marginal actions — what job should an individual take; what project should an org start. And these discussions often relate mostly to Phase 1 goals, e.g. How can we get more people involved? Which people? What questions do we need to understand better? Which path will be better for learning, or for career capital?
It’s still relatively rare to have discussions which directly assess different types of Phase 2 work that we could embark on (either today or later). And while there is a lot of research which has some bearing on assessing Phase 2 work, a large majority of that research is trying to be foundational, or to provide helpful background information.
(I do think this has improved somewhat in recent times. I especially liked this post on concrete biosecurity projects. And the Future Fund project list and project ideas competition contain a fair number of sketch ideas for Phase 2 work.)
Nonetheless, working out what to actually do is perhaps the central question of longtermism. I think what we could call Phase 1.5 work — developing concrete plans for Phase 2 work and debating their merits — deserves a good fraction of our top research talent. My sense is that we're still significantly undershooting on this.[2]
Engaging in this will be hard, and we’ll make lots of mistakes. I certainly see the appeal of keeping to the foundational work where you don’t need to stick your neck out: it seems more robust to gradually build the edifice of knowledge we can have high confidence in, or grow the community of people trying to answer these questions. But I think that we’ll get to better answers faster if we keep on making serious attempts to actually answer the question.
Towards virtuous cycles?
Since Phase 1.5 and Phase 2 work are complements, when we're underinvested in both the marginal analysis can suggest that neither is that worthwhile — why implement ideas that suck? or why invest in coming up with better ideas if nobody will implement them? But as we get more analysis of what Phase 2 work is needed, it should be easier for people to actually try these ideas out. And as we get people diving into implementation, it should get more people thinking more carefully about what Phase 2 work is actually helpful.
Plus, hopefully the Phase 2 work will actually just make the world better, which is kind of the whole thing we’re trying to do. And better yet, the more we do it, the more we can honestly convey to people that this is the thing we care about and are actually doing.
To be clear, I still think that Phase 1 activity is great. I think that it’s correct that longtermist EA has made it a major focus for now — far more than it would attract in many domains. But I think we’re noticeably above the optimum at the moment.[3]
[1] When I first drafted this article 6 months ago I guessed <5%. I think it's increased since then; it might still be <5% but I'd feel safer saying <10%, which I think is still too low.
[2] This is a lot of the motivation for the exercise of asking "what do we want the world to look like in 10 years?"; especially if one excludes dimensions that relate to the future success of the EA movement, it's prompting for more Phase 1.5 thinking.
[3] I've written this in the first person, but as is often the case my views are informed significantly by conversations with others, and many people have directly or indirectly contributed to this. I want to especially thank Anna Salamon and Nick Beckstead for helpful discussions; and Raymond Douglas for help in editing.
Sorry for the long and disorganized comment.
I agree with your central claim that we need more implementation, but I either disagree with or am confused by a number of other parts of this post. I think the heart of my confusion is that it focuses on only one piece of end-to-end impact stories: Is there a plausible story for how the proposed actions actually make the world better?
You frame this post as “A general strategy for doing good things”. This is not what I care about. I do not care about doing things, I care about things being done. This is semantic but it also matters? I do not care about implementation for its own sake, I care about impact. The model you use assumes preparation, implementation and the unspoken impact. If the action leading to the best impact is to wait, this is the action we should take, but it’s easy to overlook this if the focus is on implementation. So my Gripe #1 is that we care about impact, not implementation, and we should say this explicitly. We don’t want to fall into a logistic trap either [1].
The question you pose is confusing to me: “if the entire community disappeared, would the effects still be good for the world?”
I’m confused by the timeline of the answer to this question (the effects in this instant or in the future?). I’m also confused by what the community disappearing means – does this mean all the individual people in the community disappear? As an example, MLAB skills up participants in machine learning; it is unclear to me if this is Phase 1 or Phase 2 because I’m not sure the participants disappear; if they disappear then no value has been created, but if they don’t disappear (and we include future impact) they will probably go make the world better in the future. If the EA community disappeared but I didn’t, I would still go work on alignment. It seems like this is the case for many EAs I know. Such a world is better than if the EA community never existed, and the future effects on the world would be positive by my lights, but no phase 2 activities happened up until that point. It seems like MLAB is probably Phase 1, as is university, as is the first half of many people’s careers where they are failing to have much impact and are skill/career capital building. If you do mean disappearing all community members, is this defined by participation in the community or level of agreement with key ideas (or something else)? I would consider it a huge win if OpenAI’s voting board of directors were all members of the EA community, or if they had EA-aligned beliefs; this would actually make us less likely to die. Therefore, I think doing outreach to these folks, or more generally “educating people in key positions about the risks from advanced AI” is a pretty great activity to be doing – even though we don’t yet know most of the steps to AGI going well. It seems like this kind of outreach is considered Phase 1 in your view because it’s just building the potential influence of EA ideas. So Gripe #2: The question is ambiguous so I can’t distinguish between Phase 1 and 2 activities on your criteria.
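You give the example of “writing an AI alignment textbook would be useful to the world even absent our communities, so would be Phase 2.”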
I disagree with this. I don’t think writing a textbook actually makes the world much better. (An AI alignment textbook exists) is not the thing I care about; (aligned AI making the future of humanity go well) is the thing I care about. There’s like 50 steps from the textbook existing to the world being saved, unless your textbook has a solution for alignment, and then it’s only like 10 steps[2]. But you still need somebody to go do those things.
In such a scenario, if we ask “if the entire community disappeared [including all its members], would the effects still be good for the world?”, then I would say that the textbook existing is counterfactually better than the textbook not existing, but not by much. I don’t think the requisite steps needed to prevent the world from ending would be taken. To me, assuming (the current AI alignment community all disappears) cuts our chances of survival in half, at least[3]. I think this framing is not the right one because it is unlikely that the EA or alignment communities will disappear, and I think the world is unfortunately dependent on whether or not these communities stick around. To this end, I think investing in the career and human capital of EA-aligned folks who want to work on alignment is a class of activities relatively likely to improve the future. Convincing top AI researchers and math people etc. is also likely high EV, but you’re saying it’s Phase 2. Again, I don’t care about implementation, I care about impact. I would love to hear AI alignment specific Phase 2 activities that seem more promising than “building the resource bucket (# of people, quality of ideas, $ to a lesser extent, skills of people) of people dedicated to solving alignment”. By more promising I mean have a higher expected value or increase our chances of survival more. Writing a textbook doesn’t pass the test I don’t think. There’s some very intractable ideas I can think of like the UN creates a compute monitoring division. Of the FTX Future Fund ideas, AI Alignment Prizes are maybe Phase 2 depending on the prize, but depends on how we define the limit of the community; probably a lot of good work deserving of a prize would result in an Alignment Forum or LessWrong post without directly impacting people outside these communities much. Writing about AI Ethics suffers from the same problem as the alignment textbook because it just relies on other people (who probably won’t) taking it seriously.
Gripe #3: In terms of AI Alignment, the cause area I focus on most, we don’t seem to have promising Phase 2 ideas but some Phase 1 ideas seem robustly good.
I guess I think AI alignment is a problem where not many things actually help. Creating an aligned AGI helps (so research contributing to that goal has high EV, even if it’s Phase 1), but it’s only something we get one shot at. Getting good governance helps; much of the way to do this is Phase 1 of aligned people getting into positions of power; the other part is creating strategy and policy etc; CSET could create an awesome plan to govern AGI, but, assuming policy makers don’t read reports from disappeared people, this is Phase 1. Policy work is Phase 1 up until there is enough inertia for a policy to get implemented well without the EA community. We’re currently embarrassingly far from having robustly good policy ideas (with a couple exceptions). Gripe #3.5: There’s so much risk of accidental harm from acting soon, and we have no idea what we’re doing.
I agree that we need implementation, but not for its own sake. We need it because it leads to impact or because it’s instrumentally good for getting future impact (as you mention, better feedback, drawing in more people, time diversification based on uncertainty). The irony and cognitive dissonance of being a community dedicated to doing lots of good that then spends most of its time thinking does not elude me; as a group organizer at a liberal arts college I think about this quite a bit.
I think the current allocation between Phase 1 and Phase 2 could be incorrect, and you identify some decent reasons why it might be. What would change my mind is a specific plan where having more Phase 2 activities actually increases the EV of the future. In terms of AI Alignment, Phase 1 activities just seem better in almost all cases. I understand that this was a high-level post, so maybe I'm asking for too much.
[1] The concept of a logistics magnet is discussed in Chapter 11 of “Did That Just Happen?!: Beyond “Diversity”―Creating Sustainable and Inclusive Organizations” (Wadsworth, 2021). “This is when the group shifts its focus from the challenging and often distressing underlying issue to, you guessed it, logistics.” (p. 129)
[2] Paths to impact like this are very fuzzy. I’m providing some details purely to show there’s lots of steps and not because I think they’re very realistic. Some steps might be: a person reads the book, they work at an AI lab, they get promoted into a position of influence, they use insights from the book to make some model slightly more aligned and publish a paper about it; 30 other people do similar things in academia and industry, eventually these pieces start to come together and somebody reads all the other papers and creates an AGI that is aligned, this AGI takes a pivotal act to ensure others don’t develop misaligned AGI, we get extremely lucky and this AGI isn’t deceptive, we have a future!
[3] I think it sounds self-important to make a claim like this, so I’ll briefly defend it. Most of the world doesn’t recognize the importance or difficulty of the alignment problem. The people who do and are working on it make up the alignment community by my definition; probably a majority consider themselves longtermist or EAs, but I don’t know. If they disappeared, almost nobody would be working on this problem (from a direction that seems even slightly promising to me). There are no good analogies, but... If all the epidemiologists disappeared, our chances of handling the next pandemic well would plunge. This is a bad example partially because others would realize we have a problem and many people have a background close enough that they could fill in the gaps.
Thanks for the clarification! I would point to this recent post on a similar topic to the last thing you said.