This is a Forum Team crosspost from Substack.
Whither cause prioritization and connection with the good?
There’s a trend towards people who once identified as Effective Altruists now identifying solely as “people working on AI safety.”[1] For those in the loop, it feels like less of a trend and more of a tidal wave. There’s an increasing sense that among the most prominent (formerly?) EA orgs and individuals, making AGI go well is functionally all that matters. For that end, so the trend goes, the ideas of Effective Altruism have exhausted their usefulness. They pointed us to the right problem – thanks; we’ll take it from here. And taking it from here means building organizations, talent bases, and political alliances at a scale incommensurate with attachment to a niche ideology or moralizing language generally. I think this is a dangerous path to go down too hard, and my impression is that EAs are going down it quite hard.
I’ll acknowledge right off the bat that it is emphatically not the case that everyone doing some version of rebranding themselves to AI safety is making a mistake at all, or even harming the ideas of EA on balance. A huge EA insight relative to other moral schools of thought is that excessive purity about particular means and motivations often comes at too great a cost to good outcomes in the world, and you should be willing to trade these off at least somewhat. It is definitely the case that specific labels, ideologies, and communities bring baggage that makes building broad alliances unnecessarily difficult, and not everyone working on AI safety or any other EA-inspired cause should feel obligated to foreground their inspiration and the intellectual history that led them to do what they’re doing.
A central point of my previous post on roughly this topic is that people have crossed the line of merely not-foregrounding EA towards things that look more like active disparagement. That seems like a straightforward mistake from many perspectives. Smart people will draw a line from Effective Altruism to specific perspectives on AI safety and associate one with the other. If you disparage EA, you disparage the specific parts of AI safety you, as part of the EA progeny, are supposed to care most about.
The worry I want to express here is sort of the inverse: if you glorify some relatively-value-neutral conception of AI safety as the summum bonum of what is or used to be EA, there is just a good chance that you will lose the plot and end up not pursuing the actual highest good, the good itself.
What I see
The [fictionalized] impetus for writing this post came from going to a retreat held by what used to be a local EA group that had rebranded to a neutral name while keeping the same basic cause portfolio. The retreat was about 25 people, and I’d guess only 3-4 were vegan/vegetarian. Beyond that, when someone gave a lightning talk on earning to give to the presumably-relatively-high-context attendees, it seemed to go over like it might with a totally neutral, cold audience. Most nodded along, but didn’t engage; a few came up to ask typical newcomer questions (e.g., shouldn’t the government do this? what about loans?); and a few were mildly offended/felt put-upon.
For those whose EA retreat experience is mostly pre-2023 like me, both the numbers and reactions here are kind of shocking. I would have expected the retreat to be ~70% vegetarian and for most of the response to be hard-nosed questions about the most effective interventions, not “huh, so do you think any charities actually work?” As you might predict, almost all the rest of the retreat was split between technical AI safety and AI policy, with some lip service to biosecurity along the way.
Perhaps the clearest and most predictive embodiment of the trend is 80,000 Hours’ new strategic focus on AI. 80k was always fundamentally about providing thorough, practical cause/intervention prioritization and that exercise can be fairly regarded as the core of EA. They’re now effectively saying the analysis is done: doing the most good means steering AI development, so we’ll now focus only on the particulars of what to do in AI. Thanks, we'll take it from here indeed.
Now, even though it’d be easy to frame these moves as reacting to external evidence – perhaps laudably noticing the acceleration of AI capabilities, and perhaps less laudably wanting to cut ties with the past after FTX – one claim is that this is a turn towards greater honesty and transparency with audiences. To some degree, it has always been the case that AI career changes have been the primary measure of success of EA commun– ahem– field-building programs and now we’re just being clearer about what we want and hope for from participants.
This response seems question-begging in this context. Do we want people to work on AI safety or do we want them to do the most good, all things considered? Arguably, we genuinely wanted the latter, so the process mattered here. Maybe someone’s personal fit and drive for animals really did make that the better overall outcome. Maybe we were wrong about some key assumption in the moral calculus of AI safety and would welcome being set straight.
Even putting the question begging concern to the side, exactly what people end up doing within “AI safety” matters enormously from the EA perspective. Don’t you remember all the years, up to and including the present, where it was hard to know whether someone really meant what we thought (or hoped) they did when they said “AI safety?” We actually care about the overall moral value of the long run future. Making AI less racist or preventing its use in petty scams doesn’t really cut it in those terms.
Some reduce the problem to AI-not-kill-everyone-ism, which seems straightforward enough and directed at the most robust source of value here, but I notice people in more sophisticated (and successful) orgs are skittish about parsing things in those terms, lest they turn off the most talented potential contributors and collaborators.
Even this assumes, however, that the problem and its dimensions are and will remain simple enough to communicate in principle without needing to delve into any philosophy or moralizing about the kind of future we want. The obviously-biggest bads will be obvious, and so too the obviously-biggest goods. Thank goodness that our new, most highly capable contributors won’t need to know the ins and outs of our end goals in order to drive progress towards them; they’d be a lot harder to recruit otherwise.
The threat means pose to ends
And this strategy spawns things like the BlueDot curriculum, whose most digestible summary reading on risks from AI covers discrimination, copyright infringement, worker exploitation, invasions of privacy, reduced social connection, and autonomous vehicle malfunctions before touching on what I might call “real risks.”[2] It might not be so bad if this were all just due diligence to cast the widest possible net before participants, in the course itself, went on to compare the seriousness of these risks. But on multiple occasions, I’ve had the sad experience of speaking to someone who had completed the course who seemed to not even have an awareness of existential risks as a concern.
I understand the temptation. The people I spoke to in this context were very impressive on paper. So you give them the course they want to take and maybe they get excited about doing work at an org you think is doing great and important work on AI. Once they’re there, they’ll catch on and see what’s up, or at least enough of them will do that to make this all worthwhile.
Well, then there’s the orgs. They’re also taking more and more steps to garner conventional credibility by working on more mundane and lower stakes questions than those aimed squarely at value. And it’s working. For those in the know, it’s hard to deny these EA-founded orgs are getting more prominent: better talent, more connections, more influence. A lot of it is a traceable consequence of moderating. The plan is that once there are clearer levers to pull to reduce existential risk (and I agree there aren’t really hugely ripe policy opportunities or ideas for this now), they’ll be in a great position to pull them.
Perhaps you see the worry. Compromise your goals now, pander to your constituents now, and later you’ll be able to cash it all in for what you really care about. The story of every politician ever. Begin as a young idealist, start making compromises, end up voting to add another $5 trillion to the debt because even though you’re retiring next term, you’d hate not to be a team player when these midterms are going to be so. close.
This isn’t just a problem for politics and public-facing projects. It’s a deep weakness of the human condition. People will often decide that some particular partner, or house, or car, or job, or number of kids will make them happy. So they fixate on whatever specific instrument of happiness they chose, and after enough time goes by, they fully lose their original vision in the day-to-day onrush of things involved in pursuing the instrument. It’s much easier to simply become the mask you put on to achieve your goals than it is to always remind yourself it’s a mask. In competitive, high-stakes, often zero-sum arenas like policy, it is even harder to pay the tax of maintaining awareness of your mask, lest you fall behind or out of favor, and this is exactly the situation I see AI safety orgs headed towards.
All the same, I don’t think we’re at a point of crisis. None of these tradeoffs seem too dumb at the moment (with some exceptions) and I generally trust EAs to be able to pull off this move more than most. But we’re not setting ourselves up well to escape this trap when we consciously run away from our values and our roots. Likewise when we don't acknowledge or celebrate people doing the hard work of reflecting directly on what matters in the present. This all corresponds too neatly to the hollowing out of principles-first EA community building, either from 80k or from local groups or university groups converting to AI safety or, tellingly, “AI Security” programs.
The social signals are also powerful. The serious, important people no longer dabble in cause prioritization, obsess about personal donations, or debate population ethics. They build fancy think tanks staffed with Ivy Leaguers and take meetings with important people in government and tech about the hot AI topic of the month. And so community builders take their cues and skip as much of the background ethics, assumptions, and world modeling as they can to get their audiences looking and acting like the big people as fast as possible.
Again, the fanciness and the meetings are good moves and have a lot of value, but if the people executing them never show up to EAG or speak to a university group about the fundamentals, when are they reflecting on those? Even back when they did do that, was it all so clear and resolved that it’d be easy to pick up again in 5 years when you need it? And what will the composition of all your new collaborators be by then? Will they have done any of this reflection or even be on board for the maybe-unpopular actions it recommends?
Losing something more
Beyond possibly falling into a classic trap of making your instrumental goals the enemy of your terminal goals, motivations and reflection just matter a lot for their own sake. If you don’t check in on yourself and your first principles, you’re at serious risk of getting lost both epistemically and morally. When you make arguments aimed at giving you power and influence, the next tradeoff you make is how much scrutiny to give instrumentally useful arguments, hires, and projects.
Another byproduct of checking in from first principles is who and what it connects you with. Everyone knows the vegans are the good guys. You should regard feeling alien and disconnected from them as a warning sign that you might not be aiming squarely at the good. And the specifics of factory farming feel particularly clarifying here. Even strong-identity vegans push the horrors of factory farming out of their heads most of the time for lack of ability to bear it. It strikes me as good epistemic practice for someone claiming that their project most helps the world to periodically stare these real-and-certain horrors in the face and explain why their project matters more – I suspect it cuts away a lot of the more speculative arguments and clarifies various fuzzy assumptions underlying AI safety work to have to weigh it up against something so visceral. It also forces you to be less ambiguous about how your AI project cashes out in reduced existential risk or something equivalently important. Economizing on the regulatory burden faced by downstream developers? Come now, is that the balance in which the lightcone hangs?
Then there is the burden of disease for humans. The thing I suspect brought most now-AI-safety people into the broader ecosystem. Mothers burying their children. The amount of money that you would personally sacrifice to stop it or some equivalent nightmare. Both this problem and this mode of thinking about tradeoffs are greatly if not wholly deemphasized in circles where they were once the cornerstones of how to think about your potential effects on the world. Sure, you don’t want to miss out on a phenomenal safety engineer because you offered too small a salary or set too strong an example of personal poverty, but is there really no place for this discourse nearby to you? Is this something you want distance from?
The confession I’ll make at this especially-moralizing juncture is that, ironically, I am a bad EA for basically the opposite reason that the AI safety identitarians are bad EAs. They care so much about putting every last chip down on the highest-marginal-EV bet that they risk losing themselves. I wallow in my conservatism and abstraction because I care more about the idea of EA than impact itself. That – my part – is really not what it’s supposed to be about.
You, reader, are not doomed to fall into one or the other of these traps though. There are people like Joe, or Benjamin, or Rohin, or Neel who do very impressive and important work on AI safety that is aimed where the value lies, but to my eye they also keep in touch with their moral compasses, with the urgency of animal and human suffering, and with the centrality of goodness itself. As individuals, I don’t think any of them disparage or even belittle by implication the practice of doing serious cross-cause prioritization.
Obviously, this is easier to do as an individual than as an organization. There’s clearly value to an organization making itself more legibly open to a broader range of partners and contributors. But as with all things, influence flows both ways. Your organization’s instrumental goals can rub off on you and how you orient yourself towards your life and work. Your terminal goals can be a north star for the shape your projects and initiatives take, even if there are hard tradeoffs to be made along the way. I worry that the people who care most about doing good in the world are being tempted by the former and becoming increasingly blind to the latter. I worry it’s being socially reinforced by people with weaker moral compasses who haven’t really noticed it’s a problem. I want both groups to notice and each of us individually to be the people we actually want to be.
[1] I would say “AI safety advocates,” but as will become clear, “advocacy” connotes some amount of moralizing, and moralizing is the thing from which people are flinching.
[2] I pick on BlueDot because they’re the most public and legible, but I’ve seen even worse and more obfuscated curriculums on these terms from groups aiming at something very different than what their courses suggest.
Hey Matt,
(Context: I run the 80k web programme.)
Well put. And I agree that there are some concerning signs in this direction (though I've also had countervailing, inspiring experiences of AIS-focused people questioning whether some prevailing view about what to do in AIS is actually best for the world.)
I'd also love to see more cause prioritisation research. And it's gonna be hard to both stay open enough to changing our minds about how to make the world better & to pursue our chosen means with enough focus to be effective. I think this challenge is fairly central to EA.
On 80k's strategic shift:
You wrote:
How do we see the relationship between focusing on helping AGI go well and doing the most good?
It has always been the case that people and organisations need to find some intermediary outcome that comes before the good to point at strategically, some proxy for impact. Strategy is always about figuring out what's gonna be the biggest/most cost-effective causal factor for that (i.e. means), & therefore the best proxy to pursue.
We used to focus on career changes not necessarily inside one specific cause area but it was still a proxy for the good. Now our proxy for the good is helping people work on making AGI go well, but our relationship to the good is the same as it was before: trying our best to point at it, trying to figure out the best means for doing so.
EA values & ideas are still a really important part of the strategy.
We wrote this in our post on the shift:
Though one might understandably worry that that was just paying lip service to reassure people. Let me talk about some recent internal goings-on off the top of my head, which hopefully do something to show we mean it:
1. Our internal doc on web programme strategy (i.e. the strategy for the programme I run) currently says that in order for our audience to actually have much more impact with their careers, engagement with the site ideally causes movement along at least 3[1] dimensions:
This makes re-designing the user flow post-strategic-shift a difficult balancing act/full of tradeoffs. How do we both quickly introduce people to AI being a big deal & urgent, and communicate EA ideas, plus help people shift their careers? Which do we do first?
We're going to lose some simplicity (and some people, who don't want to hear it) trying to do all this, and it will be reflected in the site being more complex than a strategy like "maximize for engagement or respectability" or "maximize for getting one idea across effectively" would recommend.
My view is that it's worth it, because there is a danger of people just jumping into jobs that have "AI" or even "AI security/safety" in the name, without grappling with tough questions around what it actually means to help AGI go well or prioritising between options based on expected impact.
(On the term "EA mindset" -- it's really just a nickname; the thing I think we should care about is the focus on impact/use of the ideas.)
2. Our CEO (Niel Bowerman) recently spent several weeks with his top proactive priority being to help figure out the top priorities within making AGI go well – i.e. which is more pressing (in the sense of where can additional talented people do the most marginal good) between issues like AI-enabled human coups, getting things right with rights and welfare of digital minds, and catastrophic misalignment. We argued about questions like "how big is the spread between issues within making AGI go well?" and "to what extent is AI rights and welfare an issue humanity has to get right before AI becomes incredibly powerful, due to potential lock-in effects of bad discourse or policies?"
So, we agree with this:
In other words, the analysis, as you say, is not done. It's gonna be hecka hard to figure out "the particulars of what to do with AI." And we do not "have it from here" – we need people thinking critically about this going forward so they stand the best chance of actually helping AGI go well, rather than just having a career in "something something AI."
[1](I'm currently debating whether we should add a 4th: tactical sophistication about AI.)
I appreciate the dilemma and don't want to imply this is an easy call.
For me the central question in all of this is whether you foreground process (EA) or conclusion (AGI go well). It seems like the whole space is uniformly rushing to foreground the conclusion. It's especially costly when 80k – the paragon of process discourse – decides to foreground the conclusion too. Who's left as a source of wisdom foregrounding process?
I know you're trying to do both. I guess you can call me pessimistic that even you (amazing Arden, my total fav) can pull it off.
Inspired by the last section of this post (and by a later comment from Mjreard), I thought it’d be fun—and maybe helpful—to taxonomize the ways in which mission or value drift can arise out of the instrumental goal of pursuing influence/reach/status/allies:
Epistemic status: caricaturing things somewhat
Never turning back the wheel
In this failure mode, you never lose sight of how x-risk reduction is your terminal goal. However, in your two-step plan of ‘gain influence, then deploy that influence to reduce x-risk,’ you wait too long to move onto step two, and never get around to actually reducing x-risk. There is always more influence to acquire, and you can never be sure that ASI is only a couple of years away, so you never get around to saying, ‘Okay, time to shelve this influence-seeking and refocus on reducing x-risk.’ What in retrospect becomes known as crunch time comes and goes, and you lose your window of opportunity to put your influence to good use.
Classic murder-Gandhi
Scott Alexander (2012) tells the tale of murder-Gandhi:
The parallel here is that you can ‘take the pill’ to gain some influence, at the cost of focusing a bit less on x-risk. Unfortunately, like Gandhi, once you start taking pills, you can’t stop—your values change and you care less and less about x-risk until you’ve slid all the way down the slope.
It could be your personal values that change: as you spend more time gaining influence amongst policy folks (say), you start to genuinely believe that unemployment is as important as x-risk, and that beating China is the ultimate goal.
Or, it could be your organisation’s values that change: You hire some folks for their expertise and connections outside of EA. These new hires affect your org’s culture. The effect is only slight, at first, but a couple of positive feedback cycles go by (wherein, e.g., your most x-risk-focused staff notice the shift, don’t like it, and leave). Before you know it, your org has gained the reach to impact x-risk, but lost the inclination to do so, and you don’t have enough control to change things back.
Social status misgeneralization
You and I, as humans, are hardwired to care about status. We often behave in ways that are about gaining status, whether we admit this to ourselves consciously or not. Fortunately, when surrounded by EAs, pursuing status is a great proxy for reducing x-risk: it is high status in EA to be a frugal, principled, scout mindset-ish x-risk reducer.
Unfortunately, now that we’re expanding our reach, our social circles don’t offer the same proxy. Now, pursuing status means making big, prestigious-looking moves in the world (and making big moves in AI means building better products or addressing hot-button issues, like discrimination). It is not high status in the wider world to be an x-risk reducer, and so we stop being x-risk reducers.
I have no real idea which of these failure modes is most common, although I speculate that it’s the last one. (I’d be keen to hear others’ takes.) Also, to be clear, I don’t believe the correct solution is to ‘stay small’ and avoid interfacing with the wider world. However, I do believe that these failure modes are easier to fall into than one might naively expect, and I hope that a better awareness of them might help us circumvent them.
Great write up. I think all three are in play and unfortunately kind of mutually reinforcing, though I'm more agnostic about how much of each.
I'm a vegan existential AI safety researcher. I once identified as EA, now as EA-adjacent. So, superficially, I'm part of the problem you describe. However, my reasons for not identifying as EA anymore have nothing to do with FTX or other PR concerns. It's not a "mask". I just have philosophical disagreements with EA, coming out of my own personal growth, that seem sufficiently significant to be acknowledged.
To be clear, I'm very grateful to EA donors and orgs for supporting my research. I think that both EAs in AI safety and EAs more broadly are doing tonnes of good, for which they genuinely deserve my and most of everyone's gratitude and praise.
At the same time, it's a perfectly legitimate personal choice to not identify as EA. Moreover, the case for the importance of AI X-safety doesn't rest on EA assumptions (some of which I reject), but is defensible much more broadly. And, there is no reason that every individual or organization working on AI X-safety must identify as EA or recruit only EA-aligned personnel. Even if they have history with EA or funding from EA etc.
Let's keep cooperating and accomplishing great things, but let's also acknowledge each other's right to ideological pluralism.
Thanks Vanessa, I completely agree on the meta level. No one owes "EA" any allegiance because they might have benefitted from it in the past or benefitted from its intellectual progeny, and people are of course generally entitled to change their minds and endorse new premises.
Your comment *is a very meta comment though* and leaves open the possibility that you're post hoc rationalizing following a trend that I see as starting with Claire Zabel's post "EA and Longtermism, not Cruxes for Saving the World," which is pretty paradigmatic of "the particular ideas that got us here (AI X-safety) no longer [are/feel] necessary, and seem inconvenient to where we are now in some ways, so let's dispense with them."
There could be fine object-level reasons for changing your mind on which premises matter of course and I'm extremely interested to hear those. In the absence of those object-level reasons though, I worry!
I'm still trying to direct the non-selfish part of myself towards scope-sensitive welfarism in a rationalisty way. For me that's EA. Others, including maybe you, seem to construe it as something narrower than that, and I wonder both what that narrow conception is and whether it's fair to the public meaning of the term "Effective Altruism."
I intentionally stayed meta because I didn't especially want to start an argument about EA premises. Concretely, my disagreements with EA are that I don't believe in any of:
I view improving the world as an enterprise of collective rationality / cooperation, not a moral imperative (I don't believe in moral imperatives). I care much more about the people (and other creatures) closer to me in the social graph, but I also want to cooperate with other people for mutual gain, and in particular endorse/promote social norms that create incentives beneficial for most of everyone (e.g. reward people for helping others / improving the world).
Why I changed some of my views in this particular direction is a long story, but it involved a lot of reflection and thinking about my preferences on different levels of abstraction (from "how do I feel about such-and-such particular situation" to "what could an abstract mathematical formalization of my preferences look like").
I am in a similar boat as you. I don't feel comfortable being identity-EA because I have some core philosophical disagreements.
However, I have been inspired by EA to the point of making some substantive life changes, and participate in my local EA group. I try to do things that are convincing enough for their own sake, even though I do not necessarily agree with all the premises.
I believe there is value to participation in the whatever-ist party, even if you are not comfortable calling yourself a whatever-ist – not because of ideological purity, but because it doesn't even feel true.
Given the timelines that are most popular these days, there will have to be a reckoning by the end of this decade, one way or the other.
I definitely sympathize, though I'd phrase things differently.
As I've noted before, I think much of the cause is just that the community incentives very much come from the funding. And right now, we only have a few funders, and those funders are much more focused on AI Safety specifics than they are on things like rationality/epistemics/morality. I think these people are generally convinced on specific AI Safety topics and unconvinced by a lot of more exploratory / foundational work.
For example, this is fairly clear at OP. Their team focused on "EA" is formally called "GCR Capacity Building." The obvious goal is to "get people into GCR jobs."
You mention a frustration about 80k. But 80k is getting a huge amount of their funding from OP, so it makes sense to me that they're doing the sorts of things that OP would like.
Personally, I'd like to see more donations come from community members, to be aimed at community things. I feel that the EA scene has really failed here, but I'm hopeful there could be changes.
I don't mean to bash OP / SFF / others. I think they're doing reasonable things given their worldviews, and overall I think they're both very positive. I'm just pointing out that they represent about all the main funding we have, and that they just aren't focused on the EA things some community members care about.
Right now, I think that EA is in a very weak position. There just aren't that many people willing to put in time or money to push forward the key EA programs and mission, other than using it as a way to get somewhat narrow GCR goals.
Or, in your terms, I think that almost no one is actually funding the "Soul" of EA, including the proverbial EA community.
I just have to call out the amazing work by Rethink Priorities and those that funded this sequence of analyses (not sure who that is, would welcome info!): https://forum.effectivealtruism.org/s/WdL3LE5LHvTwWmyqj
I guess this might be the "last, properly funded EA analysis" unless something came out after that which I missed ("last" in that, going forward, it seems funders are doubling down on AI and might not rethink this decision in the near future)? The takeaway from this work by Rethink Priorities, for me, is that it is not at all unreasonable to focus on things other than AI, as going all in on AI seemed to require a set of quite extreme beliefs/assumptions. I would be happy to be corrected if my simple takeaway is overly naive.
What are the "extreme beliefs" you have in mind?
To me, it's cubic or faster value increases, and the assumption that we will mostly have a future with very low risk – that it is only now, or during a few periods, that risk will be extremely high. In a sense, I see these assumptions in tension, as high value is often accompanied by high risk. I was also just made aware that even sending digital beings to far-away galaxies looks extremely expensive energy-wise, even if one keeps only the minimum power requirement during a multi-year journey between solar systems. In essence, I feel that to justify these assumptions one would have to really look into what they materially mean, and use historical precedent and reasonable analysis across a wide range of scenarios to see whether they make sense. For me this is more intuition, plus a scepticism that enough work has been done to get certainty about these assumptions. To some degree, I also feel like AI safety was a direction where funders might get more of a feeling of "doing something" - something I have been at fault for myself. Chipping away at the stubborn problems of poverty/global health or animal welfare is likely to leave them "unsolved" even with billions more invested. Moreover, they lack novelty, and these "industries" are less prone to being affected, while AI is new and one can see more systemic effects. Maybe this last point actually drives at something supporting AI safety - it might be more tractable in a sense. Sorry this was long and not underpinned by much analysis, so I would welcome any analysis on these points, especially analysis that might change my mind.
I do think one issue people may be underrating is that we might just not bother with space colonization, if the distances and costs mean that no one on Earth will ever see significant material gain from it.
I think that given a few generations of expansion to different stars in all directions, it is not implausible (i.e. at least 25% chance) that X-risk becomes extremely low (i.e. under 1 in 100,000 per century, once there are say, 60 colonies with expansion plans, and a lot less once there are 1000 colonies.) After all, we've already survived a million years, and most X-risks not from AI seem mostly to apply to single planet civilizations, plus the lightspeed barrier makes it hard for a risk to reach everywhere at once. But I think I agree that thinking through this stuff is very, very hard, and I'm sympathetic to David Thorstad's claim that if we keep finding ways current estimates of the value of X-risk reduction could be wildly wrong, at some point we should just lose trust in current estimates (see here for Thorstad making the claim: https://reflectivealtruism.com/2023/11/03/mistakes-in-the-moral-mathematics-of-existential-risk-part-5-implications/), even though I am a lot less confident than Thorstad is that very low future per year risk is an "extreme" assumption.
It is disturbing to me how much Thorstad's work on this stuff seems to have been ignored by leading orgs; it is very serious work criticizing key assumptions that they base their decisions on, even if I personally think he tends to push points in his favour a bit far. I assume the same is true for the Rethink report you cite, although it is long and complicated enough, unlike Thorstad's short blog posts, that I haven't read any of it.
I agree with all of this.
My wish here is that specific people running orgs and projects were made of tougher stuff re following funding incentives. For example, it doesn't seem like your project is at serious risk of defunding if you're 20-30% more explicit about the risks you care about or what personally motivates you to do this work.
There are probably only about 200 people on Earth with the context x competence for OP to enthusiastically fund for leading on this work – they have bargaining power to frame their projects differently. Yet on this telling, they bow to incentives to be the very-most-shining star by OP's standard, so they can scale up and get more funding. I would just make the trade off the other way: be smaller and more focused on things that matter.
I think social feedback loops might bend back around to OP as well if they had fewer options. Indeed, this might have been the case before FTX. The point of the piece is that I see the inverse happening, I just might be more agnostic about whether the source is OP or specific project leaders. Either or both can correct if they buy my story.
I see it a bit differently.
> For example, it doesn't seem like your project is at serious risk of defunding if you're 20-30% more explicit about the risks you care about or what personally motivates you to do this work.
I suspect that most nonprofit leaders feel a great deal of funding insecurity. There's always neat new initiatives that a group would love to expand to, and also, managers hate the risk of potentially needing to fire employees. They're often thinking about funding on the margins - either they are nervous about firing a few employees, or they are hoping to expand to new areas.
> There are probably only about 200 people on Earth with the context x competence for OP to enthusiastically fund for leading on this work
I think there's more competition. OP covers a lot of ground. I could easily see them just allocating a bit more money to human welfare later on, for example.
> My wish here is that specific people running orgs and projects were made of tougher stuff re following funding incentives.
I think that the issue of incentives runs deeper than this. It's not just a matter of leaders straightforwardly understanding the incentives and acting accordingly. It's also that people will start believing things that are convenient to said incentives, that leaders will be chosen who seem to be good fits for the funding situation, and so on. The people who really believe in other goals often get frustrated and leave.
I'd guess that the leaders of these orgs feel more aligned with the OP agenda than with the agenda you outline, for instance.
Agree on most of this too. I wrote too categorically about the risk of "defunding." You will be on a shorter leash if you take your 20-30% independent-view discount. I was mostly saying that funding wouldn't go to zero and crash your org.
I further agree on cognitive dissonance + selection effects.
Maybe the main disagreement is over whether OP is ~a fixed monolith. I know people there. They're quite EA in my accounting; much like I think of many leaders at grantees. There's room in these joints. I think current trends are driven by "deference to the vibe" on both sides of the grant-making arrangement. Everyone perceives plain speaking about values and motivations as cringe and counterproductive, and it thereby becomes the reality.
I'm sure org leaders and I have disagreements along these lines, but I think they'd also concede they're doing some substantial amount of deliberate deemphasis of what they regard as their terminal goals in service of something more instrumental. They do probably disagree with me that it is best all-things-considered to undo this, but I wrote the post to convince them!
For what it’s worth, I find some of what’s said in this thread quite surprising.
Reading your post, I saw you describing two dynamics:
I understood @Ozzie’s first comment on funding to be about 1. But then your subsequent discussion with Ozzie seems to also point to funding as explaining 2.[1]
While Open Phil has opinions within AI safety that have alienated some EAs—e.g., heavy emphasis on pure ML work[2]—my impression was that they are very much motivated by ‘real,’ x-risk-focused AI safety concerns, rather than things like discrimination and copyright infringement. But it sounds like you might actually think that OP-funded AI safety orgs are feeling pressure from OP to be less about x-risk? If so, this is a major update for me, and one that fills me with pessimism.
For example, you say, “[OP-funded orgs] bow to incentives to be the very-most-shining star by OP’s standard, so they can scale up and get more funding. I would just make the trade off the other way: be smaller and more focused on things that matter.”
At the expense of, e.g., more philosophical approaches
I think OP and grantees are synced up on xrisk (or at least GCRs) being the terminal goal. My issue is that their instrumental goals seem to involve a lot of deemphasizing that focus to expand reach/influence/status/number of allies in ways that I worry lend themselves to mission/value drift.
Yea, I broadly agree with Mjreard here.
The BlueDot example seems different to what I was pointing at.
I would flag that lack of EA funding power sometimes makes xrisk less of a focus.
Like, some groups might not trust that OP/SFF will continue to support them, and then do whatever they think they need to in order to attract other money - and this often is at odds with xrisk prioritization.
(I clearly see this as an issue with the broader world, not with OP/SFF)
I strongly agree with this part:
I think it's quite hard to watch slaughterhouse footage and then feel happy doing something where you haven't, like, tried hard to make sure it's among the most morally important things you could be doing.
I'm not saying everyone should have to do this — vegan circles have litigated this debate a billion times — but if you feel like you might be in the position Matt describes, watch Earthlings or Dominion or Land of Hope and Glory.
This is one of my favorite posts this year, as cooling as it is. You've articulated better than I could some of the discomforts I have. Great job.
I'm more optimistic that people who showed up because they wanted to do the most good still believe in it. Even time spent with "EA-adjacent-adjacent-etc." people is refreshing compared to most of my work in policy, including on behalf of EA organizations.
Community groups, EAGs, and other events still create the space for first principles discussions that you're talking about. As far as I know, those spaces are growing. Even if they weren't, I can't remember a GCR-focused event that served non-vegan food, including those without formal EA affiliations. It's a small gesture, but I think it's relevant to some of the points made in this post.
I understand picking on BlueDot because they started as free courses designed for EAs who wanted to learn more about specific cause areas (AI safety, biosecurity, and alternative proteins, if I remember correctly). They are now much larger, focused exclusively on AI, and have a target audience that goes beyond EA and may not know much about GCRs coming into the course. The tradeoffs they make to cast that wider net are between them and their funders, and don't necessarily speak to the values of the people running the course.
I’m still figuring out how I feel about all of this, and I’m not yet sure how it should change my behaviour. But I wanted to say that this post has stayed with me since I read it – thank you for writing it.
I also find it quietly reassuring that Mjreard, of all people, is engaging with this topic; someone I think of as unsentimental, focused on what truly matters, and a sharp critic of ideas. I’m not sure I entirely endorse the comfort I take from that, but it does make it feel more socially acceptable to be concerned about these shifts too.
I'm focused on AI safety, so I'm glad to see so many folk pivot to AIS, but I am simultaneously worried that so much top talent being sucked into AIS will undermine many EA communities. Similarly, I do have a slight degree of concern about EA memes being important for maintaining our epistemics, but I haven't seen enough effects yet to warrant a higher level of concern.
Executive summary: This personal reflection argues that many prominent Effective Altruists are abandoning EA principles as they rebrand themselves solely as "AI safety" workers, risking the loss of their original moral compass and the broader altruistic vision that initially motivated the movement.
Key points:
This comment was auto-generated by the EA Forum Team. Feel free to point out issues with this summary by replying to the comment, and contact us if you have feedback.
I was nodding along until I got to here:
By any normal definition of 'robust', I think this is the opposite of true. The arguments for AI extinction are highly speculative. By contrast, the arguments that increasingly versatile AI destabilises the global economy and/or military are far more credible. Many jobs already seem to have been lost to contemporary AI, and OpenAI has already signed a deal with autonomous arms dealer Anduril.
I think it's not hard to imagine worlds where even relatively minor societal catastrophes significantly increase existential risk, as I've written about elsewhere, and AI credibly (though I don't think obviously) makes these more likely.
So while I certainly wouldn't advocate the EA movement pivoting toward soft AI risk or even giving up on extinction risk entirely, I don't see anything virtuous in leaning too heavily into the latter.
If your AI work doesn't ground out in reducing the risk of extinction, I think animal welfare work quickly becomes more impactful than anything in AI. Xrisk reduction can be through more indirect channels, of course, though indirectness generally increases the speculativeness of the xrisk story.
There are many ways to reduce existential risk. I don't see any good reason to think that reducing small chances of extinction events is better EV than reducing higher chances of smaller catastrophes, or even just building human capacity in a preferentially non-destructive way. The arguments that we should focus on extinction have always boiled down to 'it's simpler to think about'.
It would probably be good if AI Safety orgs explicitly and prominently endorsed the CAIS statement on AI Risk:
"Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war."
What does this mean? Is the author of this post, Matt Reardon, on the EA Forum team? Or did a moderator/admin of the EA Forum crosspost this from Matt Reardon's Substack, under Matt's EA Forum profile?
Admin posted under my name after asking permission. It's cool they have a system for accommodating people like me who are lazy in this very specific way
Why not form an informal group within EA which shares the values and views that you outlined in your post?
"For those whose EA retreat experience is mostly pre-2023 like me, both the numbers and reactions here are kind of shocking. I would have expected the retreat to be ~70% vegetarian and for most of the response to be hard-nosed questions about the most effective interventions, not “huh, so do you think any charities actually work?”