This is a Forum Team crosspost from Substack.
Whither cause prioritization and connection with the good?
There’s a trend towards people who once identified as Effective Altruists now identifying solely as “people working on AI safety.”[1] For those in the loop, it feels like less of a trend and more of a tidal wave. There’s an increasing sense that among the most prominent (formerly?) EA orgs and individuals, making AGI go well is functionally all that matters. Toward that end, so the trend goes, the ideas of Effective Altruism have exhausted their usefulness. They pointed us to the right problem – thanks; we’ll take it from here. And taking it from here means building organizations, talent bases, and political alliances at a scale incommensurate with attachment to a niche ideology or moralizing language generally. I think this is a dangerous path to go down too hard, and my impression is that EAs are going down it quite hard.
I’ll acknowledge right off the bat that it is emphatically not the case that everyone doing some version of rebranding themselves to AI safety is making a mistake at all, or even harming the ideas of EA on balance. A huge EA insight relative to other moral schools of thought is that excessive purity about particular means and motivations often comes at too great a cost to good outcomes in the world, and you should be willing to trade these off at least somewhat. It is definitely the case that specific labels, ideologies, and communities bring with them baggage that makes building broad alliances unnecessarily difficult, and not everyone working on AI safety or any other EA-inspired cause should feel obligated to foreground their inspiration and the intellectual history that led them to do what they're doing.
A central point of my previous post on roughly this topic is that people have crossed the line of merely not-foregrounding EA towards things that look more like active disparagement. That seems like a straightforward mistake from many perspectives. Smart people will draw a line from Effective Altruism to specific perspectives on AI safety and associate one with the other. If you disparage EA, you disparage the specific parts of AI safety you, as part of the EA progeny, are supposed to care most about.
The worry I want to express here is sort of the inverse: if you glorify some relatively-value-neutral conception of AI safety as the summum bonum of what is or used to be EA, there is just a good chance that you will lose the plot and end up not pursuing the actual highest good, the good itself.
What I see
The [fictionalized] impetus for writing this post came from going to a retreat held by what used to be a local EA group that had rebranded to a neutral name while keeping the same basic cause portfolio. The retreat was about 25 people, and I’d guess only 3-4 were vegan/vegetarian. Beyond that, when someone gave a lightning talk on earning to give to the presumably-relatively-high-context attendees, it seemed to go over like it might with a totally neutral, cold audience. Most nodded along but didn’t engage; a few came up to ask the typical cold-audience questions (shouldn’t the government do this? what about loans?); and a few were mildly offended or felt put-upon.
For those, like me, whose EA retreat experience is mostly pre-2023, both the numbers and reactions here are kind of shocking. I would have expected the retreat to be ~70% vegetarian and for most of the response to be hard-nosed questions about the most effective interventions, not “huh, so do you think any charities actually work?” As you might predict, almost all the rest of the retreat was split between technical AI safety and AI policy, with some lip service to biosecurity along the way.
Perhaps the clearest and most predictive embodiment of the trend is 80,000 Hours’ new strategic focus on AI. 80k was always fundamentally about providing thorough, practical cause/intervention prioritization and that exercise can be fairly regarded as the core of EA. They’re now effectively saying the analysis is done: doing the most good means steering AI development, so we’ll now focus only on the particulars of what to do in AI. Thanks, we'll take it from here indeed.
Now, even though it’d be easy to frame these moves as reacting to external evidence – perhaps laudably noticing the acceleration of AI capabilities, and perhaps less laudably wanting to cut ties with the past after FTX – one claim is that this is a turn towards greater honesty and transparency with audiences. To some degree, it has always been the case that AI career changes have been the primary measure of success of EA commun– ahem– field-building programs and now we’re just being clearer about what we want and hope for from participants.
This response seems question-begging in this context. Do we want people to work on AI safety or do we want them to do the most good, all things considered? Arguably, we genuinely wanted the latter, so the process mattered here. Maybe someone’s personal fit and drive for animals really did make that the better overall outcome. Maybe we were wrong about some key assumption in the moral calculus of AI safety and would welcome being set straight.
Even putting the question-begging concern to the side, exactly what people end up doing within “AI safety” matters enormously from the EA perspective. Don’t you remember all the years, up to and including the present, when it was hard to know whether someone really meant what we thought (or hoped) they did when they said “AI safety”? We actually care about the overall moral value of the long-run future. Making AI less racist or preventing its use in petty scams doesn’t really cut it in those terms.
Some reduce the problem to AI-not-kill-everyone-ism, which seems straightforward enough and directed at the most robust source of value here, but I notice people in more sophisticated (and successful) orgs are skittish about parsing things in those terms, lest they turn off the most talented potential contributors and collaborators.
Even this assumes, however, that the problem and its dimensions are and will remain simple enough to communicate in principle without needing to delve into any philosophy or moralizing about the kind of future we want. The obviously-biggest bads will be obvious, and so too the obviously-biggest goods. Thank goodness that our new, most highly capable contributors won’t need to know the ins and outs of our end goals in order to drive progress towards them; they’d be a lot harder to recruit otherwise.
The threat means pose to ends
And this strategy spawns things like the BlueDot curriculum, whose most digestible summary reading on risks from AI covers discrimination, copyright infringement, worker exploitation, invasions of privacy, reduced social connection, and autonomous vehicle malfunctions before touching on what I might call “real risks.”[2] It might not be so bad if this were all just due diligence to cast the widest possible net before, in the course itself, participants would compare the seriousness of these risks. But on multiple occasions, I’ve had the sad experience of speaking to someone who had completed the course and seemed not even to be aware of existential risk as a concern.
I understand the temptation. The people I spoke to in this context were very impressive on paper. So you give them the course they want to take and maybe they get excited about doing work at an org you think is doing great and important work on AI. Once they’re there, they’ll catch on and see what’s up, or at least enough of them will do that to make this all worthwhile.
Well, then there’s the orgs. They’re also taking more and more steps to garner conventional credibility by working on more mundane and lower stakes questions than those aimed squarely at value. And it’s working. For those in the know, it’s hard to deny these EA-founded orgs are getting more prominent: better talent, more connections, more influence. A lot of it is a traceable consequence of moderating. The plan is that once there are clearer levers to pull to reduce existential risk (and I agree there aren’t really hugely ripe policy opportunities or ideas for this now), they’ll be in a great position to pull them.
Perhaps you see the worry. Compromise your goals now, pander to your constituents now, and later you’ll be able to cash it all in for what you really care about. The story of every politician ever. Begin as a young idealist, start making compromises, end up voting to add another $5 trillion to the debt because even though you’re retiring next term, you’d hate not to be a team player when these midterms are going to be so. close.
This isn’t just a problem for politics and public-facing projects. It’s a deep weakness of the human condition. People will often decide that some particular partner, or house, or car, or job, or number of kids will make them happy. So they fixate on whatever specific instrument of happiness they chose, and after enough time goes by, they fully lose their original vision in the day-to-day onrush of things involved in pursuing the instrument. It’s much easier to simply become the mask you put on to achieve your goals than it is to always remind yourself it’s a mask. In competitive, high-stakes, often zero-sum arenas like policy, it is even harder to pay the tax of maintaining awareness of your mask, lest you fall behind or out of favor, and this is exactly the situation I see AI safety orgs headed towards.
All the same, I don’t think we’re at a point of crisis. None of these tradeoffs seem too dumb at the moment (with some exceptions) and I generally trust EAs to be able to pull off this move more than most. But we’re not setting ourselves up well to escape this trap when we consciously run away from our values and our roots. Likewise when we don't acknowledge or celebrate people doing the hard work of reflecting directly on what matters in the present. This all corresponds too neatly to the hollowing out of principles-first EA community building, either from 80k or from local groups or university groups converting to AI safety or, tellingly, “AI Security” programs.
The social signals are also powerful. The serious, important people no longer dabble in cause prioritization, obsess about personal donations, or debate population ethics. They build fancy think tanks staffed with Ivy Leaguers and take meetings with important people in government and tech about the hot AI topic of the month. And so community builders take their cues and skip as much of the background ethics, assumptions, and world modeling as they can to get their audiences looking and acting like the big people as fast as possible.
Again, the fanciness and the meetings are good moves and have a lot of value, but if the people executing them never show up to EAG or speak to a university group about the fundamentals, when are they reflecting on those? Even back when they did do that, was it all so clear and resolved that it’d be easy to pick up again in 5 years when you need it? And what will the composition of all your new collaborators be by then? Will they have done any of this reflection or even be on board for the maybe-unpopular actions it recommends?
Losing something more
Beyond possibly falling into a classic trap of making your instrumental goals the enemy of your terminal goals, motivations and reflection just matter a lot for their own sake. If you don’t check in on yourself and your first principles, you’re at serious risk of getting lost both epistemically and morally. When you make arguments aimed at giving you power and influence, the next tradeoff you make is how much scrutiny to give instrumentally useful arguments, hires, and projects.
Another byproduct of checking in from first principles is who and what it connects you with. Everyone knows the vegans are the good guys. You should regard feeling alien and disconnected from them as a warning sign that you might not be aiming squarely at the good. And the specifics of factory farming feel particularly clarifying here. Even strong-identity vegans push the horrors of factory farming out of their heads most of the time for lack of ability to bear it. It strikes me as good epistemic practice for someone claiming that their project most helps the world to periodically stare these real-and-certain horrors in the face and explain why their project matters more – I suspect having to weigh it up against something so visceral cuts away a lot of the more speculative arguments and clarifies various fuzzy assumptions underlying AI safety work. It also forces you to be less ambiguous about how your AI project cashes out in reduced existential risk or something equivalently important. Economizing on the regulatory burden faced by downstream developers? Come now, is that the balance in which the lightcone hangs?
Then there is the burden of disease for humans. The thing I suspect brought most now-AI-safety people into the broader ecosystem. Mothers burying their children. The amount of money that you would personally sacrifice to stop it or some equivalent nightmare. Both this problem and this mode of thinking about tradeoffs are greatly if not wholly deemphasized in circles where they were once the cornerstones of how to think about your potential effects on the world. Sure, you don’t want to miss out on a phenomenal safety engineer because you offered too small a salary or set too strong an example of personal poverty, but is there really no place for this discourse nearby to you? Is this something you want distance from?
The confession I’ll make at this especially-moralizing juncture is that, ironically, I am a bad EA for basically the opposite reason that the AI safety identitarians are bad EAs. They care so much about putting every last chip down on the highest-marginal-EV bet that they risk losing themselves. I wallow in my conservatism and abstraction because I care more about the idea of EA than impact itself. That – my part – is really not what it’s supposed to be about.
You, reader, are not doomed to fall into one or the other of these traps though. There are people like Joe, or Benjamin, or Rohin, or Neel who do very impressive and important work on AI safety that is aimed where the value lies, but to my eye they also keep in touch with their moral compasses, with the urgency of animal and human suffering, and with the centrality of goodness itself. As individuals, I don’t think any of them disparage or even belittle by implication the practice of doing serious cross-cause prioritization.
Obviously, this is easier to do as an individual than as an organization. There’s clearly value to an organization making itself more legibly open to a broader range of partners and contributors. But as with all things, influence flows both ways. Your organization’s instrumental goals can rub off on you and how you orient yourself towards your life and work. Your terminal goals can be a north star for the shape your projects and initiatives take, even if there are hard tradeoffs to be made along the way. I worry that the people who care most about doing good in the world are being tempted by the former and becoming increasingly blind to the latter. I worry it’s being socially reinforced by people with weaker moral compasses who haven’t really noticed it’s a problem. I want both groups to notice and each of us individually to be the people we actually want to be.
[1] I would say “AI safety advocates,” but as will become clear, “advocacy” connotes some amount of moralizing and moralizing is the thing from which people are flinching.
[2] I pick on BlueDot because they’re the most public and legible, but I’ve seen even worse and more obfuscated curricula on these terms from groups aiming at something very different than what their courses suggest.
Hey Matt,
(Context: I run the 80k web programme.)
Well put. And I agree that there are some concerning signs in this direction (though I've also had countervailing, inspiring experiences of AIS-focused people questioning whether some prevailing view about what to do in AIS is actually best for the world).
I'd also love to see more cause prioritisation research. And it's gonna be hard to both stay open enough to changing our minds about how to make the world better & to pursue our chosen means with enough focus to be effective. I think this challenge is fairly central to EA.
On 80k's strategic shift:
You wrote:
How do we see the relationship between focusing on helping AGI go well and doing the most good?
It has always been the case that people and organisations need to find some intermediary outcome that comes before the good to point at strategically, some proxy for impact. Strategy is always about figuring out what's gonna be the biggest/most cost-effective causal factor for that (i.e. means), & therefore the best proxy to pursue.
We used to focus on career changes not necessarily inside one specific cause area but it was still a proxy for the good. Now our proxy for the good is helping people work on making AGI go well, but our relationship to the good is the same as it was before: trying our best to point at it, trying to figure out the best means for doing so.
EA values & ideas are still a really important part of the strategy.
We wrote this in our post on the shift:
Though one might understandably worry that that was just lip service, meant to reassure people. Let me talk about some recent internal goings-on off the top of my head, which hopefully do something to show we mean it:
1. Our internal doc on web programme strategy (i.e. the strategy for the programme I run) currently says that in order for our audience to actually have much more impact with their careers, engagement with the site ideally causes movement along at least 3[1] dimensions:
This makes re-designing the user flow post-strategic-shift a difficult balancing act/full of tradeoffs. How do we both quickly introduce people to AI being a big deal & urgent, and communicate EA ideas, plus help people shift their careers? Which do we do first?
We're going to lose some simplicity (and some people, who don't want to hear it) trying to do all this, and it will be reflected in the site being more complex than a strategy like "maximize for engagement or respectability" or "maximize for getting one idea across effectively" would recommend.
My view is that it's worth it, because there is a danger of people just jumping into jobs that have "AI" or even "AI security/safety" in the name, without grappling with tough questions around what it actually means to help AGI go well or prioritising between options based on expected impact.
(On the term "EA mindset" -- it's really just a nickname; the thing I think we should care about is the focus on impact/use of the ideas.)
2. Our CEO (Niel Bowerman) recently spent several weeks with his top proactive priority being to help figure out the top priorities within making AGI go well – i.e. which is more pressing (in the sense of where additional talented people can do the most marginal good) among issues like AI-enabled human coups, getting things right with the rights and welfare of digital minds, and catastrophic misalignment. We argued about questions like "how big is the spread between issues within making AGI go well?" and "to what extent is AI rights and welfare an issue humanity has to get right before AI becomes incredibly powerful, due to potential lock-in effects of bad discourse or policies?"
So, we agree with this:
In other words, the analysis, as you say, is not done. It's gonna be hecka hard to figure out "the particulars of what to do with AI." And we do not "have it from here" – we need people thinking critically about this going forward so they stand the best chance of actually helping AGI go well, rather than just having a career in "something something AI."
[1] (I'm currently debating whether we should add a 4th: tactical sophistication about AI.)
I appreciate the dilemma and don't want to imply this is an easy call.
For me the central question in all of this is whether you foreground process (EA) or conclusion (AGI go well). It seems like the whole space is uniformly rushing to foreground the conclusion...