
Summary

As part of the EA Strategy fortnight, I am sharing a reflection on my experience doing AI safety movement building over the last year, and why I am now more excited about further efforts in this space than about EA movement building. This is mostly due to the relative success of AI safety groups compared to EA groups at universities that have both (e.g. read about Harvard and MIT updates from this past year here). I expect many of the takeaways to extend beyond the university context. The main reasons AI safety field building seems more impactful are:

  • Experimental data from universities with substantial effort put into EA and AI safety groups: Higher engagement overall, and from individuals with relevant expertise, interests, and skills
  • Stronger object-level focus encourages skill and knowledge accumulation, offers better career capital, and lends itself to engagement from more knowledgeable and senior individuals (including graduate students and professors). 
  • Impartial/future-focused altruism not being a crux for many people's decision to work on AI safety
  • Recent developments increasing the salience of potential risks from transformative AI, and decreasing the appeal of the EA community/ideas. 

I also discuss some hesitations and counterarguments, of which the most salient is the large decrease in the neglectedness of existential risk from AI (I have not yet reflected much on its implications, though I still endorse the high-level takes this post argues for).

Context/Why I am writing about this

I helped set up and run the Cambridge Boston Alignment Initiative (CBAI) and the MIT AI Alignment group this past year. I also helped out with Harvard’s AI Safety team programming, along with some broader university AI safety programming (e.g. a retreat, two MLAB-inspired bootcamps, and a 3-week research program on AI strategy). Before this, I ran the Stanford Existential Risks Initiative and effective altruism student group and have supported many other university student groups.

Why AI Safety Field Building over EA Community Building

From my experiences over the past few months, it seems that AI safety field building is generally more impactful than EA movement building for people able to do either well, especially at the university level (under the assumption that reducing AI x-risk is probably the most effective way to do good, which I assume in this article). Here are some reasons for this:

  1. AI-alignment-branded outreach is empirically attracting many more students with relevant skill sets and expertise than EA-branded outreach at universities. 
    1. Anecdotal evidence: At MIT, we received ~5x the number of applications for AI safety programming compared to EA programming, despite similar levels of outreach last year. This ratio was even higher when just considering applicants with relevant backgrounds and accomplishments. Around two dozen winners and top performers of international competitions (math/CS/science olympiads, research competitions) and students with significant research experience engaged with AI alignment programming, but very few engaged with EA programming. 
    2. This phenomenon at MIT has also roughly been matched at Harvard, Stanford, Cambridge, and I’d guess several other universities (though I think the relevant ratios are slightly lower than at MIT). 
    3. It makes sense that programming marketed around a specific cause area (e.g. AI rather than EA) is more likely to attract individuals who are highly skilled, experienced, and interested in topics relevant to that cause area.
  2. Effective cause-area specific direct work and movement building still involves the learning, understanding, and application of many important principles and concepts in EA:
    1. Prioritization/Optimization are relevant, to maximally reduce existential risk.
      1. Relatedly, consequentialism/effectiveness/focusing on producing the best outcomes and what actually works, as well as willingness to pivot, seem important to emphasize as part of strong AI safety programming and discussions. 
      2. Intervention neutrality—Even within AI alignment, there are many ways to contribute: conceptual alignment research, applied technical research, lab governance, policy/government, strategy research, field-building/communications/advocacy, etc. Wisely determining which of these to focus on requires engagement with many principles core to EA.   
      3. (Low confidence) So far, I’ve gotten the impression that the students who have gotten most involved with AIS student groups are orienting to the problem with a “How can I maximally reduce x-risk?” frame, not “Which aspect of the problem seems most intellectually stimulating?”.
    2. The distinction between existential and non-existential risks remains relevant for prioritizing mitigation of the former
      1. This distinction also naturally leads to discussion about population ethics, moral philosophy, altruism (towards future generations), and other related ideas.
    3. Truth-seeking and strong epistemics remain relevant.
      1. Caveat: Empirically, maintaining strong epistemics and a culture of truth-seeking have not been emphasized as much in AIS groups in my experience, and it feels slightly unnatural to do so (though I think the case for their importance can be made pretty straightforwardly given how confusing AI and alignment are, the paucity of feedback loops, and the importance of prioritization given limited time and resources).
    4. When much of the cause-area specific field-building work is done by EAs, and much of the research/content engaged with is from EAs, people will naturally interact with EAs, and some will be sympathetic to the ideas. 
  3. Cause-area specific movement building incentivizes a strong understanding of cause area object-level content, which both acts as a selection filter (which standard EA community building lacks), and helps make movement-builders better suited to pivot to object-level work. This makes organizing especially appealing for students who might not want to commit to movement building work long-term.
    1. I think it is useful for people running cause-area specific movement building projects (including student groups) to be pretty motivated to have their group maximally mitigate existential risk/improve the long-term future, since doing the aforementioned prioritization well and creating/maintaining strong culture (with e.g. high levels of truth-seeking, and a results-focused framework) is difficult and unlikely without these high-level goals. 
    2. A stronger object-level focus also makes engagement more appealing to individuals with subject matter expertise, like graduate students and professors. Empirically, grad student and professor engagement has been much stronger and more successful with AI safety groups than EA/existential risk focused groups so far. 
  4. The words “effective altruism” do not really elicit what I believe is most important and exciting about EA principles and the community, and what many of us currently think is most important to work on (e.g. global/universal impartial focus, prioritization/optimization, navigating and improving technological development and addressing its risks, etc).
    1. AI risk, existential risk, and longtermism get at some items listed above, but maybe don’t get at prioritization/optimization well. Still, perhaps STEM-heavy cause area programming naturally attracts people interested in applying optimization to real life. 
    5. The reputation of the EA community and name has (justifiably) taken a big hit in light of the several recent scandals, making EA community building (CB) look worse. On the other hand, AI alignment has been getting a ton of positive attention and concern from the general public and relevant stakeholders.
      1. That being said, the effects of the scandals on top university students' perception of EA seem much smaller than I initially expected (e.g. most people think of the FTX crash as an example of crypto being crazy/fake). According to a Rethink Priorities survey, only 20% of people who have heard of EA have also heard about FTX.
  6. Not needing to externally justify expenditures on common-sense altruistic grounds: Many of the community building interventions that seem most exciting involve spending money in ways that seem unusual in a university or common-sense altruistic context (e.g. group organizing salaries and costs, organizing workshops at large venues, renting office spaces). I think that some of these are more socially acceptable when not done in the name of ‘altruism’ or charity even if the group has similar motivations to EA groups in its culture (or at the very least this helps to insulate EA from some negative reputational effects).
    7. Anecdotally, impartial/future-focused altruism is not the primary motivation for a large portion (and maybe the majority) of individuals working full-time on AI existential risk reduction. Impartial altruism also does not seem like the most compelling way to get people to seriously consider working on existential risk reduction, as is discussed here, here, and here.


 

Counterarguments and Hesitations

  • I have not been working on AI safety/cause-area specific movement building for long enough (and AIS groups in general have not been very active for long enough) to feel confident that exciting leading indicators will translate into long-term impact. EA community building has a longer track record. The small sample sizes also reduce my confidence in the above takeaways.
  • Perhaps strong philosophical/ethical commitments (as opposed to say visceral urgency/concern and amazement at the capabilities of AI, or its rate of improvement) end up being more important than I currently estimate for long-term changes to career plans and behavior more generally. 
  • Maybe the non-altruistic case for existential risk mitigation isn’t sound, e.g. because someone’s likelihood of being able to contribute is too low to justify working on x-risk reduction, instead of achieving their goals another way. If so, maybe insufficiently altruistically motivated people will realize this and pivot to something else.
  • Figuring out what is true and helpful in the context of AI safety might be sufficiently difficult that the downsides of movement building and outreach (e.g. lower epistemic standards and lower-quality content on e.g. LessWrong/the alignment forum) might outweigh the upsides (e.g. more motivated/talented people working on AI alignment). 
  • AI safety is getting more mainstream than EA. Many of the people I expect to be most impactful would not have initially gotten involved with an AI safety group, but got into EA first and eventually switched to AI (though others like Open Philanthropy would have a better sense of this). The huge increase in discourse and attention on advanced AI might make the usefulness of proactive outreach and education about AI safety much lower moving forward than it was half a year ago. 
  • Historically, AI-alignment-driven writing and field-building seem to have significantly contributed to (speeding up) AI capabilities—potentially more than they have contributed to alignment/making the future better. AI alignment field-building might continue (or start) to have this effect.
    • My current intuition is: AGI hype has gotten high enough that the ratio of median capabilities researchers to safety researchers that would be beneficial from CB is pretty high (maybe >10:1, not sure), and definitely higher than what leading indicators suggest is produced by field-building at the moment.

Conclusion 

On the margin, I’d direct more resources towards AI safety movement building, though I still think EA movement-building can be very valuable and should continue to some extent. I’d be interested in hearing others’ experiences and thoughts on AI safety and other cause area field building compared to EA CB in the comments. 


 

Comments

Explicitly switching to AI only seems like a case of putting all our eggs in one highly speculative basket. We don't know how the case for AI safety will stack up in 10 years: if we commit too hard and it turns out to be overblown, will EA as a movement be over? 

I think the premise that EA will be over because of AI safety community building is confused, given that this is a marginal shift and EA movement building literally still exists. There's literally a companion piece to this by Jessica McCurdy about EA-specific community building. I also don't think this piece makes the case for every resource to go to AI safety community building.

In case anyone is interested, here is that piece

Anecdotal evidence: At MIT, we received ~5x the number of applications for AI safety programming compared to EA programming, despite similar levels of outreach last year. This ratio was even higher when just considering applicants with relevant backgrounds and accomplishments. Around two dozen winners and top performers of international competitions (math/CS/science olympiads, research competitions) and students with significant research experience engaged with AI alignment programming, but very few engaged with EA programming. 

Dunno what the exact ratio would look like (since the different groups run somewhat different kinds of events), but we've definitely seen a lot of interest in AIS at Carnegie Mellon as well. There's also not very much overlap between the people who come to AIS things and those who come to EA things.

Thanks, you make a compelling argument for AI safety movement building. I especially like that you have a lot of experience with community building already to draw these conclusions from. However, I think you might be (perhaps unintentionally) setting up a false dichotomy between general EA community building and AI safety community building.

I might be wrong, but perhaps you are saying that EA should intentionally support the budding AI alignment community more heavily than it does now, and that in some cases this community should be prioritised for funding over other EA groups? That would seem reasonable to me at least. Your conclusion of "On the margin, I’d direct more resources towards AI safety movement building, though I still think EA movement-building can be very valuable and should continue to some extent." seems to back up my take?

It makes sense to me that EA funds could experiment with investing a decent amount in communities built specifically around AI safety, then gather data for a couple of years and see if it produces both a consistent community and fruitful counterfactual AI safety efforts. It seems likely these communities could be intertwined and connected with current EA communities to different extents in different places, but they could also be very separate. This might already be an explicit plan which is happening and I've missed it.

Also, initial recruitment numbers only tell part of the effectiveness story. One of the strengths of EA is that people, once they join the community, often...
1. Devote a decent part of their life/time/resources to the community and the work
2. Have a decent likelihood of being in it for the long term (This must be quantified somewhere too)

Whether these features would also be present in an AI safety community remains to be seen.

Like titotal said, I don't think a drastic pivot pulling a huge amount of money away from EA community building and towards AI safety groups would be a great strategic move. Putting all our eggs in one basket and leaving established communities high and dry seems like a bad move - mind you, I don't think that will happen anyway.

Final question: "Anecdotally, impartial/future-focused altruism is not the primary motivation for a large portion of individuals working full-time on AI existential risk reduction (and maybe the majority)." If not this, then what is their motivation, outside of perhaps selfish fear for themselves or their families? I'm genuinely intrigued here.

Nice one!

the ratio of median capabilities researchers to safety researchers that would be beneficial from CB is pretty high (maybe >10:1, not sure), and definitely higher than what leading indicators suggest is produced by field-building at the moment.

What's your current best-guess for what the leading indicators would suggest?

I would guess the ratio is pretty skewed in the safety direction (since uni AIS CB is generally not counterfactually getting people interested in AI when they previously weren't, if anything EA might have more of that effect), so maybe something in the 1:10 - 1:50 range (1:20ish point estimate for median capabilities research: median safety research contribution ratio from AIS CB)?

I don't really trust my numbers though. This ratio is also more favorable now than I would have estimated a few months/years ago, when contribution to AGI hype from AIS CB would have seemed much more counterfactual (but also AIS CB seems less counterfactual now that AI x-risk is getting a lot of mainstream coverage). 

I would be surprised if the accurate number is as low as 1:20 or even 1:10. I wish there was more data on this, though it seems a bit difficult to collect since at least for university groups most of the impact (to both capabilities and safety) will occur a few+ years after the students start engaging with the group. 

I also think it depends a lot on what the best opportunities available to them are. It would depend heavily on what opportunities to work on AI safety exist in the near future versus on AI capabilities for people with their aptitudes. 

I agree with this, eg I think I know specific people who went through AIS CB (tho not the recent uni groups because they are younger and there's more lag) and either couldn't or wouldn't find AIS jobs so ended up working in AI capabilities.

Yeah, same. I know of recent university graduates interested in AI safety who are applying for jobs in AI capabilities alongside AI safety jobs. 

It makes me think that what matters more is changing the broader environment to care more about AI existential risk (via better arguments, more safety orgs focused on useful research/policy directions, better resources for existing ML engineers who want to learn about it etc.) rather than specifically convincing individual students to shift to caring about it.

I've also heard people doing SERI MATS, for example, explicitly talk/joke about this - about how they'd have to work in AI capabilities if they don't get AI safety jobs.

I'm impressed the ratio is that favourable! One note to be careful of is that just because people start off hyped about AI safety doesn't mean they stay there - there's a decent chance they will swing to the dark side of capabilities, as we saw with OpenAI and probably others as well. Just making the point that the starting ratio might look more favourable than the ratio after a few years.

Thanks, this is helpful!

Not worsening the current ratio would be a reasonable first guess, and although it depends a lot on how you define safety researchers, I'd say it's effectively somewhere around 20:1.

sorry are you saying that the current ratio of capabilities researchers to safety researchers produced by AIS field-building is 20:1, or that the current ratio of the researchers overall is 20:1?

(If the latter, then I think my original question was insufficiently clear and I should probably edit it).

The second one - I was addressing what ratio would be beneficial, but maybe you wanted to understand what the ratio actually is?
