
Note: I am the web programme director at 80,000 Hours and the view expressed here currently helps shape the web team's strategy. However, this shouldn't be taken to be expressing something on behalf of 80k as a whole, and writing and posting this memo was not undertaken as an 80k project. I wrote it as a memo for the meta coordination forum, then decided to clean it up a bit, attach people's comments, and post it here.

80,000 Hours, where I work, has made helping people make AI go well[1] its focus. As part of this work, I think my team should continue to:

  • Talk about / teach ideas and thinking styles that have historically been central to effective altruism (e.g. via our career guide, cause analysis content, and podcasts)
  • Encourage people to get involved in the EA community explicitly and via linking to content.

I think talking about EA ideas and encouraging people to get involved in EA is valuable for making AGI go well, and that it's good for at least us to push on, and possibly other AI-focused organisations as well.[2]

Here's why:

1. The effort to make AGI go well needs people who are flexible and equipped to make their own good decisions

  • It’s especially hard to tell which interventions aimed at making AGI go well are high-impact or even good, because we're trying to improve (the expected value / trajectory of) a future we don't really understand.
  • This means that people acting to make AGI go well have to think pretty hard and continuously about what's actually going to help, and course-correct as they go, because we don't have robust answers now about how to do it.[3]
  • If AI progress speeds up a bunch (as many think it will), I think this will become more important. People will need to be agile and equipped with the thinking tools to make good decisions in a changing environment and without much institutional guidance.

Counterargument: Agendas are starting to take shape, so this is less true than it used to be.

Reply: We might know what to do somewhat more than we used to, especially in particular domains, like chip governance. But I think we are still largely feeling around in the dark, especially if we're thinking about making AGI go well broadly. Many concerns, like AI-enabled powergrabs, have surfaced only recently, and researchers are still figuring out what exactly the issue is and what should be done about it.

2. Making AGI go well calls for a movement that thinks in explicitly moral terms

I think this for 2 reasons:

  • If we're right to prioritise making AGI go well – that is, if it's right that the trajectory of AI may well influence the future profoundly – then advanced AI will create really hard / big / new decisions with enormous moral stakes.
    • I think this calls for an explicitly moral and principled movement, not only because the stakes for the world are huge, but because we need to be capable of and encouraging of moral innovation – trying to figure out "ok but what would actually be good/right in this radically new situation? What new structures could make that happen?" If the world is transformed radically by AI, it won't be just obvious what we should do based on common sense morality.[4]
  • As a field, AI is particularly full of hype, big money, politics/power, and trends, meaning there will be strong incentives to do things that are not aligned with the good. (And if it's right that AI will become an even bigger deal, this will become even more true with time.)
    • A chief function of moral communities is that they help people resist these other incentives, by rewarding people for doing the right thing instead of the cool or self-profiting thing.

Counterargument: movements can be morally good without being explicitly moral, and being morally good is what's important.

Reply: I think trying particularly to be moral, or being "moral de dicto", is important for adapting to new high-stakes situations, because we have to reason about what is right/good & then do things because of that reasoning.

I think being morally good without explicitly moral reasoning is much easier in situations where we've had enough experience / relevant past moral reasoning to inform our implicit views about moral goodness & have that be right. But if AI transforms the world dramatically, we won't be in one of those situations.

3. EA is (A) at least somewhat able to equip people to flexibly make good decisions, (B) explicitly morally focused.

(A) EA is at least somewhat able to equip people to flexibly make good decisions

Because it is focused on something as abstract as doing good effectively per se, the methods, projects, and intermediate goals of EA are constantly up for re-negotiation. This has downsides, but it is a way of staying flexible.

EA outreach materials generally focus on explaining concepts, arguments, and empirical facts that are important. Even cause-forward EA materials like 80k problem profiles seek to explain "the why" & point out uncertainties.

Concepts like scope sensitivity, coordination mechanisms, and moral patienthood, which tend to be central to EA thinking styles, can help people reason through future complex questions as they come up.

Empirically, people who come up through EA seem to do a fair amount of independent reasoning about new questions and situations; this is also reflected in the heterogeneity of EA projects, which suggests people actively thinking about how to apply these ideas to different situations & with different assumptions. (Likely, this is partly due to EA's purposeful presentation as a question.)

(B) EA is explicitly morally focused

…. Hence the 'altruism' thing.

I think it's non-coincidental with this that EA has been morally innovative in the sense of figuring out what doing good might mean when we take into account XYZ surprising / new / under-examined descriptive facts we believe / put a fair amount of credence on.

Ideas like earning to give, longtermism, patient philanthropy, invertebrate welfare, s-risks, etc. are all, in my view, examples of moral innovation.

Of course, the EA community does also get distracted by trends/power/politics/money/hype, but I think its explicitly moral aspirations give it more tools to resist those things, and to think hard and originally about what's right.

Counterargument: A different flexible & explicitly moral movement could be better for trying to make AGI go well.

Maybe so. The EA movement is certainly far from perfect in encouraging morally good behavior / helping people think for themselves. And given that EA is not specifically focused on making AGI go well, there's a prima facie case for some other movement/community to be the focus.

That said, I don't see obviously better candidates than EA for playing this role – see appendix just below.

Should there be something new?

I do think that there is a case for creating a new movement which is just focused around making AGI go well in an explicitly flexible & morally driven way. However, from a practical perspective, that seems pretty hard and likely to fail, & also like it'd take a while & maybe make things more confusing. I think EA is not too far off, so feel inclined to work with what we've got.

Appendix: What are the relevant alternatives?

Caveat that I don't have such a deep understanding of these other movements that I'm confident these assessments are correct.

Double caveat that I'm not saying these communities aren't great and helpful in various ways – the point I'm making is that I think they are all less flexible, explicitly moral and morally innovative vs. EA. The purpose is to explain why I think EA has an important role to play in making AGI go well.

The Rationalists?

  • Flexible / seeking to equip people to solve novel problems for themselves?  ✅
  • Morally innovative? ✅ (acausal trade, etc.)
  • Focused on making AGI go well? ✅
  • However, rationalism is more focused on improving human reasoning & decisionmaking, building accurate models of the world, etc., than doing good. ❌
  • Other notes
    • I personally tend to disagree more often with moral views that are common among rationalists, so feel less excited about it from that angle.
    • The rationality community also has a few examples of people going off-piste and doing something harmful — hard to say more than EA though (EA has SBF).

The AI safety community as it currently exists distinct from EA / rationalism?

e.g. working alignment researchers at academic institutions & AI companies, AI governance networks and initiatives, etc.

  • Flexible / seeking to equip people to solve novel problems for themselves?❓❓
    • As far as I know, part of the value-add of this community is that it is more focused on professionalised, concrete action without frequent reconsideration or thinking from first principles, so it doesn't seem set up to do a great job here.
    • I think it's also fairly focused on US & human-centered concerns about misalignment and misuse, rather than things going well broadly. I think that, for example, this gives AI safety as a community an uneasy relationship with questions like whether AI systems should someday take over, or be granted some rights, if that could conflict at all with the welfare of humanity.
  • Explicitly moral?❓❌
    • I think the push to professionalise / be more normal is at odds with doing a lot of explicit moral reasoning, though sometimes moral language is used. My impression is that the moral language is more often used in the parts that overlap with EA & rationalism.
  • Morally innovative? ❌

To be clear, I think that we (80k web) should also be seeking to spread the ideas of the AIS community and help people get involved with it too (our community page talks about both!). But I don't think it's currently serving all the important roles here, so think there's a need for more EA-style thinking too.

Another option would be to try to add more of the EA-ish qualities discussed above – morally driven & innovative, flexible – to the AIS community, such that it's able to fill more of the needed roles. I think that'd also be a reasonable strategy, and could imagine taking it up too.

The School of Moral Ambition?

It might have the key ingredients – though it seems to not at all be focused on AI, so if it were going to contribute a lot here, it'd be a big pivot for them.

  • Flexible / seeking to equip people to solve novel problems for themselves?❓
    • Plausibly yes! I don't know enough about them but I don't see anything that contradicts this.
  • Explicitly moral?  ✅
    • Yes!
  • Morally innovative?
    • Plausibly could soon (it’s very early days for them), but I think not yet❓❌

Appendix 2: anon notes from others

I copy-pasted most of the substantive comments from the memo here anonymously, to show what other people in attendance thought. I omitted various +1 style comments, plus a few I didn't get permission to share.

"Agree. My current framing on this is that people don't currently need EA as a crux to work on AI safety, but we're in an extremely narrow range of time where that is true. Before now, people needed to be weird enough to see that this was important before it was everywhere, and soon there will be extremely stakesy questions about what kind of AI futures to create."

"Also, it just seems like a lot of the people we're most excited about working in AI safety came through EA, and that has remained true for a long time, even after there was room for that hypothesis to be proven false."

--> reply "Given # of people involved, I think a huge crux here is how excited people are about non-EA people working in labs, particularly Anthropic"

"An additional thought: This [a world dramatically changed by AI]'ll be in a world where AI has the capability to massively amplify our effective intelligence/reasoning/understanding, but where most people will be slow to actually use and get the most from the relevant AI tools."

[about the fact that there are lots of incentives not aligned with the good in AI] "And can get harder to tell, from someone's AI work alone, how altruistically motivated someone is"

[about the fact that EA outreach materials often talk about "the why" behind prioritising various issues, showing EA concepts at work] "[minor] I think worth disentangling the ways this has impact:

1. Filtering ppl who are into these sorts of concepts

2. Getting economies of scale from having such people form a network

3. New people into that network quickly learning the relevant concepts

I think these are all big.

Whereas I think that not much comes from teaching the concepts, per se - they only really work if they're enhancing a proto-EA."

--> reply "Interesting. I think of us as often enhancing a proto-EA or popularising the concepts with this stuff, rather than it just serving as a filter"

"Just for the record (and obviously not speaking for EA), personally I don't think of EA topics in particularly moral terms, or at least that's not the language I use to do it.  I think in terms of "problems and opportunities".  It just turns out that attempting to prioritize them has often led me to projects & people associated with EA.

Probably many would call this semantic hairsplitting...  Long topic I guess"

--> reply "I think it's relevant! some people do think of EA in less moral terms. My view is that we should think of EA in moral terms, and I think that a lot of people do; but it's true also that people are mixed on it. (I'm reminded of the old "responsibility vs. opportunity" conversation from years ago)"

[on whether to try to build a new movement around making AGI go well vs. build EA] "I agree with this line of reasoning too. It feels like:

(i) at the moment, there isn't a movement which is exactly like what one might design if one wanted to design a movement for "make AGI go well" purposes

(ii) But EA is still pretty close, actually, and closer than anything else

(iii) and creating something new seems infeasible on the relevant timescales (and probably worse, too, for regression to the mean reasons)"


  1. I am saying "make AI go well" rather than "AI safety" because "AI safety" is often taken to mean "technical work to reduce risks from misalignment", and I have in mind a much broader set of interventions – basically trying to tackle all the AI-related issues here, including taking technical and governance approaches to reducing risks from misalignment and risks from AI-enabled powergrabs, getting it right on digital minds, reducing AI-enabled engineered bio-threats, preventing (to the extent good and necessary) gradual disempowerment, and figuring out how to actually make things go well in addition to preventing catastrophes. ↩︎
  2. I also think it's valuable to build a community of people who will focus on doing good & maybe pivot away from AI if it seems like there's something more pressing, but that's out of scope for this memo; this memo makes the case that it's good to highlight EA ideas/community for the sake of making AGI go well as well. ↩︎
  3. Another way of putting this: we don't have a machine for making AGI go well that we can just scale up with more people / $; it needs much more strategic work. If the effort to make AGI go well were a company, it'd be a startup experimenting with widgets and testing them out, not a widget factory. ↩︎
  4. A few examples: it's pretty morally non-obvious what to do about AI rights/welfare/personhood, & whether it'd actually be good to try to make the US win the AGI race feels like it requires a bunch of hard, cosmopolitan thinking. ↩︎
Comments (13)

Thanks for writing this Arden! I strong upvoted.

I do my work at Open Phil — funding both AIS and EA capacity-building — because I'm motivated by EA. I started working on this in 2020, a time when there were way fewer concrete proposals for what to do about averting catastrophic AI risks & way fewer active workstreams. It felt like EA was necessary just to get people thinking about these issues. Now the catastrophic AI risks field is much larger and somewhat more developed, as you point out. And so much the better for the world!

But it seems so far from the case that EA-style thinking is "done" with regard to TAI. This would mean we've uncovered every new consideration & workstream that could/should be worked on in the years before we are obsoleted by AIs. This sounds so unlikely given how huge and confusing the TAI transition would be.

EAs are characteristic in their moral focus plus their flexibility in what they work on. I like your phrasing here of "constantly up for re-negotiation," which imo names a distinctively EA trait. To add to your list, I think EA-style thought is also characteristic in its ambition and focus on the truth (even in very confusing/contentious domains). I think EAs in the AI safety field are still person-for-person outperforming, e.g. in founding new helpful AI safety research agendas. And I think our success is in large part due to the characteristics I mention above. This seems like a pretty robust dynamic so I expect it to continue at least in the medium term.

(And overall, my guess is distinctively EA characteristics will become more important as the project of TAI preparation becomes more multifaceted and confusing.)

Counterargument: EA AI Safety is a talent program for Anthropic. 

I wish it weren’t but that’s what’s going to continue to happen if what the community has become pushes to grow. “Make AI go well” is code for their agenda. EA may be about morality but its answer on AI Safety is stuck and it is wrong. Anthropic’s agenda is not “up for renegotiation” at all. If you want to fix EA AI Safety it has to break out of the mentality 80k has done so much to put it in that the answer is to get a high-powered job working with AI companies or otherwise “playing the game”.

The good EA, the one I loved so much, was about being willing to do what was right even if it was scrappy and unglamorous (especially then bc it would be more neglected!). EA AI Safety today is sneering reviews of a book that could help rally the public bc insiders all know we’re doing this wacky Superalignment thing today, and something else tomorrow, but whatever the “reason” we always support Anthropic trying to achieve world domination. And the young EAs are scared not to be elite and sophisticated by agreeing, and it breaks my heart. Getting more kids into current EA would not teach them “flexible decisionmaking”.

EA needs to return to its roots in a way I gave up on waiting for before it needs to grow.

I personally am also often annoyed at EAs preferring the status/pay/comfort of frontier labs over projects that I think are more impactful. But it nonetheless seems to me like EAs are very disproportionately the ones doing the scrappy and unglamorous work. E.g. frontier lab Trust and Safety teams usually seem like <25% EAs, but the scrappiest/least glamorous AI safety projects I've worked on were >80% EAs.

I'm curious if your experience is different?

No I’m just concerned that the overwhelming effect of training EAs to do safety stuff that’s highly dependent on where the frontier labs are is them working at frontier labs. In theory there’s plenty of technical stuff to do that’s helpful, but in practice working at a frontier lab is the attractor. There are also knock-on effects in EA as a culture and movement when working at frontier labs is a primary occupation for top talent.

There's definitely some selection bias (I know a lot of EAs), but anecdotally, I feel that almost all the people who, in my view, are "top-tier positive contributors" to shaping AGI seem to exemplify EA-type values (though it's not necessarily their primary affinity group).

Some "make AGI go well influencers" who have commented or posted on the EA Forum and, in my view, are at the very least EA-adjacent include Rohin Shah, Neel Nanda, Buck Shlegeris, Ryan Greenblatt, Evan Hubinger, Oliver Habryka, Beth Barnes, Jaime Sevilla, Adam Gleave, Eliezer Yudkowsky, Davidad, Ajeya Cotra, Holden Karnofsky... Most of these people work on technical safety, but I think the same story is roughly true for AI governance and other "make AGI go well" areas.[1]

I personally wouldn't describe all of the above people's work as being in my absolute top tier according to my idiosyncratic worldview (note that many of them are working on at least somewhat conflicting agendas so they can't all be in my top tier), and it could be true that "early EA" was a strong attractor for such people, but EA has since lost its ability to attract "future AI thought leaders". [2]

I also want to make a stronger, but harder to justify, claim that the vast majority of people doing top-tier work in AI safety are ~EAs. For example, many people would consider Redwood Research's work top tier, and both Buck and Ryan (according to me) exemplify EA values (scope sensitivity, altruism, ambition, etc.). Imo, some combination of scope sensitivity, impartiality, altruism, and willingness to take weird ideas seriously seems extremely useful (and maybe even critical) for doing the most important "make AI go well" work.

  1. ^ I know that some of these people wouldn't "identify as EA" but that's not particularly relevant. The thing I'm trying to point at is a set of values that are common in EA but rare in AI/ambitious people/elites/the general public.

  2. ^ It also seems good to mention there are some people who are EAs (according to my definition) having a large negative impact on AI risk.

I would say the main people "shaping AGI" are the people actually building models at frontier AI companies. It doesn't matter how aligned "AI safety" people are if they don't have a significant say on how AI gets built.

 I would not say that "almost all" of the people at top AI companies exemplify EA-style values. The most influential person in AI is Sam Altman, who has publicly split with EA after EA board members tried to fire him for being a serial liar. 

I agree with some parts of your comment, though it’s not particularly relevant to the thesis that most people with significant responsibility for most of the top-tier work (according to my view on top-tier areas for making AGI go well) have values that are much more EA-like than would naively be expected.

Doesn't this depend on what you consider the "top tier areas for making AI go well" (which doesn't seem to be defined by the post)? If that happens to be AI safety research institutes focused specifically on preventing "AI doom" via stuff you consider to be non-harmful, then naively I'd expect nearly all of them to be aligned with the movement focused on that priority, given that those are relatively small niches, the OP and their organisation and the wider EA movement are actively nudging people into them based on the EA assumption that they're the top tier ones, and anyone looking more broadly at AI as a professional interest will find a whole host of lucrative alternatives where they won't be scrutinised on their alignment at interview and can go and make cool tools and/or lots of money on options.

If you define it as "areas which have the most influence on how AI is built" then those are more the people @titotal was talking about, and yeah, they don't seem particularly aligned with EA, not even the ones that say safety-ish things as a marketing strategy and took money from EA funds.

And if you define "safety" more broadly there are plenty of other AI research areas focusing on stuff like cultural bias or job market impact. But you and your organisation and 80,000 Hours probably don't consider them top tier for effectiveness and (not coincidentally) I suspect these have very low proportions of EAs. Same goes for defence companies who've decided the "safest" approach to AI is to win the arms race. Similarly, it's no surprise that people who happen to be very concerned about morality and utilitarianism and doing the best they can with their 80k hours of working life who get their advice from Brutger don't become AI researchers at all, despite the similarities of their moral views.

I think this may have been a misunderstanding, because I also misunderstood your comment at first. At first you refer simply to the people who play the biggest role in shaping AGI - but then later (and in this comment) you refer to people who contribute most to making AGI go well - a very important distinction!  

I’m sure it was a misunderstanding, but fwiw, in the first paragraph, I do say “positive contributors” by which I meant people having a positive impact. 

I'll point to my dated but still relevant counterpoint: the way that EA has been built is worrying, and EA as a global community that functions as a high-trust collaborative society is bad. This conclusion was tentative at the time, and I think has been embraced to a very limited extent since then - but the concerns seem not to be noted in your post.

One application of this line of reasoning here, as @Holly Elmore ⏸️ 🔸 has said more than once, including here, is that being friends and part of a single community seems to have dampened people's ability to fight over what people should do, and oppose the labs.

Executive summary: The author argues that encouraging engagement with effective altruism (EA) is strategically important for making AGI “go well,” since EA uniquely combines flexible, first-principles reasoning with explicit moral focus—qualities that are needed to navigate uncertain, high-stakes AI futures.

Key points:

  1. Making AGI go well requires people who can flexibly adapt and make independent, well-reasoned decisions in the face of uncertainty, especially as AI progress accelerates.
  2. Because advanced AI will raise novel moral challenges under strong countervailing incentives (hype, money, politics), a movement explicitly committed to moral reasoning and innovation is necessary.
  3. EA is unusually suited for this role: it equips people with conceptual tools (e.g. scope sensitivity, coordination, moral patienthood) and fosters explicit moral inquiry, producing moral “innovations” like longtermism and s-risks.
  4. While other movements (rationalists, AI safety professionals, School of Moral Ambition) have valuable qualities, none combine flexibility, explicit morality, and moral innovation as well as EA.
  5. Creating a new AGI-focused movement might be ideal in theory but is impractical and risky in practice; building EA’s role is more feasible and effective.
  6. The author acknowledges counterarguments (EA is imperfect, other communities could evolve to fill this role) but concludes that strengthening EA’s visibility and community remains the best available strategy.

 

 

This comment was auto-generated by the EA Forum Team. Feel free to point out issues with this summary by replying to the comment, and contact us if you have feedback.

Great post. 

Adding on one of the points mentioned: I think that if you are driven to make AI go well because of EA, you’d probably like to do this in a very specific way (ie big picture: astronomical waste, x risks are way worse than catastrophic risks, avoiding s risks; smaller picture: what to prioritize in AIS, etc.). This, I think, means that you want people (or at least the most impactful people) in the field to be ea/ea-adj (because what are the odds the values of an explicitly moral normie and EA will be perfectly correlated on the actionable things that really matter?).

Another related point is that a bunch of people might join AIS for clout/(future) power (perhaps not even consciously; finding out your real motivations is hard until there are big stakes!) and having been an EA for a bunch of time (and having shown flexibility about cause prio) before AIS is a good signal that you’re not (not a perfect one but some substantial evidence imo)
