
Note: I am the web programme director at 80,000 Hours and the view expressed here currently helps shape the web team's strategy. However, this shouldn't be taken to be expressing something on behalf of 80k as a whole, and writing and posting this memo was not undertaken as an 80k project. I wrote it as a memo for the meta coordination forum, then decided to clean it up a bit, attach people's comments, and post it here.

80,000 Hours, where I work, has made helping people make AI go well[1] its focus. As part of this work, I think my team should continue to:

  • Talk about / teach ideas and thinking styles that have historically been central to effective altruism (e.g. via our career guide, cause analysis content, and podcasts)
  • Encourage people to get involved in the EA community explicitly and via linking to content.

I think talking about EA ideas and encouraging people to get involved in EA is valuable for making AGI go well, and that it's good for at least us to push on, and possibly other AI-focused organisations as well.[2]

Here's why:

1. The effort to make AGI go well needs people who are flexible and equipped to make their own good decisions

  • It’s especially hard to tell which interventions aimed at making AGI go well are high-impact or even good, because we're trying to improve (the expected value / trajectory of) a future we don't really understand.
  • This means that people acting to make AGI go well have to think pretty hard and continuously about what's actually going to help, and course-correct as they go, because we don't have robust answers now about how to do it.[3]
  • If AI progress speeds up a bunch (as many think it will), I think this will become *more* important. People will need to be agile and equipped with the thinking tools to make good decisions in a changing environment and without much institutional guidance.

Counterargument: Agendas are starting to take shape, so this is less true than it used to be.

Reply: We might know somewhat more about what to do than we used to, especially in particular domains, like chip governance. But I think we are still largely feeling around in the dark, especially if we're thinking about making AGI go well broadly. Many concerns, like AI-enabled power grabs, have surfaced only recently, and researchers are still figuring out exactly what the issue is and what should be done about it.

2. Making AGI go well calls for a movement that thinks in explicitly moral terms

I think this for 2 reasons:

  • If we're right to prioritise making AGI go well – that is, if it's right that the trajectory of AI may well influence the future profoundly – then advanced AI will create really hard / big / new decisions with enormous moral stakes.
    • I think this calls for an explicitly moral and principled movement, not only because the stakes for the world are huge, but because we need to be capable of and encouraging of moral innovation – trying to figure out "ok but what would actually be good/right in this radically new situation? What new structures could make that happen?" If the world is transformed radically by AI, it won't just be obvious what we should do based on common sense morality.[4]
  • As a field, AI is particularly full of hype, big money, politics/power, and trends, meaning there will be strong incentives to do things that are not aligned with the good. (And if it's right that AI will become an even bigger deal, this will become even more true with time.)
    • A chief function of moral communities is that they help people resist these other incentives, by rewarding people for doing the right thing instead of the cool or self-profiting thing.

Counterargument: movements can be morally good without being explicitly moral, and being morally good is what's important.

Reply: I think trying particularly to be moral, or being moral *de dicto*, is important for adapting to *new* high-stakes situations, because we have to *reason about what is right/good and then do things because of that reasoning*.

I think being morally good without explicitly moral reasoning is much easier in situations where we've had enough experience / relevant past moral reasoning to inform our implicit views about moral goodness & have that be right. But if AI transforms the world dramatically, we won't be in one of those situations.

3. EA is (A) at least somewhat able to equip people to flexibly make good decisions, and (B) explicitly morally focused.

(A) EA is at least somewhat able to equip people to flexibly make good decisions

Because it is focused on something as abstract as doing good effectively per se, the methods, projects, and intermediate goals of EA are constantly up for re-negotiation. This has downsides, but it is a way of staying flexible.

EA outreach materials generally focus on explaining the concepts, arguments, and empirical facts that are important. Even cause-forward EA materials like 80k problem profiles seek to explain "the why" & point out uncertainties.

Concepts like scope sensitivity, coordination mechanisms, and moral patienthood, which tend to be central to EA thinking styles, can help people reason through future complex questions as they come up.

Empirically, people who come up through EA seem to do a fair amount of independent reasoning about new questions and situations; this is also reflected in the heterogeneity of EA projects, which suggests people are actively thinking about how to apply these ideas to different situations & with different assumptions. (Likely, this is partly due to EA's purposeful presentation as a question.)

(B) EA is explicitly morally focused

… Hence the 'altruism' thing.

I think it's no coincidence that EA has been morally innovative, in the sense of figuring out what doing good might mean once we take into account surprising / new / under-examined descriptive facts we believe or put a fair amount of credence on.

Ideas like earning to give, longtermism, patient philanthropy, invertebrate welfare, s-risks, etc. are all, in my view, examples of moral innovation.

Of course, the EA community does also get distracted by trends/power/politics/money/hype, but I think its explicitly moral aspirations give it more tools to resist those things, and to think hard and originally about what's right.

Counterargument: A different flexible & explicitly moral movement could be better for trying to make AGI go well.

Maybe so. The EA movement is certainly far from perfect in encouraging morally good behavior / helping people think for themselves. And given that EA is not specifically focused on making AGI go well, there's a *prima facie* case for some other movement/community to be the focus.

That said, I don't see obviously better candidates than EA for playing this role – see appendix.

Should there be something new?

I do think that there is a case for creating a new movement focused just on making AGI go well in an explicitly flexible & morally driven way. However, from a practical perspective, that seems pretty hard and likely to fail, & also like it'd take a while & maybe make things more confusing. I think EA is not too far off, so I feel inclined to work with what we've got.

Appendix: What are the relevant alternatives?

*Caveat that I don't have such a deep understanding of these other movements that I'm confident these assessments are correct.*

Double caveat that I'm not saying these communities aren't great and helpful in various ways – the point I'm making is that I think they are all less flexible, less explicitly moral, and less morally innovative than EA. The purpose is to explain why I think EA has an important role to play in making AGI go well.

The Rationalists?

  • Flexible / seeking to equip people to solve novel problems for themselves?  ✅
  • Morally innovative? ✅ (acausal trade, etc.)
  • Focused on making AGI go well? ✅
  • However, rationalism is more focused on improving human reasoning & decision-making, building accurate models of the world, etc., than on doing good. ❌
  • Other notes
    • I personally tend to disagree more often with moral views that are common among rationalists, so feel less excited about it from that angle.
    • The rationality community also has a few examples of people going off-piste and doing something harmful – though it's hard to say it has more than EA does (EA has SBF).

The AI safety community as it currently exists, distinct from EA / rationalism? e.g. working alignment researchers at academic institutions & AI companies, AI governance networks and initiatives, etc.

  • Flexible / seeking to equip people to solve novel problems for themselves? ❓❓
    • As far as I know, part of the value-add of this community is that it is more focused on professionalised, concrete action without frequent reconsideration or thinking from first principles, so it doesn't seem set up to do a great job here.
    • I think it's also fairly focused on US & human-centered concerns about misalignment and misuse, rather than things going well broadly. I think that, for example, this gives AI safety as a community an uneasy relationship with questions like whether AI systems should someday take over, or be granted some rights, if that could conflict at all with the welfare of humanity.
  • Explicitly moral? ❓❌
    • I think the push to professionalise / be more normal is at odds with doing a lot of explicit moral reasoning, though sometimes moral language is used. My impression is that the moral language is more often used in the parts that overlap with EA & rationalism.
  • Morally innovative? ❌

To be clear, I think that we (80k web) should *also* be seeking to spread the ideas of the AIS community and help people get involved with it too (our community page talks about both!). But I don't think it's currently serving all the important roles here, so think there's a need for more EA-style thinking too.

Another option would be to try to add more of the EA-ish qualities discussed above – morally driven & innovative, flexible – to the AIS community, such that it's able to fill more of the needed roles. I think that'd also be a reasonable strategy, and could imagine taking it up too.

The School of Moral Ambition? It might have the key ingredients – though it seems not to be focused on AI at all, so if it were going to contribute a lot here, it'd be a big pivot for them.

  • Flexible / seeking to equip people to solve novel problems for themselves? ❓
    • Plausibly yes! I don't know enough about them but I don't see anything that contradicts this.
  • Explicitly moral?  ✅
    • Yes!
  • Morally innovative?
    • Plausibly could soon (it’s very early days for them), but I think not yet ❓❌

Appendix 2: anon notes from others

I copy-pasted most of the substantive comments from the memo here anonymously, to show what other people in attendance thought. I omitted various +1 style comments, plus a few I didn't get permission to share.

"Agree. My current framing on this is that people don't currently need EA as a crux to work on AI safety, but we're in an extremely narrow range of time where that is true. Before now, people needed to be weird enough to see that this was important before it was everywhere, and soon there will be extremely stakesy questions about what kind of AI futures to create."

"Also, it just seems like a lot of the people were most excited about working in AI safety came through EA, and that has remained true for a long time, even after there was room for that hypothesis to be proven false."

--> reply "Given # of people involved, I think a huge crux here is how excited people are about non-EA people working in labs, particularly Anthropic"

"An additional thought: This [a world dramatically changed by AI]'ll be in a world where AI has the capability to massively amplify our effective intelligence/reasoning/understanding, but where most people will be slow to actually use and get the most from the relevant AI tools."

[about the fact that there are lots of incentives not aligned with the good in AI] "And can get harder to tell, from someone's AI work alone, how altruistically motivated someone is"

[about the fact that EA outreach materials often talk about "the why" behind prioritising various issues, showing EA concepts at work] "[minor] I think worth disentangling the ways this has impact:

1. Filtering ppl who are into these sorts of concepts

2. Getting economics of scale from having such people form a network

3. New people into that network quickly learning the relevant concepts

I think these are all big.

Whereas I think that not much comes from teaching the concepts, per se - they only really work if they're enhancing a proto-EA."

--> reply "Interesting. I think of us as often enhancing a proto-ea or popularising the concepts with this stuff, rather than it just serving as a filter"

"Just for the record (and obviously not speaking for EA), personally I don't think of EA topics in particularly moral terms, or at least that's not the language I use to do it.  I think in terms of "problems and opportunities".  It just turns out that attempting to prioritize them has often led me to projects & people associated with EA.

Probably many would call this semantic hairsplitting...  Long topic I guess"

--> reply "I think it's relevant! some people do think of EA in less moral terms. My view is that we should think of EA in moral terms, and I think that a lot of people do; but it's true also that people are mixed on it. (I'm reminded of the old "responsibility vs. opportunity" conversation from years ago)"

[on whether to try to build a new movement around making AGI go well vs. build EA] "I agree with this line of reasoning too. It feels like:

(i) at the moment, there isn't a movement which is exactly like what one might design if one wanted to design a movement for "make AGI go well" purposes

(ii) But EA is still pretty close, actually, and closer than anything else

(iii) and creating something new seems infeasible on the relevant timescales (and probably worse, too, for regression to the mean reasons)"

  1. ^

    I am saying "make AGI go well" rather than "AI safety" because "AI safety" is often taken to mean "technical work to reduce risks from misalignment", and I have in mind a much broader set of interventions – basically trying to tackle all the AI-related issues here, including taking technical and governance approaches to reducing risks from misalignment and risks from AI-enabled powergrabs, getting it right on digital minds, reducing AI-enabled engineered bio-threats, preventing (to the extent good and necessary) gradual disempowerment, and figuring out how to actually make things go well in addition to preventing catastrophes.

  2. ^

     I also think it's valuable to build a community of people who will focus on doing good & maybe pivot away from AI if it seems like there's something more pressing, but that's out of scope for this memo; this memo makes the case that it's good to highlight EA ideas / community for the sake of making AGI go well as well.

  3. ^

     Another way of putting this: we don't have a machine for making AGI go well that we can just scale up with more people / $; it needs much more strategic work. If the effort to make AGI go well were a company, it'd be a startup experimenting with widgets and testing them out, not a widget factory.

  4. ^

    A few examples: it's pretty morally non-obvious what to do about AI rights/welfare/personhood, & whether it'd actually be good to try to make the US win the AGI race feels like it requires a bunch of hard, cosmopolitan thinking.

