Idea: Red-teaming fellowships

MaxRa; JasperGo; slg; Yannick_Muehlhaeuser

Summary

A red team is an independent group that challenges an organization or movement in order to improve it. Red teaming is the practice of using red teams.

Participants prepare for and then work on a red-teaming challenge focussed on specific arguments, ideas, cause areas or programmes of organizations.
This will:
- i) Allow fellows to dig deeply into a topic of their interest, encourage critical thinking and reduce deferring to an “EA consensus”
- ii) Increase scrutiny of key EA ideas and lead to clarifying discussions
- iii) Allow fellows to show useful research skills to potential employers
Factors that would make this successful are a good selection process for fellows, well-informed EAs to supervise the fellowship, and connections to EA research organizations.

Edit: HT to Linch, who came up with the same idea in a shortform half a year ago and didn't follow Aaron's advice of turning it into a top-level post!

Example of a concrete red-teaming fellowship, focussing on LAWS advocacy

Introducing fellows to red-teaming.
4 weeks of background reading and discussions on AI governance, autonomous lethal weapons and related policy considerations.
2 full-day red-teaming sprints scrutinizing the hypothetical report “Why EAs should work on bans of autonomous lethal weapons”.
Write up the results of the sprints, receive and incorporate feedback from organizers and volunteer EA researchers, and finally share the write-up with potentially interested parties like FLI, and possibly post it on the EA forum.

Why?

Red-teaming fellows might be more involved and engaged compared to reading groups
Fellowships would use and reinforce epistemic norms
Red-teaming might generate useful insights, generally increase scrutiny of EA ideas, assumptions and cause prioritization
Identifying people that could become red-teaming or cause prioritization researchers

Who might organize this?

Fellowships can be organized by broadly capable local or national groups or EA orgs.
This will require some ability to
- choose good red-teaming targets
- choose appropriate background reading material
- (if applicable) communicate with the organization whose work is scrutinized and ensure that results are published in a constructive manner
- give feedback
The fellowship would greatly benefit from mentorship and feedback from experienced EA researchers.

More concrete thoughts on implementation

Participants

Fellows might ideally be EAs with some solid background knowledge on EA, for example those who have finished an In-Depth Fellowship before.
Submission of past writing samples could be part of the application.

Structure

Participants are assigned to groups of 3–4 that focus on a particular topic.
Groups initially gather and read material on the topic (e.g. 4–8 weeks) before doing the red-teaming exercise during the second part of the fellowship.
Results from the exercise are written up by a chosen deadline and, if participants decide so, shared with interested parties and afterwards possibly on the EA forum.

The first part of the red-teaming fellowship would be structured like a normal EA introductory fellowship, only focused on specific background reading material for the red-teaming exercise.

Desiderata for the red-teaming challenge

The scope of the problem should fit the fellowship’s limited duration of ~2–3 workdays.
Facilitators should provide a variety of topics so fellows can sort according to interest.
Red-teaming targets should ideally be actual problems from EA researchers who would like to have an idea/approach/model/conclusion/… red-teamed against.

Some off-the-cuff examples for topics

“Make the best case why this recommendation of charity X should not convince a potential donor to donate.”
“Why might one not believe in the arguments for
- living at the hinge of history?
- shrimps mattering morally?
- infinite ethics being a thing?
- GMO regulations being relevant?
- fixing adolescence being a new cause area?”
“Why might this case against prioritizing climate change be less convincing?”
“Scrutinize this career profile on X. Why might it turn out to be misleading, counterproductive or unhelpful reading for a young aspiring EA?”

Some reservations

We have some concerns about potential downside risks of this idea. They can probably be averted with input from experienced community members, though.

Unconstructive critiques could create discontent and weaken the cooperative atmosphere in the community.
Similarly, this might make EA look unwelcoming and uncooperative from the outside.
If research isn’t supervised properly, fellows could spread hazardous information if working on sensitive subjects.

Other thoughts

We would encourage experimenting with paying fellows, as discussed here by Aaron_Scher:
- the red-teaming version of an EA fellowship might alleviate some concerns about the oddness of paying for participation, because fellows are expected to put in work that can easily be understood as a service to the EA community
Facilitators might use Karnofsky’s minimal trust investigations as a concrete example of going deep on the case that long-lasting insecticide-treated nets (LLINs) are a cheap and effective way of preventing malaria.
Another good example is the red-team by AppliedDivinityStudies on Ben Todd’s post about the impact of small donations.

Comments (12)

AppliedDivinityStudies · Feb 3 2022 · 28

One really useful way to execute this would be to bring in more outside non-EA experts in relevant disciplines. So have people in development econ evaluate GiveWell (great example of this here), engage people like Glen Weyl to see how EA could better incorporate market-based thinking and mechanism design, engage hardcore anti-natalist philosophers (if you can find a credible one), engage anti-capitalist theorists skeptical of welfare and billionaire philanthropy, etc.

One specific pet project I'd love to see funded is more EA history. There are plenty of good legitimate expert historians, and we should be commissioning them to write for example on the history of philanthropy (Open Phil did a bit here), better understanding the causes of past civilizations' ruin, better understanding intellectual moral history and how ideas have progressed over time, and so on. I think there's a ton to dig into here, and think history is generally underestimated as a perspective (you can't just read a couple secondary sources and call it a day).

cryptograthor · Apr 6 2022 · 4

Hi there. This thread was advertised to me by a friend of mine, and I thought I would put a comment somewhere. In the spirit of red teaming: I'm a cryptography engineer with work in multi-party computation, consensus algorithms, and, apologetically, NFTs. I've done previous work with the Near grants board, evaluating technical projects. I also co-maintain the https://zkmesh.substack.com/ monthly newsletter on cryptography. I could have time to contribute toward a red-teaming effort, and to idea shaping and evaluation for projects along these lines, if the project comes into existence.

MaxRa · Apr 7 2022 · 2

Hi! That’s really cool, I haven’t seen someone with actual OG red teaming experience involved in the discussion here!

The team at Training for Good are currently running with the idea and I imagine would be happy about hearing from you: https://forum.effectivealtruism.org/posts/DqBEwHqCdzMDeSBct/apply-for-red-team-challenge-may-7-june-4

AppliedDivinityStudies · Feb 3 2022 · 16

This is a good idea, but I think you might find that there's surprisingly little EA consensus. What's the likelihood that this is the most important century? Should we be funding near-term health treatments for the global poor, or does nothing really matter aside from AI Safety? Is the right ethics utilitarian? Person-affecting? Should you even be a moral realist?

As far as I can tell, EAs (meaning both the general population of uni club attendees and EA Forum readers, alongside the "EA elite" who hold positions of influence at top EA orgs) disagree substantially amongst themselves on all of these really fundamental and critical issues.

What EAs really seem to have in common is an interest in doing the most good, thinking seriously and critically about what that entails, and then actually taking those ideas seriously and executing. As Helen once put it, Effective Altruism is a question, not an ideology.

So I think this could be valuable in theory, but I don't think your off-the-cuff examples do a good job of illustrating the potential here. For pretty much everything you list, I'm pretty confident that many EAs already disagree, and that these are not actually matters of group-think or even local consensus.

Finally, I think there are questions which are tricky to red-team because of how much conversation around them is private, undocumented, or otherwise obscured. So if you were conducting this exercise, I don't think it would make sense as an entry-level thing, I think you would have to find people who are already fairly knowledgeable.

MaxRa · Feb 3 2022 · 12

Thanks, those are good points, especially when the focus is on making progress on issues that might be affected by group-think. Relatedly, I also like your idea of getting outside experts to scrutinize EA ideas. I've seen OpenPhil pay for expert feedback on at least one occasion, which seems pretty useful.

We were thinking about writing a question post along something like "Which ideas, assumptions, programmes, interventions, priorities, etc. would you like to see a red-teaming effort for?". What do you think about the idea, and would you add something to the question to make it more useful?

And I think what your comment neglects is the value of:

- having this fellowship serve as just a first stepping-stone for bigger projects in the future (by instilling habits & skills and highlighting the value of similar investigations)
- having fellows work together on a more serious research project, which will build stronger ties among them relative to discussion groups and, I expect, lead to deeper engagement with the ideas

Vaidehi Agarwalla 🔸 · Feb 3 2022 · 13

I really like this idea and would love to organise and participate in such a fellowship!

To address this concern:

Similarly, this might make EA look unwelcoming and uncooperative from the outside.

It might be better to avoid calling it "red-teaming". According to Wikipedia, red teams are used in "cybersecurity, airport security, the military, and intelligence agencies", so the connotations of the word are probably not great.

Maybe we could use more of a scout mindset framing to make the framing less adversarial.
I find that framework to be inherently cooperative / non-violent. So rather than asking "why is X wrong?" we ask "what if X were wrong? What might the reasons for this be?". (or something like this)

AppliedDivinityStudies · Feb 3 2022 · 6

I agree that it's important to ask the meta questions about which pieces of information even have high moral value to begin with. OP gives, as an example, the moral welfare of shrimps. But who cares? EA puts so little money and effort into this already on the assumption that they probably are valuable. Even if you demonstrated that they weren't or forced an update in that direction, the overall amount of funding shifted would be fairly small.

You might worry that all the important questions are already so heavily scrutinized as to bear little low-hanging fruit, but I don't think that's true. EAs are easily nerd sniped, and there isn't any kind of "efficient market" for prioritizing high impact questions. There's also a bit of intimidation here where it feels a bit wrong to challenge someone like MacAskill or Bostrom on really critical philosophical questions. But that's precisely where we should be focusing more attention.

MichaelA🔸 · Feb 4 2022 · 7

Other things I imagine the authors or some readers might find interesting:

There was interesting discussion of similar ideas in this shortform from Linch and the comments there
Posts tagged https://forum.effectivealtruism.org/tag/research-training-programs , https://forum.effectivealtruism.org/tag/fellowships-and-internships , and/or https://forum.effectivealtruism.org/tag/scalably-using-labour
My sequence on Improving the EA-aligned research pipeline

MichaelA🔸 · Feb 4 2022 · 7

Thanks for this post!

Quick thoughts:

I think someone should probably actually run something like this (like, this idea should probably actually be implemented), either as a new thing or as part of an existing fellowship or research training program
I think if someone who was potentially a good fit wanted to run this and had a potentially good somewhat more detailed plan, or wanted to spend a few weeks coming up with a somewhat more detailed plan, they should seriously consider applying for funding. And I think there's a decent chance they'd get that funding.
- See also List of EA funding opportunities and Things I often tell people about applying to EA Funds.
- As the first of those 2 posts says: "I strongly encourage people to consider applying for one or more of these things. Given how quick applying often is and how impactful funded projects often are, applying is often worthwhile in expectation even if your odds of getting funding aren’t very high. (I think the same basic logic applies to job applications.)"
If someone is interested in potentially implementing this idea, feel free to reach out to me and I could connect you with other research training program organizers and with some potentially useful resources, and maybe (if time permits and it seems useful) could provide some advice myself

Aaron Gertler 🔸 · Feb 9 2022 · 6

I was surprised that this post didn't define "red-teaming" in the introduction or summary. The concept isn't especially well-known, and many of the examples that come up in an online search specifically involve software engineering or issues of physical security (rather than something like "critical reading").

Might be good to add a definition to this, perhaps based on the one Linch gave in the Shortform you link to.

MaxRa · Feb 9 2022 · 4

Thanks, good catch, I added the definition from the red-teaming tag.

Aaron_Scher · Feb 3 2022 · 4

Thanks for writing this up. It seems like a good idea, and you address what I view as the main risks. I think that (contingent on a program like this going well) there is a pretty good chance that it would generate useful insights (Why #3). This seems particularly important to me for a couple reasons.

Having better ideas and quality scrutiny = good
Relatively new EAs who do a project like this and have their work be received as meaningful/valuable would probably feel much more accepted/wanted in the community

I would therefore add what I think is helpful structure, the goal being to increase the chances of a project like this generating useful insights. In your Desiderata you mention

“Red-teaming targets should ideally be actual problems from EA researchers who would like to have an idea/approach/model/conclusion/… red-teamed against.”

I propose a stronger view here: topics are chosen in conjunction with EA researchers or community members who want a specific idea/approach/model/conclusion/… red-teamed against and agree to provide feedback at the end. Setting up this relationship from the beginning seems important if you actually want the right people to read your report. With a less structured format, I'm worried that folks might construct decent arguments or concerns in their red-team write-up, but that nobody, or not the right people, would read them, making the work useless.

Note 1: maybe researchers are really busy so this is actually "I will provide feedback on a 2 page summary"

Note 2: asking people what they want red-teamed is maybe a little ironic when a goal is good epistemic norms. This makes me quite uncertain that this is a useful approach, but it also might be that researchers are okay providing feedback on anything. But it seems like one way of increasing the chances of projects like this having actual impact.

This idea makes me really excited because I would love to do this!

I agree that this gets around most of the issues with paying program participants.