
Epistemic status: I’m ~75% confident in my bottom line – that EAGx events are more cost-effective than EA community-building retreats – based on this rough analysis (which took ~50 hours) and my 1.5 - 2 years of experience supporting EA community-building events. This post is part of the Community Events Retrospective sequence.

I lead the Community Events Programme (CEP) at CEA. This has consisted of two sub-programmes:

  • EAGx events - locally-organised EA conferences supported by CEA.[1]
  • CEP grantmaking - making grants to support events, usually smaller retreats, usually also focused on EA movement-building.

I wanted to evaluate outcomes to see which of these programmes was producing more value per $ spent. In doing so, I hoped to also form a better sense of how to evaluate the impact of EA community-building events in general.

Bottom line upfront

The EAGx events I looked at[2] were ~half the cost per person but the retreats I looked at[3] produced a similar amount of value per person. This suggests that EAGx events are more cost-effective than smaller, community-building retreats. Specifically, taking the central estimates from analyses of attendee survey results, EAGx events seem to be ~1.4 - 2.1x more cost-effective than community-building retreats we funded via the Community Events programme across three metrics. [4]

This highlights the increasing returns to scale for these types of events and updates me towards focusing on larger events, but only when the goal of the event is to build the EA community by helping people build their networks. This finding is probably less relevant for events which aren’t focused on community-building, or which aim to have impact at the group level (e.g. by building consensus). Unless otherwise stated, the results don't apply to CEA's other events programmes, which often have different aims to the Community Events programme.

The impact of events is difficult to analyse. There are a wide range of potential outcomes; many outcomes might take a long time to materialise, and many will be fuzzy and hard to evaluate precisely (e.g. the benefits of meeting a new friend or a motivation boost). 

Gathering data is also difficult; we rely on self-reported outcomes, often from a minority of participants. Answers we receive are brief and we expect some respondents don’t interpret our questions as we intend. 

All of these difficulties mean you should take all of my claims in this report with a big pinch of salt. Some ways I could be wrong about all of this include:

  • My analysis here has missed some important class of outcomes, perhaps because they’re difficult to analyse or don’t show up in our post-event surveys. For example, perhaps the value of the retreats I looked at will show up later, through more well-informed community-builders.
  • Attendees who had an unusually valuable or unpleasant experience didn’t complete the survey and, if they did, it would materially change the topline result.
  • The events I incorporated into this analysis were unusual in some way. I didn’t get results from a large enough range of events.

 

--

Process

My process, at a high level, was:

  1. Survey event organisers to establish whether the event would’ve happened without CEA’s support;
  2. Survey attendees on the value they got from events (n = ~500);
  3. Subjectively score each case on how much value the event provided to the attendee (in impact terms), adjusting for the self-reported counterfactual likelihood of that outcome occurring without the event;
  4. Use these scores in a BOTEC to calculate an 'impact per cost' metric;
  5. Compare this metric between EAGx and CEP events, adjusting for the self-reported counterfactual likelihood of that event occurring without CEA’s support;
  6. Supplement this with BOTECs based on raw connections data and “impactful” connections.

This survey went out over February and March 2023 and asked attendees about events that occurred in the second half of 2022, so attendees were being asked about events that had taken place 3–8 months earlier.

Counterfactuals

To establish the counterfactual likelihood of whether the event would’ve happened at all, I also surveyed event organisers and asked:

  • If CEA didn’t support their event, whether they would’ve sought funding elsewhere and what for;
  • If they would’ve run the same event, where they might have applied, and how likely it is they would’ve applied;
  • If they would’ve run a different event, which event, where they would’ve sought funding and how likely it is they would’ve applied;
  • If they wouldn’t have run an event, why and what they would’ve done instead.

With this information, I eyeballed how much of the event’s value might have been captured in the worlds where I didn’t support the event. I estimate that ~16% of the value of EAGx events would’ve happened otherwise (if we didn’t support it), and ~35% of the value of CEP retreats would’ve happened otherwise. I expect these are both underestimates since some organisers claimed that there weren’t other funding opportunities available for them when there probably were (e.g. Open Phil, EA Infrastructure Fund). Since I expect both EAGx organisers and retreat organisers to underestimate in a similar way, I don’t expect this to materially affect the results.

This estimate of how much of the value of an event would’ve happened if we hadn’t supported it is baked into several parts of the following analysis and meaningfully affects the results. Specifically, my best guess estimate that EAGx events are ~1.8x more cost-effective on the specified outcome measures falls to ~1.5x if I assume that all of the events were equally likely to happen. Note that other EA-aligned funders reviewing this analysis should be wary of these counterfactuals: the existence of other funders is part of the reason why the CEP retreats were judged to be more likely to happen in the absence of the Community Events Programme.


Outcome measures

I compared the impact of EAGx events and CEP retreats across three outcome measures:

  • Counterfactual “raw” connections created - We define a connection as someone an attendee met at the event who they now feel comfortable asking for a favour. This could be someone they met at the event for the first time or someone they’ve known for a while but didn’t feel comfortable reaching out to until the event. We ask attendees for the number of new connections they made at the end of the event, and I asked them for this number again in the follow-up survey. I took the mean average of responses from both questions for each event and multiplied this by the number of attendees to get the approximate total number of connections created at the event. I then multiplied this by the % of counterfactual value CEA can claim by supporting the event to get counterfactual raw connections for each event.
     
  • Counterfactual “impactful” connections created - I introduced a new metric in the follow-up survey to try to capture certain event attendee connections that might be more impactful than other connections. I asked them for the number of connections which “might accelerate (or have accelerated) you on your path to impact (e.g. someone who might connect you to a job opportunity or a new collaborator on your work)”. I then followed the same process as above - taking the mean for each event and multiplying it by the number of attendees and the % of counterfactual value CEA can claim.
     
  • Counterfactual valuable outcomes -  The number of connections isn’t the only source of value for event attendees. To approximate other sources of value, I assessed attendees' written answers to two open-ended questions about sources of value asked in the follow-up survey. I went through every written answer, scored each answer based on the system outlined in the appendix and adjusted this score based on the counterfactual value the attendee attributed to the event. Tallying these up, and extrapolating this out to the total number of attendees gave me a total score for each event which I called “impact points”, to introduce a spark of insight and a twist of originality. Note that this system is not intended to be a holistic scoring of the attendee’s potential impact, but rather a system for analysing the impact of the event according to the attendees’ survey responses. More on that in Appendix: Scoring system.
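
The counterfactual adjustment shared by all three measures can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above, not the spreadsheet used for the analysis: the function name, the parameter names and the example event are hypothetical.

```python
def counterfactual_outcomes(mean_per_attendee, attendees, share_without_cea):
    """Outcomes CEA can claim for one event.

    mean_per_attendee: average outcome per survey respondent (e.g. connections).
    attendees: total number of attendees the average is extrapolated to.
    share_without_cea: estimated share of the event's value that would have
        happened without CEA's support (between 0 and 1).
    """
    if not 0.0 <= share_without_cea <= 1.0:
        raise ValueError("share_without_cea must be between 0 and 1")
    total_outcomes = mean_per_attendee * attendees
    return total_outcomes * (1.0 - share_without_cea)


# Hypothetical event: 100 attendees, 7 connections each on average, and
# 16% of the value assumed to have happened without CEA's support.
example = counterfactual_outcomes(7.0, 100, 0.16)
print(f"Counterfactual connections: {example:.0f}")
```

Because the adjustment was applied event by event, plugging programme-wide averages into this function will not exactly reproduce the totals reported in the Comparison section.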

Costs

With three approximate outcome scores aggregated across the two event types, I calculated the costs for producing each event:

  • EAGx events cost significantly more to run than retreats - 5 EAGx events in 2H22 cost ~$2.6m while the 7 CEP retreats I looked at cost ~$190k.
  • I also spent significantly more time running EAGx events in 2H22; ~256 hours on EAGx events, and only ~24 hours on CEP retreats.
  • I also added further costs, such as other CEA staff time and event organiser time.
  • Overall, the costs of running 5 EAGx events in 2H22 came to ~$3.14m while the cost of running 7 CEP retreats came to ~$203k.

Crucially, many more people attended the EAGx events I evaluated than the CEP retreats I evaluated.

  • There were 3,568 attendees across the 5 EAGx events I looked at, but only 129 attendees across the 7 CEP retreats I looked at, a ~28x difference.
  • This means the cost per person at EAGx events is ~half that of the cost per person at CEP retreats:[5]
    • The cost per person for the EAGx events was ~$880 ($3.14m / 3,568)
    • The cost per person for the CEP retreats was ~$1,573 ($203k / 129)
    • This is a ~1.78x difference.
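
As a quick check, the cost-per-person figures can be recomputed from the rounded totals quoted above. The variable names are illustrative, and because the inputs are rounded the ratio comes out marginally different from the ~1.78x quoted.

```python
# Rounded totals from the post (USD) and attendee counts.
eagx_cost, eagx_attendees = 3_140_000, 3_568
cep_cost, cep_attendees = 203_000, 129

eagx_per_person = eagx_cost / eagx_attendees  # roughly $880
cep_per_person = cep_cost / cep_attendees     # roughly $1,570
cost_ratio = cep_per_person / eagx_per_person
attendee_ratio = eagx_attendees / cep_attendees

print(f"EAGx: ${eagx_per_person:,.0f} per person")
print(f"CEP retreats: ${cep_per_person:,.0f} per person")
print(f"Retreats cost ~{cost_ratio:.2f}x more per person")
print(f"EAGx events had ~{attendee_ratio:.0f}x more attendees")
```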
       

Comparison

So, before we even assess how well the retreats score on the outcome measures, it’s clear that the bar for retreats is high; to be cost-competitive on any measure, they need to generate almost twice as much value per person as EAGx events. On the three outcome measures I used, the outcomes for retreats do not meet this bar:

Counterfactual “raw” connections 

Before introducing counterfactuals, it’s worth noting attendees report approximately the same average number of connections at EAGx events (7.1) as at retreats (6.8). Introducing the counterfactual value captured by the event suggests that EAGx events produced ~20,667 new connections, while CEP retreats produced ~655 connections. This means the cost per counterfactual raw connection at EAGx events is ~$152, while the cost per connection at CEP retreats is ~$310, a ~2x difference.

Counterfactual “impactful” connections 

Attendees report approximately the same average number of impactful connections at EAGx events (3.0) as at retreats (2.9). Introducing counterfactual value suggests EAGx events produced ~8,300 new impactful connections, while CEP retreats produced ~255 new impactful connections. This means the cost per counterfactual impactful connection at EAGx events is ~$378, while the cost per counterfactual impactful connection at CEP retreats is ~$797, a ~2.1x difference.

Counterfactual valuable outcomes

The impact points from each event varied quite widely, from an average of 0.4 per attendee (EAGxVirtual) to 2 (Longtermist organiser summit). The most impactful reported outcomes were 10 - 20x more valuable than the average (though the scoring system somewhat expected this heavy-tailedness), and many attendees reported outcomes that didn’t register as valuable on the scoring system.[6] Overall, EAGx events produced ~2,573 counterfactual impact points, while CEP retreats produced ~119 counterfactual impact points. This means the cost per counterfactual impact point at EAGx events is ~$1,219, while the cost per counterfactual impact point at CEP retreats is ~$1,711, a ~1.4x difference.

That bottom line again: taking the central estimates from these three measures, EAGx events are ~1.4 - 2.1x more cost-effective than CEP retreats. This is driven by the following: 

  • The cost per person at EAGx events is ~half the cost per person for retreat attendees;
  • Outcomes per person are approximately similar in value;
  • EAGx organisers reported that their events were less likely to happen without CEA support than retreat organisers did.
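
The three comparisons above can be reproduced, approximately, from the rounded totals in this post. The sketch below is illustrative only: the dictionary layout and function name are assumptions, and because the inputs are rounded the dollar figures differ by a few dollars from those quoted (e.g. for impact points).

```python
# Rounded total costs (USD) and counterfactual outcome totals from the post.
COSTS = {"EAGx": 3_140_000, "CEP": 203_000}
OUTCOMES = {
    "raw connections": {"EAGx": 20_667, "CEP": 655},
    "impactful connections": {"EAGx": 8_300, "CEP": 255},
    "impact points": {"EAGx": 2_573, "CEP": 119},
}


def cost_per_outcome(measure, programme):
    """Total programme cost divided by its counterfactual outcomes."""
    return COSTS[programme] / OUTCOMES[measure][programme]


ratios = {}
for measure in OUTCOMES:
    eagx = cost_per_outcome(measure, "EAGx")
    cep = cost_per_outcome(measure, "CEP")
    # A ratio above 1 means retreats cost more per outcome than EAGx events.
    ratios[measure] = cep / eagx
    print(f"{measure}: EAGx ${eagx:,.0f}, CEP ${cep:,.0f}, "
          f"ratio ~{ratios[measure]:.1f}x")
```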

Uncertainty

I’m uncertain how to weight the different outcome measures - I trust the raw connections data most strongly because attendees gave approximately the same number when we asked them a second time. But the number of new connections does not capture the full picture of why people find EA events valuable. 

My impact scoring was more comprehensive in terms of capturing value because it is agnostic with regard to how attendees got value from the events. But the scoring system I created wasn’t rigorously designed, and it seems likely that there are other reasonable scoring systems or even that I would give different scores if I did the exercise again. 

Impactful connections is somewhere in between: it’s a measure which seeks to capture which connections were actually valuable, but we haven’t replicated this metric by asking the same people for this number at different times and it seems likely that different attendees have different standards for what they consider “impactful”.

My best guess is that I should put 40% weight on the counterfactual “raw” connections number, 20% on the counterfactual impactful connections measure, and 40% weight on the counterfactual valuable outcomes measure, leading to a best guess ratio of ~1.8x. Note that this is very close to the raw cost-per-person ratio of 1.78x, suggesting the difference in the cost-per-person figure more or less explains the cost-effectiveness difference because everything else cancels out.
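
One way to arrive at a figure of ~1.8x is a simple weighted average of the three rounded ratios. The post does not spell out how the weights were combined, so the arithmetic mean used here is an assumption, and the names are illustrative.

```python
# Weights from the paragraph above and the rounded ratios from the Comparison section.
weights = {"raw connections": 0.4, "impactful connections": 0.2, "impact points": 0.4}
ratios = {"raw connections": 2.0, "impactful connections": 2.1, "impact points": 1.4}

# Assumption: the best guess is a weighted arithmetic mean of the ratios.
best_guess = sum(weights[m] * ratios[m] for m in weights)
print(f"Weighted cost-effectiveness ratio: ~{best_guess:.2f}x")
```

Other ways of combining the measures (e.g. a weighted geometric mean) would give a slightly different number, but with ratios this close together the result stays near 1.8x.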

My actual uncertainty is far wider than the ~1.4 - 2.1x ratio. Approximating a plausible lower and upper bound from these estimates suggests I actually think EAGx events are 0.7x - 3x more cost-effective than CEP retreats. 

  • I put at least 10% credence on the claim that CEP retreats are actually more cost-effective than EAGx events. This might be because my analysis here has missed something important or because the events I incorporated into this analysis were unusual in some way.
  • I put ~30% credence on the claim that EAGx events are more than twice as cost-effective as CEP retreats.
  • I put ~60% credence on the claim that EAGx events are 1 - 2x more cost-effective than CEP retreats, with ~15% of this credence around the “it’s very hard to tell” range (i.e. ~1x or slightly above). 
  • Overall, this means I’m ~75% confident in the fuzzier claim that EAGx events are more cost-effective than EA community-building retreats.
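
The credences above can be checked for consistency with a short sketch. It assumes the "~15%" in the third bullet means 15 percentage points taken out of the 60% bucket, which is one reading of the text; the labels are illustrative.

```python
# Credences from the bullets above.
credence = {
    "CEP retreats more cost-effective": 0.10,
    "EAGx 1 - 2x more cost-effective": 0.60,
    "EAGx more than 2x more cost-effective": 0.30,
}
# Assumption: 15 percentage points of the middle bucket are "very hard to tell".
hard_to_tell = 0.15

total = sum(credence.values())
confident_eagx_better = (
    credence["EAGx 1 - 2x more cost-effective"] - hard_to_tell
    + credence["EAGx more than 2x more cost-effective"]
)
print(f"Credences sum to {total:.0%}")
print(f"Confident EAGx is more cost-effective: {confident_eagx_better:.0%}")
```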

Commentary

The above numbers suggest a surprisingly large difference, but if we break it down into two key claims, it comes across as far more reasonable:

  1. EAGx events are approximately half the cost per head;
  2. Attendees get a similar amount of value at CEP retreats and EAGx events; 
  3. Therefore, EAGx events are ~twice as cost-effective.

That second claim is perhaps the most surprising. I have heard (and made!) various claims like “retreats foster more impactful or deeper connections” or “I expect attendees get more value from a nicer, more intimate setting”. This might still be true, but these survey results should make us downweight these claims - I don’t see evidence that EA community-building retreats are more valuable for attendees than EAGx events, at least among the ones I’ve been involved with. 

I think the most salient thing this conclusion should point to is the importance of increasing returns to scale; perhaps retreats help those 20 - 40 people feel something special, but conferences provide a seemingly similar experience at ~half the cost per person. That retreat feeling needs to be really special to counter increasing returns to scale.

A reminder that this is just comparing events which aim to build the EA community, or connect community-builders. I still expect retreats to be more appropriate for goals such as:

  • Achieving a specific outcome or agreeing on a set of goals;
  • Creating a stronger / deeper social network among a small group of people;
  • Creating stronger / deeper relationships between high-profile individuals.

But, for these kinds of retreats, I weakly advise against relying on metrics like connections in impact evaluations because, as shown above, retreats do not seem efficient for producing these outcomes.

Appendix: EAGxVirtual is unusually cost-effective

EAGxVirtual is something of an anomaly in the data. The cost per head for this event is substantially lower than any other, at ~$50 - $80 per person (depending on whether you count everyone registered or only those who were online at the event for at least a few hours). There is no real limit to how many people can attend this event either, so future iterations can probably improve on this number.

However, correspondingly, virtual conference attendees report less valuable experiences than in-person events:

  • EAGxVirtual attendees reported an average of ~3.7 new “raw” connections. The in-person EAGx average is ~7.8.
  • EAGxVirtual attendees reported an average of ~1.5 impactful connections. The in-person EAGx average is ~3.3.
  • EAGxVirtual attendees scored an average of ~0.4 impact points. The in-person EAGx average is ~1.1.

But of course, in the cost-effectiveness game, costs matter. Although EAGxVirtual attendees reported getting ~half as much value at the event as in-person attendees[7], EAGxVirtual costs ~10x less per person than in-person events, making EAGxVirtual far and away the most cost-effective event we ran in the second half of 2022. I don’t plan on switching entirely to virtual events, but this was an update in favour of virtual events and made me excited about EAGxVirtual 2023 (17 - 19 November 2023).
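
A rough sketch of that trade-off, using the per-attendee averages listed above and taking the ~10x cost difference at face value (it is an approximate figure, so the outputs are only indicative):

```python
# Per-attendee averages from the bullets above.
virtual = {"raw connections": 3.7, "impactful connections": 1.5, "impact points": 0.4}
in_person = {"raw connections": 7.8, "impactful connections": 3.3, "impact points": 1.1}

# Approximate figure from the post: EAGxVirtual costs ~10x less per person.
COST_ADVANTAGE = 10

value_per_dollar = {}
for measure in virtual:
    value_ratio = virtual[measure] / in_person[measure]  # roughly 0.4 - 0.5
    value_per_dollar[measure] = value_ratio * COST_ADVANTAGE
    print(f"{measure}: {value_ratio:.2f}x the value per attendee, "
          f"~{value_per_dollar[measure]:.1f}x the value per $")
```

On these inputs, EAGxVirtual comes out at very roughly 4 - 5x the value per $ of the in-person EAGx events, before accounting for attendee time.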

Appendix: Scoring system

I (Ollie) scored all the attendee responses on value from connections and responses on other sources of value, using the rough system below. [8]

  • 50: Someone reports starting a project in an EA-aligned cause area
  • 20: Someone reports getting an internship or job or full-time grant at an EA-aligned organisation
  • 10: Someone reports shifting their focus to an EA-aligned cause area
  • 5: Someone meets a collaborator who they intend to work with later on an EA-aligned project
  • 1: Someone meets someone who inspires them to take an opportunity or cause area more seriously

The key thing about this system is that it’s essentially a logarithmic scale - the top of the chart is 50 times the bottom and intervals are ~2 - 5x. This is because I expect the value of event outcomes to fall into a heavy-tailed distribution or a power law distribution, in line with many other things (productivity, effectiveness of charities, athletes and researchers). After I started analysing the survey results, I felt more confident in this claim. For EAGxAustralia, three outcomes really stood out and the rest were good but unremarkable, according to the attendees. It seems likely to me that a good chunk of the value of the event was accrued by just a few people who had more life-changing things happen to them (e.g. getting a grant or job). 

Note that this system is not intended to be a holistic scoring of the attendee’s potential impact, but rather a system for analysing the impact of the event according to the attendees’ survey responses. I initially expected that cause prioritisation might have been an important factor for how outcomes were scored, but I ended up mostly focusing on whether the event was valuable for that person, based on their own prioritisation. If someone reports getting a job, that’s more impactful for that person than if that person just had a few interesting conversations.

As a result, I don’t expect disagreements in cause prioritisation to materially change many conclusions in this analysis. I find that EAGx events are cost-effective because they allow more interactions and learning to happen per $, so, naïvely, I expect this result to hold regardless of the cause areas represented at the event.

Appendix: Top impact stories

The impact reported in the follow-up survey I conducted suggests a heavy-tailed distribution (though note again that my scoring system somewhat expected this distribution, so this claim could be begging the question). The top 3 impact stories per event accounted for 20 - 55% of the impact of reported stories, according to my assessment, and this was true across various events.

Here are some quotes from the top stories reported, all of which received a score of >5 points. I’ve bold highlighted some text:

  • Without EAGxSG, I would never have received funding to do independent alignment research, nor have the courage to apply to SERI MATS which I eventually got accepted to.
  • [One connection] recommended that I apply for a planning grant for [organisation] and went on to become our grant evaluator - the grant was successful and enabled me to work full-time on the organisation.
  • I have ended up collaborating with all 3 of them for a successful grant I received
  • One person I met there introduced me to her organization and offered me to apply as an intern. Since I was on my way to shift my career to one with more impact anyway, I applied and actually got the spot.
  • One [connection] has already led to a contract-based job offer, while the other has made further connections within my field which have proven really useful.
  • I am working now in [organisation] because of these connections that I made during the event.

For comparison, here are some quotes from stories that received a score of 1 - 5:

  • I made a connection with a speaker after attending his talk. I then approached him to deliver a talk to students from the program I work for and we had 68 attendees come and listen to his views on existential risk.
  • We (those 10 [connections]) started a project together and meet weekly since the conference 
  • Helped shed light on the opportunities and jobs within the EA community. Reinforced the idea that it can be valuable to work outside of typical EA orgs, and that working in non-EA roles can help bring new ideas to later work in EA orgs.
  • Fellow AI safety researchers who helped me enter the field, explore research agendas, discuss strategy considerations, and help plan my US travels; I am still in contact with them and profit from these connections.

And here are some quotes from stories that received a score of 0 (this isn’t to say that these outcomes won’t be valuable, just that they haven’t led to anything tangible yet):

  • I met a like-minded EA.
  • They were very useful in directing me to resources and programmes in the areas of interest that I have.
  • I had very interesting recommendations regarding my career plans, and discussions that taught me much about other EA topics.
  • I found other people working on similar projects in their home countries. This led to fruitful discussion and we may collaborate in the future.

My thanks to Callum Calvert, Jona Glade, Michel Justen, Sophie Thomson, Oscar Howie, Ben West, Eli Nathan, Ivan Burduk and Amy Labenz for comments and feedback.

 

  1. ^

    What’s the difference between EA Global and EAGx? EA Global conferences are organized by the Centre for Effective Altruism, and people from all over the world attend these. The team at CEA is responsible for choosing the content, processing admissions, and production of the conference. EAGx events, on the other hand, are community-organized with some support from CEA. The target audience for EAGx events is broader than EAG, but tends to have a more regional focus.

  2. ^

    EAGxAustralia, EAGxSingapore, EAGxBerlin, EAGxRotterdam and EAGxBerkeley.

  3. ^

    The EA Workplace/Professional Group Organizer Retreat, the Longtermist Organizers Summit, the Asia Community-builders retreat, the FERSTs (French x-risk) retreat, the Wild Animal Welfare Policy Summit, the African EA organisers summit and the Australia and New Zealand Group Leader's Retreat. The Wild Animal Welfare Policy Summit stands out as the one event not focused on community-building.

  4. ^

    This is the range implied by the central estimates from analyses of the three metrics I tracked. My actual uncertainty is far wider than this. Approximating a plausible lower and upper bound from these estimates suggests I actually think EAGx events are 0.7x - 3x more cost-effective than CEP retreats. 

  5. ^

    Note that this cost is not the same figure as how much CEA is willing to pay for a marginal attendee at an event. 

  6. ^

    e.g. “It was nice meeting people within my cause area in a more global setting”, and “knowing people to reach out to”.

  7. ^

    Note that, for these outcome measures, I’m only including survey respondents who presumably attended the event for enough time to consider themselves attendees.

  8. ^

    There were higher scores for more senior hires, but I ended up not using them.

Comments (26)

Thanks for publishing this! Very helpful. 

One quick thought: those retreats seem extraordinarily expensive. EA Netherlands has organised four retreats since February last year. I haven't checked the most recent one but, of the first three, the most expensive had a cost per person of EUR 260 (not including the cost of EA Netherlands spending time organising it). Granted, they were short retreats (arrive late on the Friday, leave Sunday afternoon before the evening meal), but it would be interesting to see how they compare to the events you looked at on the outcome measures you specified. 

For example, for our 'raw' average number of connections, across all four retreats the figure is 4.6. So that's approx 35% less than the EAGx events you looked at, and 30% less than the retreats you looked at, but our most expensive retreat cost approximately 80% less per person than the average CEP retreat.  

Thanks! 

I agree the retreats I looked at were on the more expensive end, because of some travel grants and because they were longer, as were the EAGx events (which means you should also expect the EAGx costs to go down). I think some recent EAGx events might come close to that cost-per-person.

I think another reasonable takeaway from this is "keep retreats cheap", and perhaps I should've included that.

I want to chip in that several years ago it was very normal for retreat participants to contribute to the cost of the retreats. I think this is pretty normal in comparable settings (i.e. student group retreats for clubs in the US) and I would be excited about more groups doing a bit more of this (not necessarily all of them, but I think this isn't in the option space of some group organizers right now and should be). I think this gives participants a bit more stake in the retreat going well, but that is not super evidence-based. 

It is also always possible to offer subsidies/financial assistance for anyone who might find the cost prohibitive, although it seems important to make it very easy and non-awkward for them to flag if they need assistance (i.e. in the signup form, explicitly say that people are in very different financial situations and you expect some people to need this).

Yes good point! We haven't done this yet (apart from the fact we expect participants to cover their own travel costs (easily done in NL with good public transport)) BUT for our 'EA professionals' retreat last autumn we did ask people to give an indication of how much they'd be willing to pay. I've included the results below.



And to give an idea of the respondents, here's some more data:



Thanks for sharing this, James!

Appendix: EAGxVirtual is unusually cost-effective

 

Quick thought that I expect if you accounted for non-financial costs, especially the time spent by attendees that would otherwise have been spent on other impact-focused activities, then the cost-effectiveness would go down substantially.

A weekend at a virtual conference vs an in-person conference probably takes like 50% as much time per attendee? If that's right, by a measure of cost-effectiveness that's more like "connections made per hour of work lost", EAGxVirtual and EAGx would be roughly equally cost-effective?

I'm not quite sure I understand. EAGxVirtual is unusually cost-effective because:

  • Organising costs are >>2x lower (no catering, no venue, no AV etc.)
  • The time for attendees is considerably lower (~2x lower seems right, maybe more)
  • But the impact seems to be ~2x lower.

It seems like you're missing the organising costs in your last two questions? Or perhaps we disagree about the difference in the value of organising costs and attendee time? 

Ah yeah I think I wasn't counting organising costs. 

I meant that if you measure cost-effectiveness in terms of impact per $, then EAGx looks way better, but if you measure cost-effectiveness in terms of impact per hour of (attendee) time, then EAGx looks similar. So there's a 'regression to the mean' type effect when you consider additional metrics. 

But you're right I wasn't considering organiser time. Apologies for the "quick thought" comment ending up being confusing rather than helpful.

No worries at all! I think always good to poke at this stuff, and I agree that per attendee hour, EAGxVirtual is less cost-effective than per $ spent.

Thank you for writing this sequence; it provides some nice analysis and transparency into CEA’s thinking.

I like your attempt to measure “valuable outcomes” in addition to “connections” based metrics, especially since your other post suggests that “learning” creates about as much value as “connections”. I’d be curious in seeing an equal-weighted “valuable outcomes” measure (i.e. every outcome that passes a bar gets one point vs. different scores for different outcomes) and whether that changes any results. I think it’s reasonable to believe that the value of different outcomes follows a power law distribution; I just think it’s difficult to score those outcomes properly on an ex-ante basis. 

I do wish you hadn’t relegated the discussion of EAGxVirtual to an appendix. I think the finding that “EAGxVirtual is unusually cost-effective” belongs in the “bottom line up front” section, and could conceivably be the most important finding of this analysis. Virtual events aren’t great for some things (e.g. building career or social connections) but they also have a lot of advantages. In addition to the 10x(!) difference in cost-effectiveness you mentioned, virtual events:

  • Reduce the disappointment people feel when they are rejected from EAG/EAGx events
  • Are accessible to people in any location, helping decentralize the community and lessen the importance of major hubs
  • Are more accessible to parents, people working very busy jobs or multiple jobs, and others that might find it hard to attend an in-person event
  • Can likely attract better speakers from more diverse locations due to easier logistics
  • Probably have more efficiencies if you run them year after year than physical events (where you’re likely dealing with different venues/vendors each time)

Thanks, Anon!

I’d be curious in seeing an equal-weighted “valuable outcomes” measure (i.e. every outcome that passes a bar gets one point vs. different scores for different outcomes) and whether that changes any results

Sounds interesting, but what's the case for doing this? To see if some events score better at producing outcomes in general? I agree outcomes are difficult to score, but I feel fairly confident that some are better than others, so some adjustment seems more appropriate to me.

I think the finding that “EAGxVirtual is unusually cost-effective” belongs in the “bottom line up front” section, and could conceivably be the most important finding of this analysis.

Thanks for this feedback. In fact, this was the case in an earlier draft, but I wanted to make the BLUF a clear outline of my core claim. Perhaps that was a mistake!

I agree with your points, thanks for adding to the case here.

Same goal as your analysis, really: to find the most cost-effective event models for producing valuable outcomes. With a very fat-tailed scoring rubric, I’m concerned that legitimate differences between the event types might be overshadowed by the particulars of the rubric. As some of the other comments indicate, it’s not obvious how to value different outcomes on a relative basis.

Even if you don’t want to use an equal-weighted scoring system, you could see if the results change materially if you use a much less fat-tailed rubric (e.g. have scores ranging from 1-5 vs. 1-50). You can think of that as a type of sensitivity analysis to see how dependent your findings are on the specifics of the scoring system.

Even if you don’t want to use an equal-weighted scoring system, you could see if the results change materially if you use a much less fat-tailed rubric (e.g. have scores ranging from 1-5 vs. 1-50).

Thanks, I like this idea, and it would be easy to implement! There were only a handful of cases with scores above 5 across the dataset and I think they weren't bunched anywhere. I'll report back by the end of next week (probably just in this thread), please DM me if I drop this.

That's great, thanks Ollie!

Hey Anon,

I ran this analysis, beheading (?) the data by taking every response I had scored over 5 and changing its score to 5 (18 responses).

I realised that all of these responses were from EAGx attendees - that's not that surprising since 90% of survey responses were from EAGx attendees but that highlights a limitation of my analysis. Plausibly, if I had data from CEP retreats, we might find more high-scoring impact stories.

Previously:

This means the cost per counterfactual impact point at EAGx events is ~$1,219, while the cost per counterfactual impact point at CEP retreats is ~$1,711

Post-beheading:

The cost per counterfactual impact point at EAGx events is ~$1,639, while the cost per counterfactual impact point at CEP retreats is ~$1,724 (the small change from ~$1,711 is due to GBP/USD fluctuation).

So, EAGx still performs slightly better but it's much closer, probably not a significant difference.

Thanks for proposing this - it does suggest the results are sensitive to my scoring system. I'm not sure where this leaves me; that isn't ideal and I'd like something more robust but, on the other hand, I think these high-scoring results (people securing jobs, teams being formed) are exactly the kind of things we want to happen at our events so I think it's reasonable to put significant weight on them. 
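For readers who want to try the capping step on their own survey data, here is a minimal sketch in Python. It assumes each response has a single numeric score and that cost per impact point is simply total cost divided by total points. The function names and the toy numbers are hypothetical, and the sketch leaves out any counterfactual adjustment or currency conversion used in the actual analysis.

```python
def cap_scores(scores, cap):
    """Replace every score above `cap` with `cap` (the "beheading" step)."""
    return [min(score, cap) for score in scores]


def cost_per_point(total_cost, scores):
    """Total event cost divided by the total number of impact points."""
    total_points = sum(scores)
    if total_points == 0:
        raise ValueError("no impact points recorded")
    return total_cost / total_points


# Hypothetical toy data for illustration only; NOT the survey data.
toy_scores = [1, 1, 5, 10, 50]
toy_cost = 6700.0

uncapped = cost_per_point(toy_cost, toy_scores)
capped = cost_per_point(toy_cost, cap_scores(toy_scores, 5))

print(f"uncapped: {uncapped:.2f} per point")
print(f"capped at 5: {capped:.2f} per point")
```

Because capping can only lower the total number of points, the cost per point can only rise or stay the same, which matches the direction of the change reported above.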

Thanks for running this analysis Ollie! Interesting findings!

it does suggest the results are sensitive to my scoring system. I'm not sure where this leaves me; that isn't ideal and I'd like something more robust but, on the other hand, I think these high-scoring results (people securing jobs, teams being formed) are exactly the kind of things we want to happen at our events so I think it's reasonable to put significant weight on them. 

Agree that this exercise doesn’t yield an obvious conclusion. Given that you’ve found the results to be sensitive to the scoring system, I suggest trying to figure out how sensitive. You’ve crunched the numbers using max scores of 50 and 5; I imagine it’d be quick to do the same with max scores of 20, 10, and 1 (the other scores you used in your original scoring system).

The other methodology I’d suggest looking at would be to keep the same relative rankings you used originally, but just condense the range of scores (to say 1,2,3,4,5 vs. 1,5,10,20,50). That would capture the fact that you think starting an EA project is more valuable than meeting a collaborator (which is lost by capping the scores at 5), but would assess it as 2.5x more valuable vs. 10x. (Btw, I think the technical term for “beheading” the data is “Winsorizing” though that’s usually done using percentiles of the data set, which is another way you could do a sensitivity analysis). 
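A rough sketch of both variants in Python, assuming the original rubric uses only the five levels mentioned here (1, 5, 10, 20, 50). The toy scores are hypothetical and are not taken from the survey.

```python
# Score levels as described in this thread (an assumption about the rubric).
ORIGINAL_LEVELS = [1, 5, 10, 20, 50]

# Keep the ranking but condense the range: 1->1, 5->2, 10->3, 20->4, 50->5.
CONDENSED = {level: rank for rank, level in enumerate(ORIGINAL_LEVELS, start=1)}


def condense(scores):
    """Map each original score to its rank in the rubric."""
    return [CONDENSED[score] for score in scores]


def cap(scores, max_score):
    """Cap every score at `max_score`."""
    return [min(score, max_score) for score in scores]


# Hypothetical toy data for illustration only.
toy_scores = [1, 1, 5, 10, 50]

# Sweep over each possible maximum score to see how totals shift.
for max_score in (50, 20, 10, 5, 1):
    print(f"cap={max_score:>2}: total points = {sum(cap(toy_scores, max_score))}")

print(f"condensed: total points = {sum(condense(toy_scores))}")

# Relative value of the top outcome vs. a 5-point outcome under each rubric.
print("original ratio:", 50 / 5)
print("condensed ratio:", CONDENSED[50] / CONDENSED[5])
```

Running the same cost-per-point comparison under each of these rubrics would show how much the EAGx vs. retreat gap depends on the tail of the scoring system.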

This sort of more comprehensive sensitivity analysis would shed some light on whether your observation about EAGxAustralia is supported by the broader data set: 

For EAGxAustralia, three outcomes really stood out and the rest were good but unremarkable, according to the attendees. It seems likely to me that a good chunk of the value of the event was accrued by just a few people who had more life-changing things happen to them (e.g. getting a grant or job). 

If that looks to be a robust finding, that has pretty big implications for how events should be run. FWIW I’d consider that a more important finding than EAGx events looking more cost-effective than CEP events, and would suggest editing the bottom line upfront section to note that. 

Longer term, I’d look to refine the metrics you use for events and how you collect the data. I love that you’ve started looking beyond “number of connections” to “valuable outcomes”; this definitely seems like a move in the right direction. However, it’s also not feasible for you to score responses from attendees at scale going forward. So I’d suggest asking respondents to score the event themselves, while providing guidance on how different experiences should be scored (e.g. starting a new project = X) to promote consistency across respondents.

My hunch is that it’d be good to have people score the event along the different dimensions (connections, learning, motivation/positivity, action, other) you listed in the “How do attendees get value from EA community-building events?” post. That might make the survey too onerous, but if you could collect that data you’d have a lot of granularity about which events accrued which type of value, and it's probably easier to do relative scoring within categories rather than across them. You'd still be able to create a single score based on a weighted average of the different dimensions (where you’d presumably give connections and learning the most weight, since that’s where people seem to get the most value).
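To make the weighted-average idea concrete, here is a minimal Python sketch. The dimension names follow the list above; the weights and the example ratings are purely hypothetical placeholders, chosen only so that connections and learning get the most weight.

```python
# Hypothetical weights (not from the post); connections and learning weigh most.
WEIGHTS = {
    "connections": 30,
    "learning": 30,
    "motivation": 15,
    "action": 15,
    "other": 10,
}


def composite_score(ratings, weights=WEIGHTS):
    """Weighted average of per-dimension ratings for one respondent."""
    missing = set(weights) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[dim] * ratings[dim] for dim in weights)
    return weighted_sum / total_weight


# Hypothetical respondent rating each dimension on a 1-5 scale.
example = {"connections": 4, "learning": 3, "motivation": 5, "action": 2, "other": 1}
print(composite_score(example))
```

Keeping the per-dimension ratings alongside the composite would preserve the granularity about which events produce which type of value.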

Thanks for writing this! A couple of random thoughts

  1. I'm also surprised by the cost of these CEP retreats ($1,500+ per attendee). Assuming the organizer's salary is already provided for, I expected the cost for the average attendee to be closer to $300-$750. 
  2. Also, I respect the establishment of a scoring system, but the weightings seem problematic. For instance, "someone reports starting a project in an EA-aligned cause area" receives a score of 50, and "someone meets someone who inspires them to take an opportunity or cause area more seriously" receives a score of 1. 

    That's not intuitive to me. I would much prefer 50 people attending a retreat/EAGx and reporting they felt inspired to take a cause area or opportunity more seriously over 1 person reporting they started a new project. "Projects" are just too vague, but maybe you have something more specific in mind? 
  3. Scoring systems like this will affect how community builders design events. E.g. say I'm an events organizer and I want funding from the CEA events team. I know the CEA events team prefers projects over updated cause prioritization at 50:1. Then I'm going to shape my event in a way that makes starting new projects an especially large (plausibly the largest) focus of my event. Is that your intent? Are there already guidelines on how community builders should think about this?

On (2), my guess is that the 1-point category covers a broad range of events, some of which are worth significantly more than 2% of starting the median project, and some of which are very unlikely to lead to any real-world impact at all. 

On (3), I'd note that it may be inadvisable to evaluate community builders on a 50:1 ratio even assuming that is the funders' true preference. If the super-high scoring item is uncommon enough, its existence during a period being evaluated may be a fairly weak predictor of whether it would be present in a future period. For instance, I care a whole lot more about airplane crashes than near-misses, but the number of near-misses in the past few years might be a stronger predictor of future crashes than the number of crashes in the past few years.

Thanks JD!

Assuming the organizer's salary is already provided for

That isn't always the case. For some, the costs included organizer salary which is a significant expense. But yes, I agree the retreats I looked at weren't super cost-efficient (not that they necessarily should've been) and that retreat organizers can aim for lower.

"Projects" are just too vague, but maybe you have something more specific in mind? 

Yes, I do mean something like "a project which generated full-time work for at least two people for several months", not just "updating a website". Note that I rarely used this scoring so I don't think the definition here will swing the results much.

I'm going to shape my event in a way that makes starting new projects an especially large (plausibly the largest) focus of my event. Is that your intent?

Tentatively yes; I think new community members should be encouraged to take actions and try things out, though thinking about cause prioritization and engaging with the ideas thoroughly is probably a first step they need to take anyway so I'm not sure that should change how you think about presenting ideas.


Note that, unfortunately, the CEA events team won't be making grants any more (see later post in the sequence).

Thanks for publishing this Ollie - really interesting, and definitely great 'truth-seeking'.

I think I'd highlight some of your caveats (although this could be a case of me prioritising intuition over data and being misled by that).

My experience as a full-time ED within EA is that retreats are substantially more valuable than EAGs/EAGxs/EAGxVirtuals. For example, I would skip every conference to go to the Effective Giving Summit; and I felt that this summit was roughly 10x more valuable than what I would otherwise have spent the time on. I expect this relates to my specific circumstances, where making connections within EA is relatively easy outside a conference, but deepening connections is relatively hard. (Also that the Summit was especially excellent.)

I also agree with you that cost-effectiveness might not be the best way to assess retreats, as demonstrated by the apparent cost-effectiveness of EAGxVirtuals, because if you make the costs small enough, the benefit can be very low and still seem very cost-effective. OFTW was able to host a conference online for $200 for 75 people, and I don't think an in-person conference could ever compete with that on a cost-effectiveness basis because it would be at least two orders of magnitude more expensive; but I don't really know anyone who thinks the online format was any good!

All-in-all, though, great work on this retrospective.

I think your online-conference paragraph points, among other things, to the cost of attendee time as an important factor to weigh in many cases. It's plausible to me that the online vs in person decision would come down to a time-money tradeoff.

Thanks, Jack.

Yes, I wouldn't be surprised if some retreats are more valuable for people who are already engaged, perhaps because the admissions process is more selective. But I would say you also aren't the main target for community-building events; the difference seems small for those newer to the field.

Yes, good point

Interesting results, thanks for sharing! I think getting data from people who attend events is an important source of information about what's working and what's not.

I do worry a bit about what's best for the world coming apart from what people report as being valuable to them. (This comment ended up a bit rambly, sorry.)

Two main reasons that might be the case:

  1. If the event causes someone's goals or motivations to change in a way that's great for the world, my guess is that doesn't feel valuable to the person compared to helping the person get or do things they already want.
    • Eg if someone isn't that on board with the EA project, then them getting more enthusiastic about making it an important priority in their life could be very good for the world, but feel like only a small personal benefit from the event (maybe it feels like "I had a good time and felt more excited" - and it's very hard to tell from a report like that whether the person is now much more likely to go on to do the EA project well, or whether it's not really going to affect their actions).
    • Eg if someone is attached to a cause area or job type that they have existing connections with, but the event nudges them to seriously consider other possibilities that are in fact more valuable, then this could be a great outcome that again doesn't feel that valuable to the individual.
  2. If the people attending the event don't have that good an understanding of what's in fact counterfactually valuable for achieving their goals.
    • Eg they might report that the event caused something to happen, but that thing would likely have happened anyway. Eg learning about something, getting funding, getting a job.
    • Eg they might just overrate or underrate the importance of specific outcomes (specific relationships, changes to motivation, specific ideas).

I think these reasons are actually important enough for people professionally involved in community building to try to beat the baseline of "let's do things that people report as valuable" by trying to build a detailed understanding of the mechanisms that cause someone to go on to do things that are valuable for the world (weighted by their value). How exactly do their different motivations, beliefs and experiences fit together? Is there a typical "journey", or maybe several different journeys? Are there certain things that are necessary in order for people to go on to do great work, and in particular are there things that individuals are likely to underrate?

Of course this happens some amount, but I'd be keen to see more discussion of this in general among people doing EA meta work.

(Something like the 2020 OP longtermist survey but much more focused on understanding the mechanisms that caused the good work, rather than the categories of thing that the people interacted with. Rather than a survey, maybe more like in-depth user interviews. I think 80k may have done a bit of this, I'm not sure.)

Thanks Isaac, I agree relying on self-report is a key limitation here. In fact, when reviewing people's stories I would often wish they expanded on something that seemed small to them but important to us (e.g. they'd write "became more interested in pursuing X path" as part of a list, but that stood out to me as something exciting from an impact perspective).

I didn't mention this in the report, but I also do user interviews fairly regularly to get some more colour on things like this and followed up with several people whose stories seemed impactful. I wouldn't say the events team are following the baseline of "let's do things that people report as valuable"; we just use that as one guiding light (albeit a significant one). I agree forming a clearer framework of how people arrive at impactful work would be exciting.

A quick note to say that I'm taking some time off after publishing these posts. I'll aim to reply to any comments from 17 July.