Summary
Following up on the challenge to quantify the impact of 80,000 Hours' top career paths introduced by Nuño Sempere, I have estimated the cost-effectiveness of operations management in high-impact organisations (OM), which arguably include 80,000 Hours’ top-recommended organisations.
The mean cost-effectiveness of various metrics, in bp/G$ of existential risk reduction (1 bp = 0.01 %, and 1 G$ = 1 billion dollars), is summarised in the table below for my preferred method. I present all results with 3 digits, but I think their resilience is such that they only represent order of magnitude estimates (i.e. they may well be wrong by a factor of 10^0.5 ≈ 3).
The full results are in this Sheet, and the calculations in this Colab[1].
Mean cost-effectiveness (bp/G$) of… | Method 3 with truncation |
---|---|
Global health and development | 0.431 |
Longtermism and catastrophic risk prevention | 3.95 |
Animal welfare | 1.62 |
Effective altruism infrastructure | 3.20 |
The effective altruism community | 1.55 |
Operations management in high-impact organisations | 7.01 |
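For readers less used to the unit, the conversion from bp/G$ to existential risk reduction per dollar can be sketched as below, taking 1 bp = 0.01 % and 1 G$ = 10^9 dollars. The function name is illustrative and is not taken from the Colab.

```python
BASIS_POINT = 1e-4  # 1 bp = 0.01 % = 0.0001
GIGA = 1e9          # 1 G$ = 1 billion dollars


def bp_per_gigadollar_to_per_dollar(value_bp_per_gdollar: float) -> float:
    """Convert bp/G$ into absolute existential risk reduction per dollar."""
    return value_bp_per_gdollar * BASIS_POINT / GIGA


# Mean cost-effectiveness of OM for method 3 with truncation, from the table above.
om_per_dollar = bp_per_gigadollar_to_per_dollar(7.01)
print(f"{om_per_dollar:.3g} of existential risk reduction per dollar")
```

In other words, 7.01 bp/G$ corresponds to an absolute reduction in existential risk of 0.0701 percentage points per billion dollars.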
Acknowledgements
Thanks to Abraham Rowe, Dan Hendrycks, Luke Freeman, Matt Lerner, Nuño Sempere, Sawyer Bernath, Stien van der Ploeg, and Tamay Besiroglu.
Methods
I estimated the cost-effectiveness[2] of OM as the product of:
- The cost-effectiveness of the high-impact organisations, which I assumed equal to that of the effective altruism community.
- The multiplier of OM, which I defined as the ratio between the cost-effectiveness of OM and the high-impact organisations.
This method assumes the cost-effectiveness distribution of the high-impact organisations is represented by the one theorised for the effective altruism community in the next section. Moreover, the cost-effectiveness estimates are only accurate to the extent that future opportunities are as valuable as recent ones.
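As a rough check of this product, the sketch below relies on the mean of a product of independent variables being the product of their means. Independence between the cost-effectiveness of the community and the multiplier is my assumption here; the inputs are the truncated means reported in the Results section.

```python
# Truncated mean cost-effectiveness of the effective altruism community (bp/G$)
# for methods 1, 2 and 3, and truncated mean multiplier of OM (from Results).
community_mean = {"method 1": 1.31, "method 2": 1.79, "method 3": 1.55}
multiplier_mean = 4.53

# Assumption: the two factors are independent, so E[X * Y] = E[X] * E[Y].
om_mean = {
    method: value * multiplier_mean for method, value in community_mean.items()
}

for method, value in om_mean.items():
    print(f"{method}: {value:.2f} bp/G$")
```

These products land within 1 % of the 5.92, 8.10 and 7.01 bp/G$ reported for OM in the truncation case.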
The calculations are in this Colab[1].
Cost-effectiveness of the effective altruism community
I calculated the cost-effectiveness of the effective altruism community from the mean cost-effectiveness weighted by cumulative spending between 1 January 2020 and 15 August 2022 of 4 cause areas:
- Global health and development.
- Longtermism and catastrophic risk prevention.
- Animal welfare.
- Effective altruism infrastructure.
These are the areas for which Tyler Maule collected data here[3] (see EA Forum post here). I adjusted the 2020 and 2021 values for inflation using the calculator from in2013dollars.
I computed the cost-effectiveness of each area using 3 methods. All rely on distributions which are either truncated to the 99 % confidence interval[4] (CI) or not truncated, in order to understand the effect of outliers. The parameters of the pre-truncation distributions, which are the final distributions for the non-truncation cases, are provided below.
Method 1
I defined the cost-effectiveness of longtermism and catastrophic risk prevention as a truncated lognormal distribution with pre-truncation 5th and 95th percentiles equal to 1 and 10 bp/G$ in terms of existential risk reduction. These are the lower and upper bounds proposed here by Linchuan Zhang.
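A minimal sketch of how such a distribution can be parameterised from 2 percentiles, and of how truncation changes its mean, is given below. It uses only the Python standard library, and assumes the 99 % CI runs from the 0.5th to the 99.5th percentile; the function names are mine, not the ones in the Colab.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def lognormal_from_percentiles(x_low, x_high, q_low=0.05, q_high=0.95):
    """Return (mu, sigma) of the lognormal with quantile x_low at q_low and x_high at q_high."""
    z_low, z_high = STD_NORMAL.inv_cdf(q_low), STD_NORMAL.inv_cdf(q_high)
    sigma = (math.log(x_high) - math.log(x_low)) / (z_high - z_low)
    mu = math.log(x_low) - sigma * z_low
    return mu, sigma


def lognormal_mean(mu, sigma):
    """Mean of the untruncated lognormal."""
    return math.exp(mu + sigma**2 / 2)


def truncated_lognormal_mean(mu, sigma, coverage=0.99):
    """Mean after truncating to the central `coverage` interval (assumed symmetric in probability)."""
    z = STD_NORMAL.inv_cdf(0.5 + coverage / 2)
    # E[X; a < X < b] = exp(mu + sigma^2 / 2) * (Phi(z_b - sigma) - Phi(z_a - sigma))
    mass = STD_NORMAL.cdf(z - sigma) - STD_NORMAL.cdf(-z - sigma)
    return lognormal_mean(mu, sigma) * mass / coverage


mu, sigma = lognormal_from_percentiles(1, 10)
mean_without_truncation = lognormal_mean(mu, sigma)
mean_with_truncation = truncated_lognormal_mean(mu, sigma)
print(f"{mean_without_truncation:.3g} bp/G$ without truncation")
print(f"{mean_with_truncation:.3g} bp/G$ with truncation")
```

The two means can be compared with the 4.04 and 3.95 bp/G$ reported for longtermism and catastrophic risk prevention in the Results section.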
I assumed the ratio between the cost-effectiveness of i) longtermism and catastrophic risk prevention and ii) global health and development to be a truncated lognormal distribution with pre-truncation 5th and 95th percentiles equal to 10 and 100. These are the lower and upper bounds guessed here by Benjamin Todd for the ratio between the cost-effectiveness of the Long-Term Future Fund (LTFF) and Global Health and Development Fund (search for “10-100x more cost-effective”).
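Without truncation, the implied cost-effectiveness of global health and development has a closed form, since the quotient of 2 independent lognormals is itself lognormal. The sketch below assumes independence between the cost-effectiveness of longtermism and the ratio, which is my reading rather than something stated above.

```python
import math
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.95)


def lognormal_params(p5, p95):
    """(mu, sigma) of the lognormal with the given 5th and 95th percentiles."""
    mu = (math.log(p5) + math.log(p95)) / 2
    sigma = (math.log(p95) - math.log(p5)) / (2 * Z95)
    return mu, sigma


mu_lt, sigma_lt = lognormal_params(1, 10)          # longtermism, in bp/G$
mu_ratio, sigma_ratio = lognormal_params(10, 100)  # longtermism / global health

# Assumption: independence, so log(GHD) = log(LT) - log(ratio) is normal.
mu_ghd = mu_lt - mu_ratio
sigma_ghd = math.hypot(sigma_lt, sigma_ratio)

ghd_mean = math.exp(mu_ghd + sigma_ghd**2 / 2)
ghd_p5 = math.exp(mu_ghd - Z95 * sigma_ghd)
ghd_p95 = math.exp(mu_ghd + Z95 * sigma_ghd)
print(f"mean {ghd_mean:.3g}, 5th {ghd_p5:.3g}, 95th {ghd_p95:.3g} bp/G$")
```

These values can be compared with the method 1 results without truncation for global health and development (0.163, 0.0196 and 0.509 bp/G$).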
I considered the ratio between the cost-effectiveness of i) animal welfare and ii) global health and development to be a truncated lognormal distribution with pre-truncation 5th and 95th percentiles equal to 270 μ and 211. I computed these by multiplying:
- The 5th and 95th percentiles of 0.0436 and 34.1 k I estimated here for the ratio between the cost-effectiveness of corporate campaigns for chicken welfare and GiveWell’s Maximum Impact Fund[5], which is now designated Top Charities Fund.
- A factor of 1.07 m to adjust the moral weight downwards, which I calculated from the reciprocal of the product between:
- The mean moral weight of chickens relative to humans of 2.41 I obtained in the analysis mentioned just above.
- The ratio of 389 between the number of neurons of humans and red junglefowls (similar to chickens), which I took from Wikipedia.
This adjustment is analogous to assuming the mean moral weight is directly proportional to the number of neurons.
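The adjustment factor can be reproduced as follows. The variable names are illustrative, and the sketch only covers the factor itself, not the percentiles it is applied to.

```python
mean_moral_weight_chicken = 2.41  # mean moral weight of chickens relative to humans
neuron_ratio = 389                # human neurons / red junglefowl neurons

# Reciprocal of the product between the 2 values above.
adjustment = 1 / (mean_moral_weight_chicken * neuron_ratio)
print(f"adjustment factor: {adjustment:.3g}")

# Mean moral weight implied after the adjustment, which equals 1 / neuron_ratio.
adjusted_moral_weight = mean_moral_weight_chicken * adjustment
print(f"adjusted mean moral weight: {adjusted_moral_weight:.3g}")
```

The factor comes out at about 1.07 m, and the adjusted mean moral weight equals the reciprocal of the neuron ratio, which is what proportionality to the number of neurons implies.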
I set the cost-effectiveness of effective altruism infrastructure to the mean cost-effectiveness weighted by cumulative spending between 1 January 2020 and 15 August 2022 of the other 3 areas.
Method 2
I obtained the cost-effectiveness of each area based on the 27 answers regarding the mean cost-effectiveness of the Effective Altruism Funds given in the EA talent needs survey - 2018. Such answers are in the table below, whose last column was calculated by me.
Mean cost-effectiveness relative to the Effective Altruism Infrastructure Fund (%)
Fund | 10th percentile | Median | 90th percentile | Geometric mean between the 10th and 90th percentiles |
---|---|---|---|---|
Global Health and Development Fund | 1 | 5 | 63 | 7.94 |
Long-Term Future Fund | 16 | 167 | 283 | 67.3 |
Animal Welfare Fund | 3 | 10 | 107 | 17.9 |
Effective Altruism Infrastructure Fund | 100 | 100 | 100 | 100 |
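The last column of the table can be recomputed with a few lines, as the geometric mean of 2 values is the square root of their product. The dictionary below simply restates the 10th and 90th percentiles from the table.

```python
import math

# (10th percentile, 90th percentile) in % of the Effective Altruism Infrastructure Fund.
survey_percentiles = {
    "Global Health and Development Fund": (1, 63),
    "Long-Term Future Fund": (16, 283),
    "Animal Welfare Fund": (3, 107),
    "Effective Altruism Infrastructure Fund": (100, 100),
}

geometric_means = {
    fund: math.sqrt(p10 * p90) for fund, (p10, p90) in survey_percentiles.items()
}

for fund, value in geometric_means.items():
    print(f"{fund}: {value:.3g} %")
```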
I defined the cost-effectiveness of longtermism and catastrophic risk prevention as in method 1.
For the other areas, I assumed a truncated lognormal distribution whose pre-truncation 10th and 90th percentiles equal those of longtermism and catastrophic risk prevention multiplied by the ratios between the corresponding 10th and 90th percentiles in the table above:
- For global health and development, 6.25 % (= 1/16) and 22.3 % (= 63/283).
- For animal welfare, 18.8 % (= 3/16) and 37.8 % (= 107/283).
- For effective altruism infrastructure, 625 % (= 100/16) and 35.3 % (= 100/283).
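The sketch below illustrates this for global health and development without truncation. It reads the percentages above as multipliers applied to the 10th and 90th percentiles of the longtermism distribution, which is my interpretation of the procedure.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()
Z90, Z95 = STD_NORMAL.inv_cdf(0.90), STD_NORMAL.inv_cdf(0.95)

# Longtermism: lognormal with 5th and 95th percentiles of 1 and 10 bp/G$ (as in method 1).
mu_lt = (math.log(1) + math.log(10)) / 2
sigma_lt = (math.log(10) - math.log(1)) / (2 * Z95)
lt_p10 = math.exp(mu_lt - Z90 * sigma_lt)
lt_p90 = math.exp(mu_lt + Z90 * sigma_lt)

# Global health and development: scale those percentiles by 1/16 and 63/283.
ghd_p10 = (1 / 16) * lt_p10
ghd_p90 = (63 / 283) * lt_p90
mu_ghd = (math.log(ghd_p10) + math.log(ghd_p90)) / 2
sigma_ghd = (math.log(ghd_p90) - math.log(ghd_p10)) / (2 * Z90)

ghd_mean = math.exp(mu_ghd + sigma_ghd**2 / 2)
ghd_p5 = math.exp(mu_ghd - Z95 * sigma_ghd)
ghd_p95 = math.exp(mu_ghd + Z95 * sigma_ghd)
print(f"mean {ghd_mean:.3g}, 5th {ghd_p5:.3g}, 95th {ghd_p95:.3g} bp/G$")
```

Under this reading, the outputs are close to the method 2 results without truncation for global health and development (0.762, 0.0522 and 2.66 bp/G$).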
Method 3
I defined the cost-effectiveness of each area from the mean between those of methods 1 and 2. My best guesses regard the truncation case of this method.
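Since methods 1 and 2 get the same weight, the means of method 3 should be the simple averages of the means of the other 2 methods. The sketch below checks this with the truncated means from the Results section; the shortened labels are mine.

```python
# Truncated means (bp/G$) from the Results tables for methods 1 and 2.
method_1 = {
    "Global health and development": 0.156,
    "Longtermism and catastrophic risk prevention": 3.95,
    "Animal welfare": 1.95,
    "Effective altruism infrastructure": 1.31,
    "The effective altruism community": 1.31,
    "OM": 5.92,
}
method_2 = {
    "Global health and development": 0.705,
    "Longtermism and catastrophic risk prevention": 3.95,
    "Animal welfare": 1.29,
    "Effective altruism infrastructure": 5.10,
    "The effective altruism community": 1.79,
    "OM": 8.10,
}

# Equal-weight average of the 2 methods.
method_3 = {area: (method_1[area] + method_2[area]) / 2 for area in method_1}

for area, value in method_3.items():
    print(f"{area}: {value:.3g} bp/G$")
```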
Multiplier of operations management in high-impact organisations
I defined the multiplier of OM as the median of the 11 distributions described in the table below, and also experimented with truncating to the 99 % CI the component distributions of each of them. I used the median with the intention of following Jaime Sevilla’s best guess on how to aggregate forecasts:
- The assumptions going into each of the estimates are apparently not mutually exclusive.
- There are outliers:
- The interquartile range of the mean multipliers is 23.0 (= 30.0 - 6.99), which is the difference between the mean multiplier of the 9th and 5th estimates.
- Consequently, the 4th and 8th estimates are outside Tukey’s fences, as their mean multipliers are higher than the 3rd quartile by more than 1.5 times the interquartile range (1.36 k > 136 > 64.5 = 30.0 + 1.5*23.0).
- However, these are not necessarily “poorly calibrated”[6], so I opted not to exclude them.
- According to Jaime’s flowchart, the 3 choices above justify using the median[7].
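The outlier check above can be written out as below. The quartiles are taken as quoted in the text rather than recomputed, and the labels of the flagged estimates follow the numbering used above.

```python
# 1st and 3rd quartiles of the mean multipliers, as quoted in the text.
q1, q3 = 6.99, 30.0
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Mean multipliers of the 2 estimates flagged in the text.
flagged_means = {"4th estimate": 1.36e3, "8th estimate": 136.0}
outside_fences = {
    name: not (lower_fence <= value <= upper_fence)
    for name, value in flagged_means.items()
}

print(f"IQR = {iqr:.1f}, upper fence = {upper_fence:.1f}")
print(outside_fences)
```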
I obtained the distributions by asking i) 75 people working at 80,000 Hours’ top-recommended organisations (on October 30 and 31), and ii) 259 people in the Slack “EA Forecasting & Epistemics” (on November 2) for the multiplier of OM of their organisations and the effective altruism community. You can see here the list of emails I contacted, and the messages regarding i) and ii) (see “Emails” and “Slack message”, respectively).
I should emphasise the multiplier of OM may depend a lot on the organisation (e.g. its size, maturity, cause area, and what it understands as operations[8]), specific position (e.g. seniority), and personal fit[9]. Consequently, aggregating all estimates as I did has serious limitations.
Multiplier of OM for own organisations (N = 7)
Distribution (without truncation) | Mean (5th to 95th percentile) |
---|---|
Product between[10]: | 9.19 (0.274 to 35.1) |
Lognormal with 5th and 95th percentiles 0.75 and 3.5. | 1.81 (0.750 to 3.50) |
Product between[12]: | 29.1 (0.672 to 112) |
Lognormal with 25th and 75th percentiles 100 and 1 k. | 1.36 k (19.1 to 5.25 k) |
Normal[11] with mean and standard deviation 7 and 83. | 6.99 (-129 to 144) |
Product between[13]: | 0.420 (0.339 to 0.512) |
Product between[14]: | 2.69 (0.609 to 6.89) |
Multiplier of OM for the effective altruism community (N = 4)
Distribution (without truncation) | Mean (5th to 95th percentile) |
---|---|
Lognormal with 5th and 95th percentiles 0.75 and 3.5. | 1.81 (0.750 to 3.50) |
Product between[12]: | 22.5 (-0.354 m to 88.8) |
Lognormal with 25th and 75th percentiles 10 and 100. | 136 (1.90 to 524) |
Normal[11] with mean and standard deviation 30 and 161. | 30.0 (-235 to 295) |
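To illustrate why the median is more resistant to outliers than the mean, here is a small Monte Carlo sketch. It pools only the 3 fully specified distributions for the effective altruism community (the components of the product are not listed here), and it assumes the median is taken sample by sample, which may differ from what the Colab does. The sample size and seed are arbitrary.

```python
import math
import random
import statistics
from statistics import NormalDist


def lognormal_params(x_low, x_high, q_low, q_high):
    """(mu, sigma) of the lognormal with quantile x_low at q_low and x_high at q_high."""
    z_low, z_high = NormalDist().inv_cdf(q_low), NormalDist().inv_cdf(q_high)
    sigma = (math.log(x_high) - math.log(x_low)) / (z_high - z_low)
    return math.log(x_low) - sigma * z_low, sigma


rng = random.Random(0)  # fixed seed, so that the sketch is reproducible
params_a = lognormal_params(0.75, 3.5, 0.05, 0.95)
params_b = lognormal_params(10, 100, 0.25, 0.75)

n_samples = 20_000
median_pool, mean_pool = [], []
for _ in range(n_samples):
    draws = [
        rng.lognormvariate(*params_a),
        rng.lognormvariate(*params_b),
        rng.gauss(30, 161),
    ]
    # Assumption: estimates are aggregated sample by sample.
    median_pool.append(statistics.median(draws))
    mean_pool.append(statistics.fmean(draws))

print(f"mean of the sample-wise median: {statistics.fmean(median_pool):.3g}")
print(f"mean of the sample-wise mean: {statistics.fmean(mean_pool):.3g}")
```

The heavy right tail of the 2nd lognormal dominates the sample-wise mean, but has much less influence on the sample-wise median.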
The low number of estimates is evidence of:
- The difficulty and lack of quantification of the marginal cost-effectiveness of positions. I guess I would receive more estimates for the multiplier of OM if these could be readily obtained from internal data.
- Me having framed the questions poorly.
In addition, what OM refers to is somewhat unclear. Based on what 80,000 Hours describes here, I think it can refer either to operations more broadly, or to senior operations positions which are further down the career path.
I also thought about estimating the multiplier based on the number of vacancies and candidates for operations management and all positions, but decided not to, given their unclear relationship with value. As vacancies decrease and candidates increase for a given position, the difference between the factual and counterfactual decreases, but the value of the factual increases.
Results
The tables below have the results for the mean, 5th percentile, and 95th percentile of the multiplier of OM and cost-effectiveness metrics. This Sheet contains more results (see tab “TOC”).
Multiplier of operations management
Multiplier of OM… | Mean | 5th percentile | 95th percentile |
---|---|---|---|
Without truncation | 4.55 | 1.30 | 13.2 |
With truncation | 4.53 | 1.31 | 13.0 |
Cost-effectiveness
Method 1
Without truncation
Cost-effectiveness (bp/G$) of… | Mean | 5th percentile | 95th percentile |
---|---|---|---|
Global health and development | 0.163 | 0.0196 | 0.509 |
Longtermism and catastrophic risk prevention | 4.04 | 1.00 | 10.0 |
Animal welfare | 41.1 | 3.85 μ | 4.42 |
Effective altruism infrastructure | 5.61 | 0.283 | 3.41 |
The effective altruism community | 5.61 | 0.283 | 3.41 |
Operations management in high-impact organisations | 22.5 | 0.643 | 20.6 |
With truncation
Cost-effectiveness (bp/G$) of… | Mean | 5th percentile | 95th percentile |
---|---|---|---|
Global health and development | 0.156 | 0.0208 | 0.481 |
Longtermism and catastrophic risk prevention | 3.95 | 1.03 | 9.71 |
Animal welfare | 1.95 | 4.67 μ | 3.64 |
Effective altruism infrastructure | 1.31 | 0.291 | 3.14 |
The effective altruism community | 1.31 | 0.291 | 3.14 |
Operations management in high-impact organisations | 5.92 | 0.666 | 18.7 |
Method 2
Without truncation
Cost-effectiveness (bp/G$) of… | Mean | 5th percentile | 95th percentile |
---|---|---|---|
Global health and development | 0.762 | 0.0522 | 2.66 |
Longtermism and catastrophic risk prevention | 4.04 | 1.00 | 10.0 |
Animal welfare | 1.35 | 0.170 | 4.18 |
Effective altruism infrastructure | 5.14 | 2.35 | 9.39 |
The effective altruism community | 1.85 | 0.766 | 3.80 |
Operations management in high-impact organisations | 8.42 | 1.54 | 26.0 |
With truncation
Cost-effectiveness (bp/G$) of… | Mean | 5th percentile | 95th percentile |
---|---|---|---|
Global health and development | 0.705 | 0.0549 | 2.53 |
Longtermism and catastrophic risk prevention | 3.95 | 1.03 | 9.71 |
Animal welfare | 1.29 | 0.177 | 4.01 |
Effective altruism infrastructure | 5.10 | 2.39 | 9.23 |
The effective altruism community | 1.79 | 0.773 | 3.59 |
Operations management in high-impact organisations | 8.10 | 1.56 | 24.7 |
Method 3
Without truncation
Cost-effectiveness (bp/G$) of… | Mean | 5th percentile | 95th percentile |
---|---|---|---|
Global health and development | 0.463 | 0.0660 | 1.43 |
Longtermism and catastrophic risk prevention | 4.04 | 1.00 | 10.0 |
Animal welfare | 21.2 | 0.101 | 4.11 |
Effective altruism infrastructure | 5.37 | 1.60 | 5.76 |
The effective altruism community | 3.73 | 0.567 | 3.58 |
Operations management in high-impact organisations | 15.5 | 1.17 | 23.8 |
With truncation
Cost-effectiveness (bp/G$) of… | Mean | 5th percentile | 95th percentile |
---|---|---|---|
Global health and development | 0.431 | 0.0676 | 1.36 |
Longtermism and catastrophic risk prevention | 3.95 | 1.03 | 9.71 |
Animal welfare | 1.62 | 0.104 | 3.49 |
Effective altruism infrastructure | 3.20 | 1.62 | 5.48 |
The effective altruism community | 1.55 | 0.574 | 3.27 |
Operations management in high-impact organisations | 7.01 | 1.19 | 21.6 |
Discussion
Multiplier of operations management
The means of the multiplier of OM for the non-truncation and truncation cases are 4.55 and 4.53. I thought organisations would organise themselves such that the expected (marginal) cost-effectiveness would be similar for all positions, so I was somewhat surprised to get values about 5 times as high as 1.
The p-values for the null hypothesis that the multiplier of OM follows distributions with the same shape as the ones I obtained, but with a mean of 1, are 1.27 % and 1.16 % for the non-truncation and truncation cases[15]. So one can be reasonably confident that the multiplier is higher than 1, but only if the 11 answers I got are representative of the effective altruism community, which is far from certain.
The mean multiplier of OM for the truncation case is 99.5 % of the one for the non-truncation case. This means the outliers of each of the individual estimates practically do not affect the results.
Cost-effectiveness
For the truncation case, the mean cost-effectiveness metrics as a fraction of that of longtermism and catastrophic risk prevention for methods 1, 2 and 3 are:
- For global health and development, 3.96 %, 17.8 % and 10.9 %.
- For animal welfare, 49.3 %, 32.7 % and 41.0 %.
- For effective altruism infrastructure, 33.0 %, 129 % and 81.0 %.
- For the effective altruism community, 33.0 %, 45.2 % and 39.1 %.
- For OM, 1.50, 2.05 and 1.77.
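These fractions can be approximately recomputed from the truncated means in the Results section, as sketched below. Because the inputs are rounded to 3 digits, some outputs differ from the figures above in the last digit.

```python
# Truncated means (bp/G$) for methods 1, 2 and 3, from the Results tables.
means = {
    "Global health and development": (0.156, 0.705, 0.431),
    "Animal welfare": (1.95, 1.29, 1.62),
    "Effective altruism infrastructure": (1.31, 5.10, 3.20),
    "The effective altruism community": (1.31, 1.79, 1.55),
    "OM": (5.92, 8.10, 7.01),
}
# Longtermism and catastrophic risk prevention is defined identically in all methods.
longtermism = (3.95, 3.95, 3.95)

fractions = {
    area: tuple(value / reference for value, reference in zip(values, longtermism))
    for area, values in means.items()
}

for area, ratios in fractions.items():
    print(area + ": " + ", ".join(f"{100 * ratio:.1f} %" for ratio in ratios))
```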
Consequently, for the truncation case:
- Longtermism and catastrophic risk prevention is the most effective area for methods 1 and 3, and the 2nd most effective behind effective altruism infrastructure for method 2.
- Global health and development is the least effective area for all methods.
- OM is more effective than all areas for all methods. Its mean cost-effectiveness is 1.77 times as high as that of longtermism and catastrophic risk prevention for method 3.
For the non-truncation case:
- Animal welfare is the most effective area for methods 1 and 3, and the 2nd least effective in front of global health and development for method 2.
- Global health and development remains the least effective area for all methods.
- OM remains more effective than all areas for all methods. Its mean cost-effectiveness is 3.83 times as high as that of longtermism and catastrophic risk prevention for method 3.
I believe the results for the truncation case are more accurate because it is hard to represent outliers well based on subjective 90 % CIs. For example, I think the cost-effectiveness of animal welfare for the non-truncation case is too heavy-tailed, with its mean being 9.98 k (= 41.1/0.00411) times its median. The heavy-tailedness of this same metric for the truncation case seems more reasonable, with its mean being 474 (= 1.95/0.00411) times its median.
The mean cost-effectiveness of OM for the truncation case as a fraction of that for the non-truncation case is:
- For method 1, 26.3 %.
- For method 2, 96.2 %.
- For method 3, 45.3 %.
This means the outliers have a material effect for methods 1 and 3, but not for 2.
The 5th percentile, median, and 95th percentile of the cost-effectiveness of OM for method 3 with truncation are 17.0 %, 58.1 % and 3.09 times the mean of 7.01 bp/G$. I expected the distribution to be more heavy-tailed, but I arguably had in mind the wider distribution of potential applicants instead of the narrower one of those selected for working in the positions.
Further work
Some potential avenues for further work are, in my descending order of importance:
- Determining and quantifying the effect of the major drivers of the multiplier of OM, such as the size, maturity, and cause area of the organisation, and scope, seniority and personal fit to the specific position.
- Collecting additional estimates for the multiplier of OM, as a sample size of 11 may well not be representative.
- Thinking about ways to estimate the multiplier of OM based on metrics which organisations have readily available (e.g. number of opportunities and applicants).
- Understanding how the multiplier of OM will vary in the future. In general, I wonder about:
- How much the ranking of 80,000 Hours’ highest-impact career paths is going to change in the future.
- The extent to which expected future variations of the effectiveness of these career paths have already been taken into account, and integrated into the current ranking.
- Following up on the potentially outdated EA talent needs survey - 2018 used in method 2 by asking the leaders of organisations aligned with effective altruism about the cost-effectiveness of the 4 EA Funds. For example, one could request estimates for the 90 % CI:
- Either in terms of increasing the expected value of the future.
- Or as a multiple of the mean cost-effectiveness of the LTFF, and this mean in terms of increasing the expected value of the future, similarly to what Linchuan Zhang did here.
- Reflecting on how much weight to give to methods 1 and 2 in method 3, instead of defaulting to giving the same weight to both.
- Defining the significance level of the truncation case in a systematic way. I selected 99 % because its complement of 1 % is one order of magnitude below 10 %, which is the complement of 90 %, and most distributions I modelled are based on 90 % CIs.
- Assessing the influence of the FTX crisis on the results/interpretation of my analysis, which was conducted before it started. I expect:
- The cost-effectiveness of longtermism and catastrophic risk prevention, and effective altruism infrastructure to increase relative to global health and development, and animal welfare[16].
- The cost-effectiveness of the effective altruism community to increase[17].
- The multiplier of OM to decrease in the short-term, due to less funding being available to start and scale projects.
[1] For 10 M random samples, each truncation and non-truncation case takes me 5 min to run and save the results.
[2] In this text, cost-effectiveness refers to marginal cost-effectiveness.
[3] This is a link to my copy, which contains data last updated on August 15. You can find Tyler’s Sheet here.
[4] If X and X_pre_trunc are the truncated and pre-truncation distributions, and p is the probability of X_pre_trunc being between the minimum and maximum of X, the probability of X being between a and b is 1/p times as large as that of X_pre_trunc being between a and b, which are 2 values between the minimum and maximum of X.
[5] These 2 values consider a wide moral weight distribution whose 95th percentile is 60 k (= 17.2 / (270 μ)) times as large as the 5th percentile.
[6] According to Jaime: “When the data includes poorly calibrated outliers, if it's possible exclude them and take the geometric mean. If not, we should use a pooling method resistant to outliers. The median is one such popular aggregation method.”
[7] The median significantly attenuates the effect of the outliers. For the truncation (non-truncation) case, the mean multiplier with all estimates is 1.54 (1.97) times that without the 4th and 8th estimates using the median, but 11.0 (12.5) times using the mean.
[8] According to Stien van der Ploeg, Animal Charity Evaluators’ Executive Director: “Some organisations consider any non-program related positions to fall under operations, including communications, strategy, HR, finance, and fundraising roles. Other groups only consider specific administrative jobs like finance, personnel, and organisational support as operations, and some interpret it even narrower.”
[9] The mean person working in OM has a much better fit than the respective mean applicant, but there may still be material variation amongst workers.
[10] The 1st/2nd distribution represents the marginal impact/cost per unit time of OM relative to all positions.
[11] The respondent mentioned this type of distribution was an approximation.
[12] The 1st/2nd distribution represents the marginal impact/cost per unit time of OM, and the 3rd/4th that of all positions.
[13] The 1st distribution represents the marginal impact per unit time of OM as a multiple of the mean marginal impact of all positions, and the 2nd/3rd the marginal cost per unit time of OM / all positions.
[14] The 1st/2nd distribution represents the marginal impact/cost per unit time of OM, and the 3rd/4th that of all positions.
[15] Calculated in J2 of the last 2 tabs of the Sheet.
[16] According to the data collected by Tyler Maule, the spending as a fraction of the total of the FTX Foundation between January 1 and August 15 on longtermism and catastrophic risk prevention, and effective altruism infrastructure was 73.5 % and 26.5 %.
[17] According to the data collected in July 2021 by Benjamin Todd here, the “FTX team” represented 35.8 % (= 16.5/46.1) of the funds committed to effective altruism. Assuming the cost-effectiveness is inversely proportional to the committed funds, losing those from FTX leads to it being 1.56 (= 1/(1 - 35.8 %)) times as high.
Hey there! Really appreciate you doing work on this!
As someone who is not well-versed in cost-effectiveness analysis, but is very keen on learning about this work - could you make the summary a bit more accessible? When reading it I was like: 1) what the hell is bp/g$
(I know there is a wiki page linked, but I think most people don't want to click on hyperlinks during the reading of a summary, they just want to decide whether to commit to reading the post.)
After I checked the wiki link I realised bp means 0.0001 but after a quick glance I'm still unsure what giga is (note that I'm writing this comment at 1 am, so the fault can definitely be mine)
Hi CB,
Thanks for asking, and being keen to learn about this work!
I understand this notation may not be the most easily comprehensible at first sight. Using it more often will arguably make it more understandable in the long-run.
As you say, 1 bp = 0.01 % = 0.0001. 1 G = 10^9, so 1 G$ means 1 billion dollars.
I see this is your 1st comment. Welcome to the EA Forum!
thanks!
Hi Vasco,
Thanks so much for all the effort put into attempting to do this calculation. I really appreciate it!
I have one main question (+ a meta comment) around the calculation of the cost-effectiveness of OM:
In the email you sent you asked for "(I_OM / C_OM) / (I_A / C_A), where:
What impact metric is meant by I? I read through your post, but maybe I missed something...
It would've helped a lot to see this information in the post itself to follow reasoning transparency.
Hi Cristina,
Thanks for the kind words!
Good point. I have not mentioned what is meant by impact. Some thoughts:
I would say I did not exactly ask people to use that formula. I asked:
Then, I gave that formula as an example:
Only 3 people explicitly used this formula (in agreement with the tables of this section).
Thanks for the feedback. I thought linking to it was fine, as the formula was just a suggestion intended to illustrate what I meant by ratio between the expected marginal cost-effectiveness of operations management and all positions.
So coming back and looking at this, one central mystery is: why is the multiplier so high? Some possible answers might be:
I'm also confused about whether operations roles are all similar enough that they can be modelled the same way.
So if I was working more on this, I'd probably:
Out of curiosity, what programs did you use for your calculations? Squiggle, sheets, other?
Hi Chana,
I used this Colab. The link was in the section Methods, but I have added a sentence to the Summary with it such that it is more visible now. Thanks.
Sorry, missed that! Thanks so much.
Really interesting post. Not to hijack it, but I didn't know about the EA Forecasting & Epistemics Slack. Can you point me to info on it or how to join?
Thanks, Max!
Regarding the Slack, from here (I appreciate it is not the most visible place; I do not know whether there is more information elsewhere):
I guess you can reach out to Ozzie here.