
This is a research report written in 2022. Three years have passed since then, and I no longer necessarily endorse some of its conclusions. That said, due to its mostly theoretical nature, much of it seems to have stood my personal test of time. I'm therefore making it public, with the due amount of caveats.

Epistemic status: Highly uncertain. These are more like "observations I find interesting" than confident conclusions.


In this document I demonstrate that under many circumstances, one should spend lots of effort (and likely more than commonly assumed[1]) on exploration (i.e. looking for new projects) and prioritization (i.e. comparing known projects), rather than exploitation (i.e. directly working on a project). Furthermore, I provide heuristics that help one decide how much effort to spend on exploration and prioritization.    

I assume that the goal is to maximize expected altruistic impact, where by “expected” I’m referring to the mathematical expectation. The actor here can be an individual or a community, and a project can be anything ranging from a 1-week personal project to an entire cause area.

Section 1 will deal with exploration, section 2 will deal with prioritization, and section 3 will test and apply the theory in the real world. Mathematical details can be found in the appendix.

Key Findings

This is an exploratory study, so the results are generally of low confidence, and everything below should be seen as hypotheses. The main goal is to suggest paths for future research, rather than to provide definitive answers.

Theoretical Results

  • Claim 3 (Importance of exploration): We, as a community and as individuals, should spend more than half of our effort on searching for potential new projects, rather than working on known projects.
    • Confidence: 45% / 20% [2]

  • Claim 6a (Importance of prioritization, uncorrelated case): When faced with k equally good longtermist projects (where k isn’t too small) that aren’t correlated, we should act as if each of the k project-evaluation tasks is as important as working on the best project that we finally identify, and allocate our effort evenly across the k+1 tasks, as long as working on the evaluation tasks lets us identify the ex-post best project[3].

    • Confidence: 40% / 25%

  • Claim 6b (Importance of prioritization, correlated case): When the k projects are strongly correlated, prioritization becomes much less important than in the uncorrelated case. Even so, one should still spend only a small portion of one’s effort on direct work (something like 1/sqrt(k) or 1/log(k), which is already much larger than the 1/(k+1) of Claim 6a) and spend the remaining portion on prioritization. (A numerical sketch after this list spells out these fractions for a few values of k.)
    • Confidence: 40% / 20%
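
To make the headline numbers in Claims 6a and 6b concrete, here is a minimal numerical sketch (mine, not the report's) that simply plugs a few values of k into the fractions quoted above; the 1/sqrt(k) and 1/log(k) figures are the rough correlated-case guesses from Claim 6b, not derived quantities, and I assume natural logarithms.

```python
import math

def direct_work_fraction(k: int) -> dict:
    """Fraction of total effort spent on direct work (the remainder goes to
    prioritization), using the headline fractions from Claims 6a and 6b."""
    return {
        "uncorrelated (Claim 6a), 1/(k+1)":  1 / (k + 1),
        "correlated (Claim 6b), ~1/sqrt(k)": 1 / math.sqrt(k),
        "correlated (Claim 6b), ~1/log(k)":  1 / math.log(k),  # natural log assumed
    }

for k in (10, 100, 1000):
    print(f"k = {k}")
    for label, frac in direct_work_fraction(k).items():
        print(f"  {label:36s} {frac:6.1%} direct work, {1 - frac:6.1%} prioritization")
```

For example, with k = 100 comparable projects these fractions would put only about 1% (uncorrelated) or roughly 10-22% (strongly correlated) of total effort into direct work.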

Practical Results [Caveat: Written for the 2022 community]

  • Claim 7 (Applicability in real world): In the real world, the most important factors determining the applicability of our model are the difficulty of reducing uncertainty[4] through exploration and prioritization (E&P), the comparative scalability of E&P vs. exploitation, the heavy-tailedness of opportunities, a fixed total budget, and a utility-maximization objective.

    • Confidence: Moderate (hard to quantify)

  • Claim 8 (Applying the framework to different parts of EA): In EA, the top 3 areas where E&P deserves the largest portions of resources (relative to the total resources allocated to that area) are
    • identifying promising individuals[5] (relative to the budget of talent cultivation),

    • cause prioritization (relative to the budget of all EA research and direct work),

    • within-cause prioritization (relative to the budget of that cause).
    • Confidence: Low (hard to quantify)

Instrumental Results

Note that the following claims are results (based on mostly qualitative arguments) rather than assumptions.

  • From section 1.2: The distribution of TOC impact (impact stemming from a project’s theory of change, or TOC, as opposed to e.g. flow-through effects) across different projects is strictly more heavy-tailed than log-normal.
    • Confidence: 70% / 55% [6]

  • Claim 2 (stronger version of the previous claim): The distribution of TOC impact across different projects resembles the Pareto distribution in terms of heavy-tailedness. (An illustrative calculation after this list shows how much this degree of heavy-tailedness matters for prioritization.)
    • Confidence: 40% / 30%
  • Claim 4: For any particular project, the distribution of its TOC impact across all potential scenarios is (roughly) at least as heavy-tailed as the Pareto distribution.[7]

    • Confidence: 70% / 55%
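
As a rough illustration of why the claims above matter so much for E&P (this sketch is mine and is not taken from the report): the snippet below estimates the "best of k" multiplier, i.e. how many times better the best of k independent projects is than an average project, under a log-normal and a Pareto model of TOC impact. The parameters sigma = 1 and alpha = 1.1 are arbitrary illustrative choices; for alpha ≤ 1, which the report also entertains, the Pareto mean diverges and the ratio is no longer well-defined (related to the Pascal's wager issue flagged under Limitations below).

```python
import math
import numpy as np

def pareto_best_of_k(k: int, alpha: float) -> float:
    """Exact E[max of k] / E[X] for a Pareto(scale=1, tail index alpha > 1),
    using E[max of k] = Gamma(k+1) * Gamma(1 - 1/alpha) / Gamma(k+1 - 1/alpha)."""
    log_emax = (math.lgamma(k + 1) + math.lgamma(1 - 1 / alpha)
                - math.lgamma(k + 1 - 1 / alpha))
    return math.exp(log_emax) * (alpha - 1) / alpha  # E[X] = alpha / (alpha - 1)

def lognormal_best_of_k(k: int, sigma: float, n: int = 50_000, seed: int = 0) -> float:
    """Monte Carlo estimate of E[max of k] / E[X] for a log-normal(0, sigma^2)."""
    rng = np.random.default_rng(seed)
    total, done, chunk = 0.0, 0, 5_000  # chunked to keep memory modest
    while done < n:
        m = min(chunk, n - done)
        total += rng.lognormal(0.0, sigma, size=(m, k)).max(axis=1).sum()
        done += m
    return (total / n) / math.exp(sigma ** 2 / 2)

for k in (10, 100, 1000):
    print(f"k = {k:4d}   log-normal(sigma=1): {lognormal_best_of_k(k, 1.0):6.1f}x"
          f"   Pareto(alpha=1.1): {pareto_best_of_k(k, 1.1):8.1f}x")
```

With these (arbitrary) parameters, the log-normal multiplier grows from a few times to around twenty times over this range of k, while the Pareto multiplier reaches several hundred times at k = 1000. That gap is the basic reason the "power law world" makes exploration and prioritization so much more valuable than the "log-normal world".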

Important Limitations

  • Negative impact is ignored.
  • Pascal’s wager emerges in the model of prioritization, but we don’t have a very good way to handle this. (relevant)
  • The “log-normal world” is ignored, and the focus is primarily on the “power law world”. This exaggerates the importance of E&P (exploration and prioritization).
    • “Log-normal world” is the possibility that the distribution of impact is log-normal, and “power law world” is the possibility that the distribution obeys a power law (cf. claim 2, claim 4).
    • I think the power law world is somewhat more likely to be the realistic model for our purpose than the log-normal world is. See section 1.1, 1.2 and 2.1 on this.
  • Non-TOC impacts of projects (e.g. flow-through effects) are ignored.
    • This is justifiable in some cases but not in all (see the last parts of section 1.1).
  • Individual projects are assumed to consist purely of direct work (rather than a mixture of E&P and direct work, which is often the case in reality), and all parts of an individual project are homogeneous. This is especially problematic when we define projects to be larger in scope, e.g. when projects are entire cause areas.[8] Moreover, it’s sometimes hard to distinguish direct work and E&P, e.g. many kinds of direct work also provide insights on prioritization.

    • Despite this, the findings in this document still suggest optimizing more for information value (E&P) when a project produces both information value and direct impact.

  • Differences in scalability (diminishing returns) are mostly ignored.
  • Externalities (and, more generally, all social interactions) are ignored.
    • For example, if your own E&P also provides information value to others, then your E&P is more important than what’s suggested in this document, and it becomes especially important to make your attempts and conclusions publicly known (e.g. writing about my job). On the other hand, you also benefit from other people’s E&P, which reduces the importance of doing E&P yourself.

 

Above are the key takeaways from this report. Please see this Google Document for the report itself. 

Huge thank-you to Daniel Kokotajlo for mentorship and Nuno Sempere for the helpful feedback. These people don't necessarily agree with the claims, and the report reflects my personal opinion only.

 

  1. ^

     In the current situation, the EA community as a whole seems to allocate approximately 9-12% of its resources to cause prioritization. But there’s some nuance to this - see the last parts of section 1.3.

  2. ^

     Meaning I assign 0.45 probability to [the statement being true (in the real world)], and 0.2 probability to [the statement being true (in the real world) and my model being mostly right about the reason].

  3. ^

     Which, by the way, is quite an unrealistic assumption. This assumption is also shared by claim 6b.

  4. ^

     By taking this factor into account, we’ve dealt with the unrealistic assumption (“we are able to identify the ex-post best project”) in claim 6a and 6b.

  5. ^

     Including, for example, providing opportunities for individuals to test fit.

  6. ^

     Conditional on the heavy-tailedness comparison here being meaningful, which isn’t obvious. Same for similar comparisons elsewhere in this document.

  7. ^

     Subject to caveats about Pascal’s wager. See section 2.1.

  8. ^

     A more realistic model might be a hierarchical one, where projects have sub-projects and sub-sub-projects, etc., and you need to do some amount of prioritization at every level of the hierarchy.

  9. ^

     Meaning I assign ≥0.8 probability to [the statement being true], and ≥0.8 probability to [the statement being true and my model being mostly right about the reason].

  10. ^

     For any technology X, assume that a constant amount of resources is spent each year on developing X. If we observe an exponential increase in X’s efficiency, we can infer that it always takes a constant amount of resources to double X’s efficiency, regardless of its current efficiency - which points to a Pareto distribution. This “amount of work needed to double the efficiency” may have been slowly increasing in the case of integrated circuits, but far slower than what a log-normal distribution would predict.

  11. ^

     Here the diminishing returns mean “saving 10^8 lives is less than 10^5 times as good as saving 10^3 lives, not because we’re scope-insensitive but because we’re risk averse and 10^8 is usually much more speculative and thus riskier than 10^3.”

  12. ^

     Isoelastic utility functions are good representatives of the broader class of HARA utility functions, which, according to Wikipedia, is “the most general class of utility functions that are usually used in practice”.

  13. ^

     Under the “fundamental assumptions” or “sense check with intuition” approach, the true distribution has finite mean, but the Pareto distribution used for approximating the true distribution has infinite mean. Under the “heuristics” approach, the true distribution itself has infinite mean.

  14. ^

     u is defined in section 2.2; it stands for the extra gain in “how good is the project that we work on” resulting from prioritization, compared to working only on an arbitrary project. For example, if prioritization increases the project quality from 1 DALY/$ to 2 DALY/$, then u=100%=1.0.

  15. ^

     See the “α” column of the “Revenue” rows in table 1 of the paper.

  16. ^

     Copulas are used for modeling the dependence structure between multiple random variables. A reversed Clayton copula is a copula that shows stronger correlation when the variables take larger values. Mathematical knowledge about the (reversed) Clayton copula (and about copulas in general) isn’t needed for reading this section.

  17. ^

     This is a very crude guess, and my 90% confidence interval will likely be very, very wide.

  18. ^

     Note that by choosing r=⅓ I’m underestimating (to a rather small extent) the strength of correlation. I’ll briefly revisit this in a later footnote.

  19. ^

     Recall that we underestimated the strength of correlation by choosing r=⅓, so here u=(log k)^(-1) is an overestimation of the boost from prioritization, though I think the extent of overestimation is rather small.

  20. ^

     Reversed because it’s negatively correlated with the importance of E&P.

  21. ^

     For GPT-3, 12% of compute is spent on training smaller models than the final 175B-parameter one, according to table D.1 of the GPT-3 paper, though it’s unclear whether that 12% is used for exploration/comparison, or simply checking for potential problems. Google’s T5 adopted a similar approach of experimenting on smaller models, and they made it clear that those experiments were used to explore and compare model designs, including network architectures. Based on the data in the paper I estimate that 10%-30% of total compute is spent on those experiments, with high uncertainty.

  22. ^

     I’m not counting referral fees into E&P, since they’re usually charged on the lawyer’s side while I’m mainly examining the client’s willingness to pay. Plus, it’s unclear what portion of clients use referral services, and how much referral services help improve the competence of the lawyer that you find.

  23. ^

     This is based on a simple ballpark estimate, and so I don’t provide details here.

  24. ^

     Including, for example, providing opportunities for individuals to test fit.

  25. ^

     The extent to which to prioritize promising individuals is often discussed under the heading of “elitism”. Also, here’s some related research.

  26. ^

     Including, for example, providing opportunities for individuals to test fit.

  27. ^

     I know little about macroeconomics, and this claim is of rather low confidence.



Curated and popular this week
 ·  · 13m read
 · 
Notes  The following text explores, in a speculative manner, the evolutionary question: Did high-intensity affective states, specifically Pain, emerge early in evolutionary history, or did they develop gradually over time? Note: We are not neuroscientists; our work draws on our evolutionary biology background and our efforts to develop welfare metrics that accurately reflect reality and effectively reduce suffering. We hope these ideas may interest researchers in neuroscience, comparative cognition, and animal welfare science. This discussion is part of a broader manuscript in progress, focusing on interspecific comparisons of affective capacities—a critical question for advancing animal welfare science and estimating the Welfare Footprint of animal-sourced products.     Key points  Ultimate question: Do primitive sentient organisms experience extreme pain intensities, or fine-grained pain intensity discrimination, or both? Scientific framing: Pain functions as a biological signalling system that guides behavior by encoding motivational importance. The evolution of Pain signalling —its intensity range and resolution (i.e., the granularity with which differences in Pain intensity can be perceived)— can be viewed as an optimization problem, where neural architectures must balance computational efficiency, survival-driven signal prioritization, and adaptive flexibility. Mathematical clarification: Resolution is a fundamental requirement for encoding and processing information. Pain varies not only in overall intensity but also in granularity—how finely intensity levels can be distinguished.  Hypothetical Evolutionary Pathways: by analysing affective intensity (low, high) and resolution (low, high) as independent dimensions, we describe four illustrative evolutionary scenarios that provide a structured framework to examine whether primitive sentient organisms can experience Pain of high intensity, nuanced affective intensities, both, or neither.     Introdu
 ·  · 7m read
 · 
Article 5 of the 1948 Universal Declaration of Human Rights states: "Obviously, no one shall be subjected to torture or to cruel, inhuman or degrading treatment or punishment." OK, it doesn’t actually start with "obviously," but I like to imagine the commissioners all murmuring to themselves “obviously” when this item was brought up. I’m not sure what the causal effect of Article 5 (or the 1984 UN Convention Against Torture) has been on reducing torture globally, though the physical integrity rights index (which “captures the extent to which people are free from government torture and political killings”) has increased from 0.48 in 1948 to 0.67 in 2024 (which is good). However, the index reached 0.67 already back in 2001, so at least according to this metric, we haven’t made much progress in the past 25 years. Reducing government torture and killings seems to be low in tractability. Despite many countries having a physical integrity rights index close to 1.0 (i.e., virtually no government torture or political killings), many of their citizens still experience torture-level pain on a regular basis. I’m talking about cluster headache, the “most painful condition known to mankind” according to Dr. Caroline Ran of the Centre for Cluster Headache, a newly-founded research group at the Karolinska Institutet in Sweden. Dr. Caroline Ran speaking at the 2025 Symposium on the recent advances in Cluster Headache research and medicine Yesterday I had the opportunity to join the first-ever international research symposium on cluster headache organized at the Nobel Forum of the Karolinska Institutet. It was a 1-day gathering of roughly 100 participants interested in advancing our understanding of the origins of and potential treatments for cluster headache. I'd like to share some impressions in this post. The most compelling evidence for Dr. Ran’s quote above comes from a 2020 survey of cluster headache patients by Burish et al., which asked patients to rate cluster headach
 ·  · 2m read
 · 
A while back (as I've just been reminded by a discussion on another thread), David Thorstad wrote a bunch of posts critiquing the idea that small reductions in extinction risk have very high value, because the expected number of people who will exist in the future is very high: https://reflectivealtruism.com/category/my-papers/mistakes-in-moral-mathematics/. The arguments are quite complicated, but the basic points are that the expected number of people in the future is much lower than longtermists estimate because: -Longtermists tend to neglect the fact that even if your intervention blocks one extinction risk, there are others it might fail to block; surviving for billions  (or more) of years likely  requires driving extinction risk very low for a long period of time, and if we are not likely to survive that long, even conditional on longtermist interventions against one extinction risk succeeding, the value of preventing extinction (conditional on more happy people being valuable) is much lower.  -Longtermists tend to assume that in the future population will be roughly as large as the available resources can support. But ever since the industrial revolution, as countries get richer, their fertility rate falls and falls until it is below replacement. So we can't just assume future population sizes will be near the limits of what the available resources will support. Thorstad goes on to argue that this weakens the case for longtermism generally, not just the value of extinction risk reductions, since the case for longtermism is that future expected population  is many times the current population, or at least could be given plausible levels of longtermist extinction risk reduction effort. He also notes that if he can find multiple common mistakes in longtermist estimates of expected future population, we should expect that those estimates might be off in other ways. (At this point I would note that they could also be missing factors that bias their estimates of