
Figure 1: total utility vs. total expenditure

You've probably seen a curve like Figure 1 before: We can do more good by expending more resources, but the marginal cost-effectiveness tends to decrease as we run out of low-hanging fruit. This post is about the relationship between that utility vs. expenditure curve and the set of possible opportunities we can work on.

In 2021, OpenPhil wrote that they model GiveWell's returns to scale as isoelastic with η=0.375. In a recent blog post, OpenPhil wrote that they "tend to think about returns to grantmaking as logarithmic by default"[1]. In the model they cite for logarithmic returns, @Owen Cotton-Barratt models opportunities as having independent distributions of cost and benefit and works out approximately logarithmic utility curves from some reasonable assumptions. What follows is a simpler but less general approach to the same problem.

Think of the distribution of opportunities as a density curve with cost-effectiveness on the x axis and available scale at that level of cost-effectiveness on the y axis (see Figure 2). By cost-effectiveness, i mean the utils-per-dollar of an opportunity. And by available scale, i mean how many dollars can be spent at a given level of cost-effectiveness[2]. Equivalently, you can think of the y axis as the density of ways to spend one dollar at a given cost-effectiveness.

Figure 2: density of available scale at each level of cost-effectiveness

A univariate distribution of opportunities is easier to reason about than a bivariate one, but at the cost of losing information that might affect the order in which we fund them, so we can't represent something like difficulty-based selection in Owen Cotton-Barratt's model.[3]

Suppose that we start at the positive infinity cost-effectiveness end of the opportunity distribution and work our way left towards zero[4]. In reality, some low-hanging fruit has already been picked, but that's OK because it just means that in the final answer we shift our position on the utility vs. expenditure graph by however many dollars have already been spent.

Cost-effectiveness is the derivative of utility with respect to expenditure. And available scale density is the derivative of expenditure with respect to cost-effectiveness. Letting $q$ be cost-effectiveness, $S(q)$ be the scale density function, and $U$ be the utility function, we have the following differential equation:

$$\frac{dU}{dS_{tot}} = q(S_{tot}), \qquad S_{tot}(q) = \int_q^\infty S(q')\,dq'$$

where $q(S_{tot})$ is cost-effectiveness as a function of total expenditure, $S_{tot}(q)$ is total expenditure as a function of cost-effectiveness, and $S(q) = -\frac{dS_{tot}}{dq}$ is the derivative of expenditure with respect to cost-effectiveness (with a minus sign because we spend from the high cost-effectiveness end downward). Solving this differential equation lets us convert between two different pretty intuitive[5] ways of thinking about diminishing returns to scale.
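For a finite list of discrete opportunities, the same conversion can be sketched in a few lines of Python. This is only an illustration: the function name `utility_curve` and the example numbers are made up, and it assumes opportunities are funded strictly in descending order of cost-effectiveness.

```python
def utility_curve(opportunities):
    """Turn (cost_effectiveness, dollars) pairs into a utility vs. expenditure curve.

    Opportunities are funded from most to least cost-effective, so each
    segment of the returned curve has a slope equal to the cost-effectiveness
    of the opportunity being funded at that point.
    """
    points = [(0.0, 0.0)]
    spent = 0.0
    utility = 0.0
    for q, dollars in sorted(opportunities, reverse=True):
        spent += dollars
        utility += q * dollars  # utils = (utils per dollar) * dollars
        points.append((spent, utility))
    return points


# Hypothetical opportunities: (utils per dollar, dollars of room for funding)
example = [(2.0, 10.0), (10.0, 1.0), (5.0, 4.0)]
curve = utility_curve(example)
print(curve)
```

The slope of the resulting piecewise-linear curve falls from 10 to 5 to 2 utils per dollar as expenditure grows, which is the diminishing-returns shape from Figure 1.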

I think it makes sense to model the distribution of opportunities as a power law:

  • First and foremost, it makes the math easy.
  • Rapidly approaching 0 at infinity makes sense.
  • Going to infinity at 0 makes sense because there's a kajillion ways to spend a ton of resources inefficiently.
  • A lot of stuff actually is pretty Pareto-distributed in real life.
  • And, of course, cost-effectiveness of opportunities having a Pareto-like distribution is EA dogma[6]

And so that the integral converges on the positive infinity side, the exponent must be less than negative one.

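It turns out that if you work this out (see appendix) for a power law opportunity distribution $S(q) = k q^p$, you wind up with an isoelastic $U$ (as a function of total expenditure $S_{tot}$) where

$$\eta = -\frac{1}{p + 1}, \quad\text{equivalently}\quad p = -1 - \frac{1}{\eta}.$$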

This seems like a pretty neat and satisfying result that hopefully will make it easier to think about this stuff. I suspect that some EAs have been, like me, explicitly or implicitly modelling the distribution of cost-effectiveness of opportunities as a power law and modelling diminishing returns to scale as isoelastic without thinking about both of those things at the same time and realizing that, when we do interventions in the optimal order, those two things are mathematically equivalent.
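As a rough numerical sanity check of that equivalence, here is a sketch in plain Python. It assumes a scale density of the form k * q**p with p = -1 - 1/eta (the parametrization used in the appendix), uses a large finite `q_max` as a stand-in for infinity, and the function names are made up for illustration. If utility is isoelastic, marginal cost-effectiveness should be proportional to total expenditure raised to the power -eta, so the log-log slope should come out close to -eta.

```python
import math


def spent_above(q, k, eta, q_max=1e6, steps=20_000):
    """Numerically integrate S(q') = k * q'**p from q up to q_max, with p = -1 - 1/eta.

    Uses a midpoint rule on a log-spaced grid; q_max stands in for infinity.
    """
    p = -1 - 1 / eta
    log_lo, log_hi = math.log(q), math.log(q_max)
    h = (log_hi - log_lo) / steps
    total = 0.0
    for i in range(steps):
        x = math.exp(log_lo + (i + 0.5) * h)
        total += k * x**p * x * h  # dq' = x * d(log x)
    return total


def log_log_slope(q_a, q_b, k, eta):
    """Slope of log(marginal cost-effectiveness) against log(total expenditure)."""
    s_a = spent_above(q_a, k, eta)
    s_b = spent_above(q_b, k, eta)
    return (math.log(q_a) - math.log(q_b)) / (math.log(s_a) - math.log(s_b))


eta = 0.375  # the value quoted above for GiveWell's returns to scale
slope = log_log_slope(2.0, 5.0, k=1.0, eta=eta)
print(slope)  # should be close to -eta
```

The constant k only shifts the curve along the expenditure axis, so it drops out of the slope.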

appendix

derivation

import sympy as sp

q = sp.symbols("q", positive=True)
eta = sp.symbols("eta", positive=True)
k = sp.symbols("k", positive=True)
S_tot_0 = sp.symbols("S_tot_0", positive=True)

# This needs to be written as -1 - something positive
# to enforce that p is less than -1 so that the integral converges
# and sympy is able to make some necessary simplifications.
# And then i went back and changed that something to 1 / eta
# once i worked out the answer and saw that it was an isoelastic utility function.
p = -1 - 1 / eta

S = k * q**p

q_0 = sp.solve(sp.integrate(S, (q, q, sp.oo)) - S_tot_0, q)[0]

# What we actually want to do here is evaluate this
# integral from q_0 to infinity, but sympy can't handle that.
# So, instead, we use the following trick:
# We know that the integral of f(x) from q_0 to sp.oo
# equals something - F(q_0), so define that something as a variable.
# And utility is a torsor, so adding some constant changes nothing.

C = sp.symbols("C", real=True)
U = sp.simplify(C - sp.integrate(q * S, q).subs({q: q_0}))
U = sp.simplify(sp.integrate(U.diff(S_tot_0), S_tot_0))

order = sp.O(U.args[0][0], S_tot_0).args[0]
assert order.equals(S_tot_0 ** (1 - eta))
assert sp.O(U.args[1][0], S_tot_0).equals(sp.O(sp.log(S_tot_0)))
  1. Logarithmic utility is the special case of isoelastic utility where η=1.

  2. Because this is a density, it has weird dimensions: dollars per (utils per dollar) = dollars squared per util.

  3. But the information about the opportunities' cost and benefit is still there: *waves hands* If you zoom in on the scale vs. cost-effectiveness curve (that is, reduce the bin width on the histogram to epsilon), you'll see a bunch of Dirac deltas representing individual discrete interventions whose cost is their integral and whose value is their cost times their cost-effectiveness.

  4. I think there's a case to be made that this assumption is less silly than it sounds: If, in everything here, you replace the words "cost-effectiveness" and "utility" with "expected cost-effectiveness" and "how much good we think we did", then all the math still works out the same and the result still makes sense unless there's learning or bias involved, which would both make things too complicated anyway.

  5. to me, at least

  6. "The top x% of interventions are z times more effective than the median intervention!"

Comments



Great post!

Here is a demonstration without using code. If the probability density function (PDF) of the available expenditure for a given cost-effectiveness follows a Pareto distribution (power law), it is $f(x) = \frac{\alpha x_m^\alpha}{x^{\alpha + 1}}$, where $x$ is the cost-effectiveness, $x_m$ is the minimum cost-effectiveness, and $\alpha$ is the tail index. The total expenditure required for the marginal cost-effectiveness to drop to a given value $x$ is $E(x) = \int_x^\infty f(t)\,dt = \left(\frac{x_m}{x}\right)^\alpha$ (in units of the total available expenditure). So the marginal cost-effectiveness is $x = x_m E^{-1/\alpha}$, which is an isoelastic function.

The total utility $U$ gained until the marginal cost-effectiveness drops to a given value $x$ is an isoelastic function of the aforementioned total expenditure, with elasticity $\eta$, if $\frac{dU}{dE} \propto E^{-\eta}$. Comparing this with the last expression above for $x$ gives $\eta = \frac{1}{\alpha}$.

with .

I think you mean .

What you wrote looks clean and correct and, indeed, i used the Pareto distribution $\alpha$ parameter incorrectly and will change that line of the post. Thank you!

Thanks! I think this is really helpful.

[Warning: this comment is kind of thinking-out-loud; the ideas are not yet distilled down to their best forms.]

The only thing I want to quibble about so far is your labelling my model as more general. I think it isn't really -- I had a bit of analysis based on the bivariate distribution, but really this was just a variation on the univariate distribution I mostly thought about.

Really the difference between our models is in the underlying distribution they assume. I was assuming something roughly (locally) log-uniform. You assume a Pareto distribution.

When is the one distribution a more reasonable assumption than the other? This is a question which is at the heart of things, and I expect to want to think more about. At a first pass I like your suggestive analysis that (something like) the Pareto distribution is appropriate when there are many many ways to spend money in ways that are a little effective but not very. I still feel drawn to the log-uniform model when thinking about the fundamental difficulty of finding important research breakthroughs. But perhaps something like Pareto ends up being correct if we think about opportunities to fund research? There could be lots and lots of opportunities to fund mediocre research (especially if you advertise that you're willing to pay for it).

Actually the full version of this question should wrestle with needing to provide other distributions at times. In an efficient altruistic market all the best opportunities have been taken, so the top tier of remaining opportunities are all about equally good. Even if I dream up a new research area, it may to some extent funge against other types of research, so the distribution may be flatter than it would absent the work done already by the rest of the world. (This is something I've occasionally puzzled over for several years; I think your post could provide another helpful handhold for it.)

Howdy. I appreciate your reply.

By the difference in generality i meant the difficulty-based problem selection. (Or the possibility of some other hidden variable that affects the order in which we solve problems.)

 

"I was assuming something roughly (locally) log-uniform. You assume a Pareto distribution."

On a closer examination of your 2014 post, i don't think this is true. If we look at the example distribution

"Assume that an area has 100 problems, the first of difficulty 1, and each of difficulty 1.05 times the previous one. Assume for simplicity that they all have equal benefits."

and try to convert it to the language i've used in this post, there's a trick with the scale density concept: Because the benefits of each problem are identical, their cost-effectiveness is the inverse of difficulty, yes. But the spacing of the problems along the cost-effectiveness axis decreases as the cost increases. So the scale density, which would be the cost divided by that spacing, ends up being proportional to the inverse square of cost-effectiveness. This is easier to understand in a spreadsheet. And the inverse square distribution is exactly where i would expect to see logarithmic returns to scale.
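A minimal Python sketch of that spreadsheet calculation, under the example's assumptions. The variable names are illustrative, and the density estimate uses the gap to the next problem along the cost-effectiveness axis as the bin width, which is one reasonable choice among several.

```python
import math

RATIO = 1.05      # each problem is 1.05 times as difficult as the previous one
N_PROBLEMS = 100
BENEFIT = 1.0     # equal benefits, as in the quoted example

costs = [RATIO**i for i in range(N_PROBLEMS)]
cost_effectiveness = [BENEFIT / c for c in costs]

# Scale density: dollars available per unit of cost-effectiveness,
# estimated as each problem's cost divided by the gap to the next problem.
densities = [
    costs[i] / (cost_effectiveness[i] - cost_effectiveness[i + 1])
    for i in range(N_PROBLEMS - 1)
]

# If density is proportional to q**-2, then density * q**2 is constant.
scaled = [d * q**2 for d, q in zip(densities, cost_effectiveness)]
print(min(scaled), max(scaled))

# Returns to scale: problems solved (utility) recovered from cumulative cost.
total_cost = sum(costs[:50])
predicted = math.log(1 + (RATIO - 1) * total_cost) / math.log(RATIO)
print(predicted)  # number of problems solved, as a logarithm of total cost
```

The first printed pair being (nearly) equal is the inverse-square density, and the second line shows the number of problems solved growing as a logarithm of total cost.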

 

As for what distributions actually make sense in real life, i really don't know. That's more for people working in concrete cause areas to figure out than me sitting at home doing math. I'm just happy to provide a straightforward equation for those people to punch their more empirically-informed distributions into.

Of course you're right; my "log uniform" assumption is in a different space than your "Pareto" assumption. I think I need to play around with the scale density notion a bit more until it's properly intuitive.

This principle has seemingly strange implications:

  • If $\eta \ge 1$ and nothing has been done yet, then the first thing you do produces infinite utility (assuming you start by doing the best thing possible and then move to progressively worse things).
  • If $\eta \ge 1$, then a randomly-chosen opportunity has infinite expected utility.

For me this seems more useful as an implication in the other direction: economists generally treat utility functions as isoelastic[1], which implies that opportunities are Pareto-distributed.

But it's also useful as a sanity check: it's intuitive to me that utility is isoelastic, and also that opportunities are Pareto-distributed, so it's nice that these two intuitions are consistent with each other.

[1] Although this might be more out of convenience than anything else, since isoelastic utility functions have some nice mathematical properties.
