Researcher @ Founders Pledge


This is an interesting question. My answer is that I think of this exercise as a rough kind of meta-analysis, where results are combined in a (weighted) arithmetic mean.

I think the reason geometric means don’t work well in these kinds of exercises is that there are all sorts of differences and errors in individual studies that make it very likely that some of them will show zero (or negative) effect. Once this happens, your geometric mean goes to zero (or breaks). I don’t think it makes sense to say something like “if, because of noise, the effect size in one of my many studies happens to show 0% instead of 1%, my meta-analysis effect should be 0% instead of 10%.”
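A minimal sketch of this point, using made-up effect sizes (the numbers are purely illustrative and not taken from any study). It compares an unweighted arithmetic mean with a geometric mean when one study's estimate drops from 1% to 0%:

```python
import math

def arithmetic_mean(xs):
    """Plain (unweighted) arithmetic mean."""
    return sum(xs) / len(xs)

def geometric_mean(xs):
    """Geometric mean; collapses to 0 with any zero, undefined with negatives."""
    if any(x < 0 for x in xs):
        raise ValueError("geometric mean is undefined for negative values")
    if any(x == 0 for x in xs):
        return 0.0
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

# Hypothetical effect sizes in percent from five studies (illustrative only).
effects = [10.0, 12.0, 8.0, 11.0, 1.0]
# Same studies, but noise pushes the last estimate from 1% down to 0%.
noisy_effects = [10.0, 12.0, 8.0, 11.0, 0.0]

print(arithmetic_mean(effects), arithmetic_mean(noisy_effects))
print(geometric_mean(effects), geometric_mean(noisy_effects))
```

The arithmetic mean barely moves between the two lists, while the geometric mean falls all the way to zero, and a single negative estimate would leave it undefined.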

I think GW has been increasingly considering income gains as an important part of their interventions. Significant parts of the cost-effectiveness of most of their health interventions are now estimated to derive from income gains. My argument is that if we try to be somewhat consistent with how we treat the income gains from health interventions and education interventions, the education interventions end up looking very good because they likely produce much more income gain with a better evidence base.

Michael,

That is entirely fair. It's reasonable to not accept the cross-sectional results as having any information value for your prior. So I should have said that we can start with a prior from the HLI meta-analysis results (which, if I remember correctly, are pretty statistically significant). Then, when we get the information from the Easterlin and O'Connor paper, where the results are the same as our prior but not statistically significant, we can say that the new information does not shift our prior at all. So even though the Easterlin and O'Connor paper does not give us much information one way or the other, it still seems reasonable to say there is no reason to think that the results are likely to be much lower than the HLI results?

If I understand correctly, it sounds like we now agree on the math of my post, and on my arguments around which coefficients from cross-sectional vs longitudinal regressions seem to match? But I think we still disagree about whether the impacts of a gradual increase in GDP across time should be compared to cross-sectional differences?

My first thought on our disagreement is that an income doubling is a fairly arbitrary metric. I think it would be equally reasonable to zoom in on the cross-sectional graph, and look at the impact of a 1% increase in income. We can imagine country Y on the cross-sectional graph, which lies a little higher than Ethiopia on the regression line in my post. This country would have $1010 per capita GDP and a SWB of 4+1*.007=4.007, versus Ethiopia at $1000 and 4. If we compare this to what we would expect from a .007 coefficient in one of your alternative regressions, it looks like it’s exactly what we would expect from one year of 1% growth vs the counterfactual for Ethiopia? In this case we don’t need to worry about the amount of time it takes to double income, and TS and CS become more intuitively comparable?
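A rough sketch of the 1% comparison. It assumes a cross-sectional line of about 0.5 SWB points per income doubling (the slope implied by the $1000/4 and $2000/4.5 figures used elsewhere in this thread) and reads the .007 coefficient as SWB points per 1pp of growth per year; the function names are illustrative.

```python
import math

# Assumption: the cross-sectional line passes through ($1000, SWB 4) and
# ($2000, SWB 4.5), i.e. roughly 0.5 SWB points per income doubling.
CS_POINTS_PER_DOUBLING = 0.5
# Time-series coefficient: SWB points per 1pp of growth per year.
TS_COEF = 0.007

def cross_section_swb(gdp, base_gdp=1000.0, base_swb=4.0):
    """SWB predicted by the (assumed) log-linear cross-sectional line."""
    return base_swb + CS_POINTS_PER_DOUBLING * math.log2(gdp / base_gdp)

def time_series_swb(years, growth_pp, base_swb=4.0):
    """SWB after `years` of `growth_pp` extra growth, per the TS coefficient."""
    return base_swb + TS_COEF * growth_pp * years

cs = cross_section_swb(1010.0)                 # country Y, 1% richer
ts = time_series_swb(years=1, growth_pp=1.0)   # Ethiopia after one year of 1% growth
print(f"cross-section: {cs:.4f}, time series: {ts:.4f}")
```

Under these assumptions the two predictions agree to within about a thousandth of a SWB point.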

My second thought is that if we assume that TS results are not comparable to CS results because they take a long time, wouldn’t that make the existence of the Easterlin Paradox irrelevant for making any judgements about the world? Isn’t the Easterlin Paradox a paradox precisely because we expect the coefficients to match between CS and TS, but they don’t seem to in some specifications?

“we are talking about the Gallup results and ignoring the EVS/WVS results. They are preferred for long-run periods.”

Agreed. I haven’t looked at the EVS/WVS results at all, so there is a good chance that they are less sensitive to the kinds of alternative specifications I tried for the Gallup results.

“It’s possible that many people on the lower end of the income distribution benefit greatly – indeed many economists, even happiness ones, believe this in their bones. We just need more evidence at scale.”

I share the same intuition, and find this an interesting area for further exploration. I would be curious to hear your thoughts on why the “Growth X LDC” coefficients in all of your regressions are negative (which is a surprise to me). This seems to imply that people lower down the income distribution are actually benefiting less from % income increases? Re-running your regressions on just the less-developed countries in your Gallup dataset, I also get smaller coefficients than those for the whole dataset.

Thanks again for the response!

Yes, I am definitely talking about WELLBYs. I meant to say that there are two ways of looking at both income and SWB, a level at a point in time, and the sum of the levels per year (we can think of those as the area under the curve plotted across time). We can call the summed versions INCYs and WELLBYs, and the point in time estimates Income and SWB. So I think in year 13.5, we can say that we get .2 WELLBYs for 1 INCY. Or alternatively, we can say that we get .027 SWBs for 14% Income gain. I don't think that we should be comparing SWBs (a point in time estimate) to INCYs (a summation estimate).

To illustrate I’ll try to go back to the example of boosting Ethiopia’s growth by 1pp, using your coefficient of 0.002. For simplicity, let’s say that Ethiopia starts with a per capita GDP of $1000, a SWB of 4, and a real growth rate of 0%. It seems like we agree that “The population in year 13.5 reports .027 greater SWB points after an increase in growth by one percent.” So if we boost growth to 1% I think we agree that in year 71 Ethiopia would have a per capita GDP of $2000 (versus the counterfactual $1000) and a SWB = 4 + 71*.002=4.14.

Now to address our discussion on (3) in the below thread, you say: "As you point out, our results include larger coefficient estimates using different specifications, yet we still argue they are not economically significant," and then in response to my comment that "those coefficients seem to be close to what we would expect from the cross-sectional data," you comment "I don’t agree that the results are similar in size."

Let’s assume we accept the coefficient from your regression in table 3, column 5: 0.007. That would imply that in year 71 Ethiopia would have roughly twice the GDP it would have had counterfactually (compared to the 0% growth world), and a SWB = 4+71*.007 = 4.5. This is 0.5 points higher than the counterfactual.

Now let's imagine that in the cross-sectional regression Ethiopia and country X are both exactly on our regression line. Ethiopia is at $1000 and a SWB of 4, country X is at $2000 and a SWB of 4.5 (that is roughly where the cross-sectional regression lines fall, as I argue in my post, and as you can see from the graph I include). If there were no Easterlin Paradox, we would expect that if Ethiopia gradually got to $2000 GDP, it would move up the regression line to where country X currently is. But it seems like that is exactly what the .007 regression coefficient implies in the preceding paragraph? If so, is this at odds with your response on discussion (3) in the below thread?
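The 71-year arithmetic above can be checked with a short sketch. It assumes the starting values from the example ($1000 per capita GDP, SWB of 4, 0% baseline growth) and treats each coefficient as SWB points per 1pp of extra growth per year; the helper names are illustrative.

```python
def gdp_after(years, start_gdp=1000.0, growth_pp=1.0):
    """Per capita GDP after compounding `growth_pp` percent per year."""
    return start_gdp * (1 + growth_pp / 100) ** years

def swb_after(years, coef, start_swb=4.0, extra_growth_pp=1.0):
    """SWB after `years` of extra growth, linear in years as in the thread."""
    return start_swb + coef * extra_growth_pp * years

YEARS = 71
COUNTRY_X_SWB = 4.5  # country X on the cross-sectional line at $2000

print(f"GDP in year {YEARS}: {gdp_after(YEARS):.0f}")
for coef in (0.002, 0.007):
    swb = swb_after(YEARS, coef)
    print(f"coef {coef}: SWB {swb:.3f}, gap to country X {COUNTRY_X_SWB - swb:.3f}")
```

With the .007 coefficient, Ethiopia's SWB in year 71 lands almost exactly on country X's level; with .002 it reaches about 4.14.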

Alternatively, don’t the coefficients from Sacks, Stevenson, and Wolfers 2012 roughly correspond to the larger coefficient estimates in your regressions (since both include 10 year short-term fluctuations)? So if Sacks et al. convincingly reran their analysis to focus on the same countries and longer time series that you use, and got the same coefficients they did in their paper, would that not update us towards thinking that longitudinal and cross-sectional results might be similar?

I think we could also use a similar argument about the Ethiopian counterfactual SWB = 4 + 71*.002=4.14 to argue that it matches the cash transfer results that I cite in my post.

“In your spreadsheet, you multiplied 0.002 by the number of years, assuming a larger increase in SWB per year (i.e., 0.004 in year 2), which is not correct.”

I meant the .004 to represent how much happier a person is after two years of faster growth than they would have been counterfactually (if growth had been 1pp lower). Since their annual change in SWB would have been .002 higher, they would have gotten .004 better off by year 2.

In other words, I think your formulas (4)-(3) represent the impact of additional growth (versus the counterfactual) on life satisfaction at time t (SWBt). So, using your formula, 0.002*(∆G)*t = .002*1*2 = .004 happier than the counterfactual. This is only .002 happier than the counterfactual after 1 year, but .004 happier than the counterfactual if there had been no additional growth at all. So since the person was .002 happier in year 1 and .004 happier in year 2, I would consider that a cumulative .006 happier across the two years.

I think for the cumulative life satisfaction gain to be .027, you would have to expect the person in year 13.5 to only be .002 happier than he would have been without the additional growth (that way he would only be .002 happier each year, for a total of .027 life satisfaction points summed across the 13.5 years). But that would imply that our SWB measure wasn’t annualized, and that it shouldn’t matter whether you’ve been growing for one year or 1000, you would still be happier by the same amount?

Perhaps our difference is in how we are using the word cumulative? By cumulative, I mean actually summing across the counterfactual SWB gains in each of the 13.5 years. I think this is the correct thing to look at if we are comparing it to the income gains in each year summed across the 13.5 years. Perhaps by cumulative you meant just the total counterfactual impact on life satisfaction in year 13.5? But then it seems like we need to add the counterfactual impacts at each of the preceding years?
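The two readings of "cumulative" can be made concrete with a small sketch. It assumes the 0.002 coefficient and a 1pp boost to growth, as in the example above; the function names are illustrative.

```python
COEF = 0.002  # SWB points per 1pp of extra growth per year

def gap_in_year(t, extra_growth_pp=1.0):
    """SWB gap versus the counterfactual in year t (a point-in-time figure)."""
    return COEF * extra_growth_pp * t

def summed_gap(last_year, extra_growth_pp=1.0):
    """Sum of the yearly SWB gaps over years 1..last_year (a summed figure)."""
    return sum(gap_in_year(t, extra_growth_pp) for t in range(1, last_year + 1))

print(f"gap in year 2:      {gap_in_year(2):.3f}")
print(f"summed over 2 yrs:  {summed_gap(2):.3f}")
print(f"gap in year 13.5:   {gap_in_year(13.5):.3f}")
```

Here .027 is the point-in-time gap in year 13.5, while the summed figure adds the gaps from every preceding year on top of it.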

Perhaps one useful intuition pump would be to compress the whole income doubling into 1 year. Let's say annual growth increases by 100pp. Then we counterfactually double income in the first year. The impact on SWB is 100*.002=.2 life satisfaction points, which is a bit higher than the estimates from cash transfers.

Thanks so much for taking the time to engage in this discussion! I am going to try to reply to where we have interesting areas of disagreement, and to number the points for easier response.

1. “For alternative policies that similarly cover a long period of time, see recent work by me and Easterlin, "Explaining happiness trends in Europe."”

Thank you for sending this. It’s encouraging that we may have levers to move that have larger impacts than economic growth. It definitely updates me away from believing the results of the social safety net regression I outline in my post (although as I mentioned in the post, those results were never that compelling). I used Our World in Data’s “Adequacy of Social Safety Net Programs”. There were only 30 countries, and they were mostly LMICs, so I am not surprised that the results differ from yours. I would be curious what you think of that dataset, and whether the data you use looks like it avoids some of the noise in mine. I would definitely love to see more analysis on this with bigger datasets than either of ours, if there are any ways to create them. I wonder if the implication might be something like: large social safety nets are effective in European states which have a lot of state capacity to deliver services, and less effective in the LMICs in my dataset.

2. “Fundamentally, you cannot compare doubling one’s income at a point of time (e.g., due to lottery and investment returns or cash transfers) to doubling one’s income in 71 years… Empirically, the growth-happiness relation depends upon the time horizon; it gets smaller as the duration increases. We discuss this in the paper conceptually and in reference to the two data sets we use. The longer period in the WVS/EVS data results in lower growth-subjective well-being relations.”

I think this is an interesting point. If we believe in hedonic adaptation, then we would expect the results of cash transfer RCTs to be much higher than the results over 14 or 40 years like in the two datasets you use in your paper. So the fact that the implied impacts seem to be very similar seems to be (very weak) evidence against adaptation in this context? Am I right in thinking that the results in your two sets of regressions implicitly factor in adaptation, since those countries became wealthier slowly? If so, I think we should be comfortable applying the results with 14-40 years (Gallup-WVS/EVS) of adaptation factored in, to an estimate that looks at benefits spanning from 1-40 years for Ethiopia?

3. “Your replication / robustness tests are not so surprising. As you point out, our results include larger coefficient estimates using different specifications, yet we still argue they are not economically significant, implying we would argue your alternative results are still too small to prioritize growth.”

But those coefficients seem to be close to what we would expect from the cross-sectional data? If that is the case, are you suggesting that even if the Easterlin paradox turned out to not hold, we would not update towards thinking more of economic growth? That would imply that a low income country could increase their life satisfaction from around 4 to around 6 if they could figure out a way to enable the kind of catch-up growth that some East-Asian countries have managed.

4. “I’m reasonably assured you can find much more effective policies for short-run gains. See Table 1 of P. Frijters, A. E. Clark, C. Krekel, R. Layard, A happy choice: Wellbeing as the goal of government. Behav. Public Policy, 1–40 (2020).”

Thank you for sending this. Reducing fear of violent crime stands out as especially promising to me as a potential intervention. However, it does look like doubling income is still one of the larger results here, and is not obviously harder to achieve than some of the other large-effect interventions. I definitely hope that there are more tractable interventions than boosting growth that we can find. Also, even if we don’t, I think we can probably find ways to do a lot of good by just saving lives, rather than boosting well-being.

5. “Your robustness test results do not overturn our results; they fall within the range we estimate and only apply to one data set, indeed the one that is based on a shorter period, which is less preferred for reasons explained in the text.”

I agree. I only meant to try a couple of easy alternative specifications to see how sensitive the results are to them. The Gallup World Poll data had more countries and was easier to download, so I just decided to look at that dataset. If my results are correct, they are not meant to invalidate the Easterlin Paradox. I just think we should be aware that it seems sensitive to specification (even after accepting the exclusion of transition economies, and countries with fewer than 12 years of data).

6. “Perhaps you can explain to me how the GiveWell team determined the ‘Value assigned to increasing ln(consumption) by one unit for one person for one year’ and why this is used in determining the value of subjective well-being benefits.”

GiveWell assigns one unit of value to an income doubling. Since a doubling raises ln(consumption) by ln(2), boosting ln(consumption) by one full unit is simply worth 1/ln(2) ≈ 1.44 units. They then try to estimate the value of saving a life relative to an income doubling by looking at surveys of recipients, Global Burden of Disease estimates, value of statistical life approaches, internal surveys, and other sources. For the purposes of my estimation, you wouldn’t need to accept any of their assumptions except for the fact that it is difficult to find ways to help people that are more than ten times more cost-effective than simply giving cash to the very poorest people in the world.
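The conversion can be checked in a couple of lines. This is only a sketch of the arithmetic, assuming one unit of value per income doubling as described above; the function name is illustrative and not taken from GiveWell's model.

```python
import math

VALUE_OF_DOUBLING = 1.0  # units of value per income doubling, per person-year

def value_of_ln_consumption_increase(delta_ln):
    """Value of raising ln(consumption) by `delta_ln` for one person-year."""
    doublings = delta_ln / math.log(2)  # one doubling raises ln(consumption) by ln(2)
    return doublings * VALUE_OF_DOUBLING

print(f"{value_of_ln_consumption_increase(1.0):.4f}")  # value of one ln unit
```

A one-unit increase in ln(consumption) is about 1.44 doublings, which is where the 1/ln(2) figure comes from.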

Thanks again for the interesting exchange.

Thanks again for the discussion!

I agree that it’s very reasonable to look at the cumulative “cost” in terms of income doublings, rather than just the final number. But I think then you also need to look at the cumulative well-being gains. You don’t just get the life satisfaction gain of the doubling in your 71st year, you also get smaller gains every year before that, just like you do for the costs.

I’ve set up a spreadsheet based on your example of looking at the first 13.5 years to see when one cumulative income doubling has occurred. In that case, in the first year you get .002 life satisfaction points, in the second .004, until the 13.5th when you get .027. When you sum them you get a total of 0.2 life satisfaction points. You get those at a cost of one income doubling. This is actually larger than would be predicted by my approach of multiplying .002 by 71, which would imply 0.142 life satisfaction points. (The reason it’s larger is that when we are looking at time horizons like 13.5 years, the income doublings don’t really benefit that much from compounding yet, so the cost hasn’t grown quickly enough to get to the .142 threshold, which I believe happens closer to the 100-year mark).
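A rough reconstruction of that calculation, not the spreadsheet itself. It assumes that one cumulative "income doubling" means extra income summing to one year of baseline income, that growth is 1pp above a 0% counterfactual, and that the yearly SWB gap is 0.002 times the number of years elapsed. It steps in whole years, so it lands on year 14 rather than 13.5.

```python
COEF = 0.002   # SWB points per 1pp of extra growth per year
GROWTH = 0.01  # 1pp of extra growth versus a 0% counterfactual

def cumulative_until_one_doubling():
    """Accumulate yearly income and SWB gains until the summed income gain
    reaches one year of baseline income (the assumed 'one doubling' cost)."""
    cum_income = 0.0  # in units of baseline annual income
    cum_swb = 0.0     # summed yearly SWB gaps (WELLBY-style)
    year = 0
    while cum_income < 1.0:
        year += 1
        cum_income += (1 + GROWTH) ** year - 1  # income gap in this year
        cum_swb += COEF * year                  # SWB gap in this year
    return year, cum_income, cum_swb

year, cum_income, cum_swb = cumulative_until_one_doubling()
print(f"year {year}: cumulative income gain {cum_income:.3f}, summed SWB gain {cum_swb:.3f}")
```

With whole-year steps the summed SWB gain comes out at about 0.21, in line with the roughly 0.2 points quoted above.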

Thanks for the thoughtful comment. I am a little torn about valuing pure intelligence effects. On one hand it seems silly to only focus on the income effect when we know that education likely increases intelligence, quality of democratic participation, socialization, wisdom, etc., but on the other hand, when I tried to find evidence for education increasing health or life satisfaction beyond what we would expect from the income effects, I did not find much (I mention this briefly towards the end of the post). I would want to be wiser and more intelligent partly because I would expect to be able to live a more satisfied life, and to be able to make better choices that would make me happier and healthier. If the additional intelligence doesn’t seem to be actually increasing health or life satisfaction, it makes me more suspicious of the claim that it is really producing a valuable kind of intelligence or wisdom. On the other hand, I do believe that life satisfaction is only one of many morally valuable things. Maybe the (overly convenient) reconciliation of these intuitions is to say that health interventions likely have these other effects too, where a healthier person gains the ability to make more free decisions, and potentially live a more social and fuller life.