In your post, I think your concerns are in two categories:
Issue A. Not tracking the effects on recipients (or more likely, initially trying to track them but finding no positive statistical effects and dropping data collection).
Indeed, AMF had a plan to monitor malaria case rates before and after distributions to prove their effectiveness. However, when they actually collected the data they concluded the data was of poor quality and so abandoned this plan...I find this very worrying. Maybe the data was of poor quality, but that is a reason for working harder in this area rather than abandoning it altogether. In general, if we only have poor quality data about malaria in a region, doesn’t that mean we do not know how effective a bednet distribution will be?
Issue B. At the country level (not monitoring recipients of AMF nets but malaria levels in countries), there is no/limited/mixed evidence for malaria reduction:
Taking a step back from the Against Malaria Foundation to look at the malaria problem more generally, there is mixed evidence that bed net distributions reduce malaria case rates. GiveWell has a macro review of the evidence which shows at the nation-level you cannot demonstrate any impact from all malaria control initiatives.
...Malaria rates in Benin, DRC, Ghana, Mali & Sierra Leone increased as net coverage increased, which is more evidence that the malaria data being used is not great. In central Africa malaria was trending downwards before bednet coverage was scaled up, further muddying the waters when trying to measure impact.
“Available data and studies appear to show some cases of apparent malaria control success, and also seem to indicate that the overall burden of malaria in Africa is more likely to be falling than rising. However, in most cases it is difficult to link changes in the burden of malaria to particular malaria control measures, or to malaria control in general, and the data remains quite limited and incomplete, such that we cannot confidently say that the burden of malaria has been falling on average.”
What you wrote is a complete and well-reasoned line of thought, drawn from careful study of the AMF website.
However, this is not sufficient evidence for strong updates against AMF.
For me, it's not even enough evidence that would cause me to investigate this issue further.
The root issue/crux is that the "causal inference"/"causal identification" power of the statistics collected here, that is, the information you can actually get from them, is very low, and far from a model of impact or finding the Truth.
Some perspectives:
Issue A: For the first issue, where tracking recipients was ineffective (or as you suggest and I would also find plausible, they found no statistical effect and then data collection was dropped), I don't know more than what you wrote, but finding no effects is plausible, even common, in highly successful interventions.
- The statistical power may be very low. To get intuition for this, remember that a life saved costs $5000 in expectation and a bednet costs ~$2. In some real statistical sense, you literally need thousands of bednets to get one "observation" of a death averted. So you may need hundreds of thousands, or really millions, of bednets to get enough observations for statistical power. But that's just one layer of the difficulty, and it assumes perfectly balanced treatment/control groups and demographics; you may need an order of magnitude more observations to do a proper observational study. Even generously, that's a large fraction of all the bednets distributed in a year. From this problem alone, my prior would be to find no effect, and I would also expect the study to impose large operational costs that many donors would find unacceptable (I would).
- The above assumes a pretty clean, controlled test environment. E.g. two villages, one with bednets and one without, or really, two children in the same household, where one sleeps under a bednet and one doesn't. This isn't going to happen in the actual program, and the measured effects can be wildly different if not controlled.
- Examples of random stories that are going to mess up inference: a principled bednet distributor might give nets to poorer families, or to families that have sicker children and adults. Since everyone probably knows bednets are effective, wealthier families might get their own (which is good, AMF can give to the really poor), and these wealthy families might get more premium bednets and treatments (e.g. $10 instead of $2), so you don't have a clean comparison group.
- There are even more pathological stories that mess up your inference: if you were a skilled implementor, working in this program on the ground for many years, and you know you only have 100 bednets for 1000 people (maybe because the EAs got captured by the AI/futurist memes which diverted all the billionaire funds), it's possible that you know, from working on the ground, that who gets the bednets is very important, like by a factor of 2 or 4. That is, if you give the bednets to the right people you can make them 2 to 4 times as cost-effective. By definition, this skill isn't legible in some survey. So your very skill in giving bednets to the worst-off families, those most afflicted by malaria, means that someone looking at the data will go "hey, when we collect data on recipients of malaria nets, these families don't look any better off, let's cancel this."
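To make the statistical power point in the first bullet concrete, here is a rough back-of-envelope sketch in Python. It uses the $5000 and ~$2 figures from above and a textbook normal-approximation sample size formula for comparing two proportions. The baseline death risk and the "one net protects one person" simplification are made-up assumptions for illustration only, not AMF or GiveWell numbers, and the function names are hypothetical.

```python
from math import ceil
from statistics import NormalDist

COST_PER_LIFE_SAVED = 5000.0  # USD, figure quoted in the comment above
COST_PER_NET = 2.0            # USD, figure quoted in the comment above


def nets_per_life_saved(cost_per_life=COST_PER_LIFE_SAVED,
                        cost_per_net=COST_PER_NET):
    """Nets distributed per expected death averted."""
    return cost_per_life / cost_per_net


def sample_size_per_arm(p_control, p_treated, alpha=0.05, power=0.80):
    """Normal-approximation sample size for a two-sided test of two proportions."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_control - p_treated) ** 2)


# HYPOTHETICAL: risk of death over the study period without a net.
BASELINE_DEATH_RISK = 0.005

# ASSUMPTION: one net protects one person, so each net lowers that
# person's death risk by 1 / (nets per life saved).
risk_reduction_per_net = 1 / nets_per_life_saved()

n_per_arm = sample_size_per_arm(
    BASELINE_DEATH_RISK,
    BASELINE_DEATH_RISK - risk_reduction_per_net,
)

print(f"Nets per expected life saved: {nets_per_life_saved():.0f}")
print(f"People needed per arm (idealized randomized comparison): {n_per_arm:,}")
```

With these made-up inputs the requirement lands in the hundreds of thousands of people per arm, and that is for a perfectly randomized comparison. The selection stories in the later bullets would only push the number up.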
Issue B: Cross country effects
- Cross-country examinations suffer from all of the issues above, but are even weaker. For example, climate trends, poverty, and institutional change are all ongoing forces that will mess up results, and even this description is a crude gesture at the realities of what is going on. What about other ways malaria can be contracted, besides sleeping in a bednet-eligible bed?
- These confounding effects mean that nation-level studies might never find an effect at all, even with very effective interventions. One new major crux is how much bednet coverage there is in a country. Again, I don't know anything about this beyond reading your post, but if bednet coverage is 10% or even 30%, that may not be enough to find an effect even if bednets were 100% effective.
- That's assuming that bednets were 100% effective. If bednets were only 1% effective (which, by the way, still makes them completely worth it and is consistent with the CEA of $5000 per life for a ~$2 bednet), you may never be able to find an effect from an observational study.
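The coverage point in the last two bullets is just multiplication, and a tiny sketch shows how quickly the national-level signal shrinks. It assumes, as a simplification, that only people under a net benefit (no community-wide spillover) and that malaria burden is spread evenly across the population. The 20% year-to-year noise figure is a made-up placeholder, not a measured value.

```python
def national_reduction(coverage, effectiveness):
    """Relative drop in national malaria burden if only net users benefit.

    ASSUMPTION: no spillover to non-users and uniform burden, so the
    national effect is simply coverage times per-user effectiveness.
    """
    return coverage * effectiveness


# HYPOTHETICAL: standard deviation of year-to-year swings in reported
# national malaria burden (climate, reporting changes, and so on).
NOISE_SD = 0.20

# (coverage, per-user effectiveness) pairs taken from the bullets above.
scenarios = [(0.10, 1.00), (0.30, 1.00), (0.30, 0.01)]

for coverage, effectiveness in scenarios:
    signal = national_reduction(coverage, effectiveness)
    print(
        f"coverage={coverage:.0%} effectiveness={effectiveness:.0%} "
        f"national drop={signal:.2%} signal/noise={signal / NOISE_SD:.2f}"
    )
```

Under these assumptions, even 30% coverage with perfectly effective nets produces a national change comparable in size to the assumed noise, and at 1% effectiveness the change is far smaller than the noise.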
Basically, cross-country regressions aren't good unless they are embedded in a strong model/context, and this domain is sort of an "also ran" in economics.
Again, what you wrote is a complete and well-reasoned line of thought, drawn from careful study of the AMF website.
You said:
we may be ignoring evidence that the world is more complex than we thought, something which effective altruists ignore at their peril.
To be clear, let's flip the evidence the other way around:
Imagine someone who came to you for money for a new project or new business. This person didn't understand the intervention, didn't understand the country or people. All they present is an argument they read from papers, with just country level observational data, or data from someone who they didn't know, who collected some data giving nets to families.
If you were being asked to give money to this person, this information would not be enough to trust them (and it might even be wise to distrust them if this was the only argument they were able to present).
(I only skimmed your post, and it has been some time since I've read either the GiveWell intervention reports or the studies they draw from)
I appreciate attempts to criticize/red-team existing EA organizations and EA evaluations of interventions. That said, this argument mostly falls flat for me.
My understanding is that the structure of the GiveWell recommendation for the Against Malaria Foundation (AMF) is really quite simple:
These arguments are not iron-clad. For example, for #1, maybe you think insecticidal bednets are so a priori implausible as an anti-malaria intervention that you would not trust any level of RCT evidence? But this just falls flat to me, as "bednets that block/kill mosquitoes make it harder for malarial mosquitoes to bite kids at night" passes some very simple sanity checks, at least for me. (Or perhaps you think drawing GiveWell's conclusion from the RCTs is statistically wrong, because of reasons? If so, it'd be good to list the reasons!)
You might also doubt that #2 is relevant if you're skeptical that AMF can achieve results similar to those implied by the RCTs. For example, you might think the places AMF works in are so "out-of-distribution" relative to the RCTs, because of lower malarial load[1], that the RCT results don't transfer. But my understanding is that a) the GiveWell analysis accounts for this and b) the malarial loads aren't that different.
There are a number of other objections that would engage with the argument structure, which I won't go into here.
However, your critique does not engage with the structure of the argument, and instead[2] argues that because there's no direct empirical evidence of AMF's specific distribution of bednets saving lives, we cannot assume that AMF's bednets save lives.
I currently think your post is an overly myopic treatment of the evidence. For a better extension by my lights, I'd be interested to see more engagement from you on whether the structure of the original argument is wrong, or alternatively, why you think your alternative formulation/framework of the problem ought to be the preferred one. I would also be interested in a very different critique of AMF that takes GiveWell's structure as a given but argues that by those lights, AMF is not a good donation target (eg because the intervention research is actually shoddy, or because AMF is actually bad at delivering bednets).
[1] My understanding is that, in contrast, substantially lower worm load is a serious reason to be skeptical of the present-day impact of deworming interventions.
[2] You also argue that there's observational data against AMF's effectiveness because the countries they work in don't have obviously lower malarial loads. However I think causality is just pretty hard to determine from observational data, for reasons Charles mentions here.
Thanks Linch, interesting thoughts.
To clarify, my point is not just that there's no direct empirical evidence of AMF's specific distributions saving lives. My point is that there is no direct evidence of any non-RCT/"real world" distributions saving lives.
Further, this is not because nobody is looking for such evidence. GiveWell's macro review of the evidence suggests every time somebody has looked for evidence of non-RCT/"real world" distributions saving lives they've come up with nothing.
I agree with your summary of the GiveWell argument (strong RCT evidence + AMF as competent distributor). However, in order to turn these two facts into a prediction about the future, we need to add the assumption that the RCT evidence applies to future distributions. This is the weak link in the chain. As you say, differences in malarial load could distort things. Differences in the underlying health of the population, differences in net usage, and increasing insecticide resistance are other contenders, along with many more I'm sure. If we can't see any evidence of impact after distributing hundreds of millions of bednets, then it seems reasonable to question whether this key assumption is leading us astray.