(Content warning: this post mentions a question from the 2024 EA Survey. If you haven't answered it yet and plan to do so, please do that first before reading on)
The 2024 EA Survey asks people which of the following interventions they prefer:
1. An intervention that averts 1,000 DALYs with 100% probability
2. An intervention that averts 100,000 DALYs with 1.5% probability
In theory, this is a simple question: intervention (2) has 50% more expected value, since 100,000 × 1.5% = 1,500 expected DALYs averted, versus 1,000 × 100% = 1,000 for intervention (1).
In practice, I believe the premise is absurd, the kind of setup that never occurs in real life. How would you know that the probability of an intervention working is exactly 1.5%?
My rule of thumb is that most real-world probabilities could be off by a percentage point or so. Note that this is different from being 1% too high or too low in relative terms; I mean an entire percentage point. For the survey question, it might well be that intervention (1)'s success rate is only 99%, and intervention (2)'s success rate could be anywhere in the low single-digit percentages.
I don't have a good justification for this rule of thumb[1]. Part of it is probably psychological: humans are most familiar with coarse concepts like "rare". We occasionally use percentages but rarely (no pun intended) use permilles or smaller units. Part of it is technical: small probabilities are hard to measure directly, so they are derived from a model. The model is imperfect, and its inputs are likely to be imprecise.
For intervention (1), my rule of thumb does not have a large effect on the overall impact. For intervention (2), the effect is very large[2]. This is what makes the survey question so hard to answer, and the answers so hard to interpret.
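To make this concrete, here is a minimal sketch (a Python toy, with my ±1 percentage point rule of thumb baked in as the assumption) of how the expected value of each intervention shifts:

```python
# A toy model of my rule of thumb: the stated success probability
# could be off by an entire percentage point in either direction.
def expected_dalys_range(dalys_averted, stated_prob, error=0.01):
    """Range of expected DALYs averted if the true probability
    lies within +/- `error` of the stated probability."""
    low = max(stated_prob - error, 0.0)
    high = min(stated_prob + error, 1.0)
    return dalys_averted * low, dalys_averted * high

print(expected_dalys_range(1_000, 1.00))    # (990.0, 1000.0)
print(expected_dalys_range(100_000, 0.015)) # (500.0, 2500.0)
```

Intervention (1)'s expected value barely moves; intervention (2)'s spans a factor of five.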
There are, of course, established ways to deal with this mathematically. For example, one could use a portfolio approach that allocates some fraction of resources to intervention (2). Such strategies are valuable, even necessary, for dealing with this type of question. As a survey respondent, though, I felt frustrated at having just two options. The question creates a false sense that "all you need is expected value"; it asks for a black-and-white answer where reality has many shades.[3]
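For illustration, here is a hypothetical sketch of such a portfolio. It assumes, unrealistically, that impact scales linearly with the fraction of resources allocated; the function and the split are mine, not the survey's:

```python
# Hypothetical portfolio: a fraction f of resources goes to the
# risky intervention (2), and 1 - f to the certain intervention (1).
# Assumes impact scales linearly with resources, which is rarely
# true for real interventions.
def portfolio_expected_dalys(f, safe=(1_000, 1.00), risky=(100_000, 0.015)):
    (safe_dalys, safe_prob), (risky_dalys, risky_prob) = safe, risky
    return (1 - f) * safe_dalys * safe_prob + f * risky_dalys * risky_prob

for f in (0.0, 0.5, 1.0):
    print(f, portfolio_expected_dalys(f))  # -> 1000.0, 1250.0, 1500.0
```

Even this toy version shows how a mixed allocation hedges against the 1.5% estimate being badly off, instead of forcing an all-or-nothing choice.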
My recommendation and plea: please communicate humbly, especially when using very low probabilities. Consider that all your numbers, but low probabilities especially, might be inaccurate. When designing thought experiments, keep them as realistic as possible, so that they elicit better answers. This reduces misunderstandings, pitfalls, and potentially compounding errors, and it produces better communication overall.
- I welcome pointers to research about this! ↩︎
- The effect is large in the sense that the expected intervention value could be anywhere from 500 to 2,500 DALYs. However, the expectation of the expected intervention value does not change if we just add symmetric error margins. ↩︎
- Caveat: I don't know what the survey question was intended to measure. It might well be a good question, given its goal. ↩︎
OP here :) Thanks for the interesting discussion that the two of you have had!
Lukas_Gloor, I think we agree on most points. Your example of estimating a low probability of a medical emergency is great! And I reckon you are communicating appropriately about it. You're probably telling your doctor something like "we came because we couldn't rule out complication X" rather than "we came because X has a probability of 2%" ;-)
You also seem to be well aware of the uncertainty. Your situation does not feel like one where you went to the ER 50 times, were sent home 49 times, and developed good calibration from that experience. It looks more like a situation where you know about danger signs that could be caused by emergencies, and have some rules like "if we see A and B and not C, we need to go to the ER".[1]
Your situation and my post both involve low probabilities in high-stakes situations. That said, the goal of my post is to remind people that this type of probability is often uncertain, and that it should be communicated with appropriate humility.