When modeling with uncertainty we often care about the expected value of our result. In CEAs (cost-effectiveness analyses), in particular, we often try to estimate $E[\text{effect}/\text{cost}]$. This is different from both $1/E[\text{cost}/\text{effect}]$ and $E[\text{effect}]/E[\text{cost}]$ (which are also different from each other).[1] The goal of this post is to make this clear.
One way to simplify this is to assume that the cost is constant. So we only have uncertainty about the effect. We will also assume at first that the effect can only be one of two values, say either 1 QALY or 10 QALYs with equal probability.
Expected value is defined as the weighted average of all possible values, where the weights are the probabilities associated with these values. In math notation, for a random variable $X$,
$$E[X] = \sum_i x_i \, P(X = x_i),$$
where $x_i$ are all of the possible values of $X$.[2] For non-discrete distributions, like a normal distribution, we replace the sum with an integral.
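Coming back to the example above, we seek the expected value of effect over cost. As the cost is constant, say $c$ dollars, we only have two possible values:
$$E\left[\frac{\text{effect}}{\text{cost}}\right] = \frac{1}{2}\cdot\frac{1}{c} + \frac{1}{2}\cdot\frac{10}{c} = \frac{5.5}{c}.$$
In this case we do have $E[\text{effect}/\text{cost}] = E[\text{effect}]/E[\text{cost}]$, but as we'll soon see that's only because the cost is constant. What about $1/E[\text{cost}/\text{effect}]$?
$$E\left[\frac{\text{cost}}{\text{effect}}\right] = \frac{1}{2}\cdot\frac{c}{1} + \frac{1}{2}\cdot\frac{c}{10} = 0.55c, \qquad \frac{1}{E[\text{cost}/\text{effect}]} = \frac{1}{0.55c} \approx \frac{1.82}{c},$$
which is not $5.5/c$ but a smaller amount.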
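The two-outcome example can be checked numerically. The sketch below is only an illustration: it assumes a constant cost of 1 (an arbitrary choice, any positive constant gives the same comparison), and the helper name `expected_value` is made up for this example.

```python
from fractions import Fraction

# Two equally likely effects (in QALYs), as in the example above.
effects = [Fraction(1), Fraction(10)]
probs = [Fraction(1, 2), Fraction(1, 2)]
cost = Fraction(1)  # assumed constant cost; the value 1 is an arbitrary choice


def expected_value(values, probs):
    """Probability-weighted average of a discrete random variable."""
    return sum(p * v for v, p in zip(values, probs))


# E[effect / cost]
e_effect_over_cost = expected_value([e / cost for e in effects], probs)
# E[effect] / E[cost] (the cost is constant, so E[cost] = cost)
e_effect_over_e_cost = expected_value(effects, probs) / cost
# 1 / E[cost / effect]
inv_e_cost_over_effect = 1 / expected_value([cost / e for e in effects], probs)

print(float(e_effect_over_cost))      # 5.5
print(float(e_effect_over_e_cost))    # 5.5
print(float(inv_e_cost_over_effect))  # roughly 1.82
```

Exact fractions are used so that the equality of the first two quantities, and the gap to the third, are not blurred by floating-point rounding.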
The point is that generally $E[1/X] \neq 1/E[X]$. In fact, for a positive random variable $X$ we always have $E[1/X] \geq 1/E[X]$, with equality if and only if $X$ is constant.[3]
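For readers who want to see why the inequality holds without the general machinery, here is a short sketch using the Cauchy-Schwarz inequality. It assumes a positive random variable $X$ for which both $E[X]$ and $E[1/X]$ are finite:

$$1 = \left(E\left[\sqrt{X}\cdot\frac{1}{\sqrt{X}}\right]\right)^2 \leq E[X]\cdot E\left[\frac{1}{X}\right] \quad\Longrightarrow\quad E\left[\frac{1}{X}\right] \geq \frac{1}{E[X]}.$$

Equality in Cauchy-Schwarz requires $\sqrt{X}$ to be proportional to $1/\sqrt{X}$ (with probability 1), which happens exactly when $X$ is constant.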
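Another common and useful example is when $X$ is lognormally distributed with parameters $\mu, \sigma^2$. That means, by definition, that $\ln X$ is normally distributed with expected value and variance $\mu, \sigma^2$ respectively. The expected value of $X$ itself is a slightly more complicated expression:
$$E[X] = e^{\mu + \sigma^2/2}.$$
Now the fun part: $1/X$ is also lognormally distributed! That's because $\ln(1/X) = -\ln X$. Its parameters are $-\mu, \sigma^2$ (why?) and so we get
$$E[1/X] = e^{-\mu + \sigma^2/2}.$$
In fact, we see that the ratio between these values is
$$\frac{E[1/X]}{1/E[X]} = E[1/X]\cdot E[X] = e^{\sigma^2}.$$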
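A quick simulation can serve as a sanity check on the lognormal case. The sketch below assumes $\ln X$ is normal with mean $\mu$ and variance $\sigma^2$, and compares sample averages with the closed forms $E[X] = e^{\mu + \sigma^2/2}$ and $E[1/X] = e^{-\mu + \sigma^2/2}$. The parameter values, sample size, and seed are arbitrary choices made only for illustration.

```python
import math
import random

mu, sigma = 0.0, 0.5    # hypothetical parameters, chosen only for illustration
n = 200_000             # number of Monte Carlo samples
rng = random.Random(0)  # fixed seed so the run is reproducible

# lognormvariate(mu, sigma) draws X such that ln X ~ Normal(mu, sigma**2)
samples = [rng.lognormvariate(mu, sigma) for _ in range(n)]

mean_x = sum(samples) / n
mean_inv_x = sum(1 / x for x in samples) / n

# Estimates E[1/X] / (1/E[X]) = E[1/X] * E[X]
empirical_ratio = mean_inv_x * mean_x
theoretical_ratio = math.exp(sigma ** 2)

print("E[X]:   sample", mean_x, "closed form", math.exp(mu + sigma ** 2 / 2))
print("E[1/X]: sample", mean_inv_x, "closed form", math.exp(-mu + sigma ** 2 / 2))
print("ratio:  sample", empirical_ratio, "closed form", theoretical_ratio)
```

The sample ratio should land close to $e^{\sigma^2}$ and, in particular, above 1, so $E[1/X]$ exceeds $1/E[X]$ by a factor that grows with the uncertainty $\sigma$.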
[1] See "Probability distributions of Cost-Effectiveness can be misleading" for relevant discussion. There are arguably reasons to care about the two alternatives $E[\text{costs}]/E[\text{effects}]$ or $E[\text{effects}]/E[\text{costs}]$ rather than $E[\text{effects}/\text{costs}]$, which are left for a future post.
[2] One way to imagine this is that if we sample $X$ many times we will observe each possible value $x_i$ in roughly a fraction $P(X = x_i)$ of the samples. So the expected value would indeed generally be approximately the average value of many independent samples.
[3] Due to Jensen's inequality.
Have you figured out exactly when "E[costs]/E[effects] or E[effects]/E[costs]" is called for? I have historically agreed with the point you are making, but my beliefs have been shaken recently. Here's an example that has made me think twice:
You are donating $100 to a malaria charity and can choose between charity A and B. Charity A gets bednets for $1 each. Charity B does not yet know the cost of its bednets, but they will cost either $0.50 or $1.50 with equal probability.
Donating to charity A has a value of 100 bednets. Donating to charity B has expected value 133 bednets (equal chance of buying 200 or 66). But "E[costs]/E[effects] or E[effects]/E[costs]" is the same for each charity. In this case, E[effect/cost] seems like the right metric.
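The arithmetic in this example can be reproduced with a short sketch. The function names below are made up for illustration, and the only inputs are the $100 budget and the per-bednet costs stated above.

```python
from fractions import Fraction

budget = Fraction(100)  # dollars donated

# (cost per bednet, probability) pairs for each charity, from the example above
charity_a = [(Fraction(1), Fraction(1))]
charity_b = [(Fraction(1, 2), Fraction(1, 2)), (Fraction(3, 2), Fraction(1, 2))]


def expected_nets(outcomes):
    """E[budget / cost]: expected number of bednets bought with a fixed budget."""
    return sum(p * budget / cost for cost, p in outcomes)


def nets_at_expected_cost(outcomes):
    """budget / E[cost]: bednets bought if the cost equalled its expectation."""
    return budget / sum(p * cost for cost, p in outcomes)


for name, outcomes in [("A", charity_a), ("B", charity_b)]:
    print(name, float(expected_nets(outcomes)), float(nets_at_expected_cost(outcomes)))
```

For charity B the first quantity is 400/3 (about 133 bednets) while the second is 100, which is the gap the example is pointing at. For charity A the two coincide because the cost is certain.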
So is the difference the fact that total costs are fixed? Someone deciding whether to start an organization or to commit to fully funding a new intervention would have to contend with variable, unknown total costs.
Is it because funding charity A involves buying 100 "shares" in an intervention, and funding charity B involves buying either 200 or 66 "shares", which "E[costs]/E[effects] or E[effects]/E[costs]" fails to capture?