The overgeneralisation is *extremely* easy to make. Just search "effective altruism" on Twitter right now. :'( (n.b., not recommended if you care about your own emotional well-being.)

I doubt that any effective altruists would say that our wellbeing (as benefactors) doesn't matter. Nor is there any incompatibility between the basic ideas (or practice) of effective altruism, on the one hand, and the claim that there are limits on our duties to help others, on the other.

Ah, I think we've got different notions of probability in mind: the subjective credence of the agent (OpenPhil grantmakers) versus something like the objective chances of the thing actually happening, irrespective of anyone's beliefs.

OpenPhilanthropy's "hits based giving" approach seems like it doesn't fall prey to your argument, because they are willing to ignore the "Don't Prevent Impossible Harms" constraint.

For what it's worth, I don't think this is true (unless I'm misinterpreting!). Preferring low-probability, high-expected value gambles doesn't require preferring gambles with probability 0 of success.

Thanks for a brilliant post! I really enjoyed it. And in particular, as someone unfamiliar with the computational complexity stuff, your explanation of that part was great!

I have a few thoughts/questions, most of them minor. I'll try to order them from most to least important.

**1. The recommendation for Good-Old-Fashioned-EA**

If I'm understanding the argument correctly, it seems to imply that real-world agents can't assign fully coherent probability distributions over Σ *in general*. So, if we want to compare actions by their prospects of outcomes, we just can't do so. (By any plausible decision theory, not just expected value theory.) The same goes for the action of saving a drowning child--we can't give the full prospect of how that's going to turn out. And, at least on moral theories that say we should sometimes promote the good (impartially wrt time, etc), consequentialist theories especially, it seems that it's going to be NP-hard to say whether it's better to save the child or not save the child. (*cf* Greaves' suggestion wrt cluelessness that we're *more clueless* about the effects of near-term interventions than those of long-term interventions) So, why is it that the argument doesn't undermine those near-term interventions too, at least if we do them on 'promoting-the-good' grounds?

**2. Broader applications**

On a similar note, I wonder if there are much broader applications of this argument than just longtermism (or even for promoting the good in general). Non-consequentialist views (both those that *sometimes* recommend against promoting the good and those that place *no* weight on promoting the good) are affected by uncertainty too. Some rule-absolutist theories in particular can have their verdicts swayed by extremely low-probability propositions--some versions say that if an action has *any* non-zero probability of killing someone, you ought not do it. (Interesting discussion here and here) And plausible versions of views that recognise a harm-benefit asymmetry run into similar problems of many low-probability risks of harm (see this paper). Given that, just how much of conventional moral reasoning do you think your argument undermines?

(FWIW, I think this is a really neat line of objection against moral theories!)

**3. Characterising longtermism**

The definition of Prevent Possible Harms seemed a bit unusual. In fact, it sounds like it might violate Ought Implies Can just by itself. I can imagine there being some event *e* that might occur in the future, for which there's no possible way we could make that *e* less likely or mitigate its impacts.

On a similar note, I think most longtermist EAs probably wouldn't sign up to that version of PPH. Even when *e can* be made less likely or less harmful, they wouldn't want to say we should take costly steps to prevent such an *e* *regardless* of how costly those steps are, and regardless of how much they'd affect *e*'s probability/harms.

Also, how much more complicated would it be to run the argument with the more standard definition of "deontic strong longtermism" from p26 of Greaves & MacAskill? (Or even just their definition of "axiological strong longtermism" on p3?)

Related: the line "*a worldview that seeks to tell us what we ought to do, and which insists that extreme measures may need to be taken to prevent low-probability events with potentially catastrophic effects*" seems like a bit of a mischaracterization. A purely consequentialist longtermist might endorse taking extreme measures, but G&M's definition is compatible with having absolute rules against doing awful things--it allows that we should only do what's best for the long term in decision situations where we don't need to do awful things to achieve it, or even just in decisions of which charity to donate to. (And in *What We Owe The Future*, Will explicitly advocates against doing things that commonsense morality says are wrong.)

**4. Longtermism the idea vs. what longtermists do in practice**

On your response to the first counterargument ("...*the imperative to avoid and mitigate the possibility of catastrophic climate change is not uniquely highlighted by longtermist effective altruists...Indeed, we have good evidence that we are already experiencing significant negative impacts of climate change (Letchner 2021), such that there is nothing especially longtermist about taking steps now to reduce climate change...*" etc), this doesn't seem like an objection to longtermism actually being true (at least as Greaves & MacAskill define it). It sounds like potentially a great objection to working on AI risk or causes with even more speculative evidence bases (some wild suggestions here). But for it to be *ex ante* better for the far future to work on climate change seems perfectly consistent with the basic principle of longtermism; it just means that a lot of self-proclaimed longtermists aren't actually doing what longtermism recommends.

**5. What sort of probabilities?**

One thing I wasn't clear on was what sort of probabilities you had in mind.

If they're objective chances: The probabilities of lots of things will just be 0 or 1, perhaps including the proposition about AI risk. And objective chances already don't seem action-guiding---there are plenty of decision situations where agents just won't have any clue what the objective chances are (unless they're running all sorts of quantum measurements).

If they're subjective credences: It seems pretty easy for agents to figure out the probability of, say, AI catastrophe. They just need to introspect about how confident they are that it will/won't happen. But then I think (but am unsure) that the basic problem you identify is that it would take way too much computation (more than any human could ever do) to figure out if those credences are actually coherent with all of the agent's other credences. And, if they're not, you might think that all possible decision theories just break down. Which is worrying! But it seems like, if we can put together a decision theory for incoherent probability distributions / bounded agents, then the problem could be overcome, maybe?

If they're evidential probabilities (of the Williamson sort, relative to the agent's evidence): These seem like the best candidate for being the normatively relevant sort of probabilities. And, if that's what you have in mind, then it makes sense that agents can't do all the computation necessary to work out what all the evidential probabilities are (which maybe isn't a new point---it seems pretty widely recognised that doing Bayesian updating on everything would be way too hard for human agents).

**6. "for all"**

I think you've mostly answered this with the first counterargument, but I'll ask anyway.

In the definitions of No Efficient Algorithm, PIBNETD-Harms, Independence of Bad Outcomes, and the statement of Dagum & Luby's result, I was confused about the quantifiers. Why are we interested in the computational difficulty of this *for any* value of δ, *for any* belief network, *for any* proposition/variable *V*, and (for estimation) *for any* assignment of *w* to variables? Why not just the actual value of δ, the agent's actual belief network, and the actual propositions for which we're trying to figure out whether they have non-zero probability? I don't quite understand how general this needs to be to say something very specific like "There's a non-zero probability that a pandemic will wipe out humanity".

Here's my more general confusion, I think: I don't quite intuitively understand why it's computationally hard to look up the probability of something if you've already got the full probability distribution over possible outcomes. Is it basically that, to do so, we have to evaluate Δu(V) across lots and lots of different possible states? Or is it the difficulty of thinking up every possible way the proposition could be true and every possible way it could be false and checking the probability of each of those? (Apologies for the dumb question!)

**7. Biting the fanaticism bullet**

(Getting into the fairly minor comments now)

I don't think you need to bite the fanaticism bullet for your argument. At least if I'm roughly understanding the argument, it doesn't require that we care about *all* propositions with non-zero probability, no matter how low their probability. Your response to the 3rd counterargument seems to get at this: we can just worry about propositions with absolute harms/benefits below some bound (and, I'm guessing, with probabilities above some bound) and we still have an NP-hard problem to solve. Is this right?

This is mainly a dialectical thing. I agree that fanaticism has good arguments behind it, but still many decision theorists would reject it and so would most longtermist EAs. It'd be a shame to give them the impression that, because of that, they don't need to worry about this result!

**8. Measuring computation time**

I was confused by this: "In general, this means we can make efficient gains in the accuracy of our inferences. Setting δ=10^−4, if it takes takes approximately 1 minute to generate an estimate with a margin for error of ϵ=.05, then achieving a margin for error of ϵ=.025 will take four minutes."

To be able to give computation times like 1 minute, do you have a particular machine in mind? And can you make the general point that "the time it takes goes up by a factor of 4 if we reduce the margin of error from x to y"?
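To spell out the machine-independent version of the point I have in mind, here's a rough sketch. It assumes (my assumption, not necessarily the bound the post or Dagum & Luby rely on) that the estimate is a sample mean of values in [0, 1], so that Hoeffding's inequality gives a sufficient sample count of n ≥ ln(2/δ) / (2ε²). The function name `samples_needed` is just made up for illustration. On that assumption, halving ε multiplies the required number of samples by about 4, whatever machine you run it on.

```python
import math


def samples_needed(epsilon: float, delta: float) -> int:
    """Sufficient sample count under Hoeffding's inequality (an assumed bound).

    For i.i.d. samples bounded in [0, 1], this many samples guarantees the
    sample mean is within `epsilon` of the true mean with probability at
    least 1 - delta.
    """
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))


delta = 1e-4
n_coarse = samples_needed(0.05, delta)   # margin of error 0.05
n_fine = samples_needed(0.025, delta)    # margin of error 0.025
ratio = n_fine / n_coarse                # roughly 4, up to rounding

print(f"samples at eps=0.05:  {n_coarse}")
print(f"samples at eps=0.025: {n_fine}")
print(f"ratio: {ratio:.3f}")
```

If running time is proportional to the number of samples, the "1 minute vs. four minutes" figures are just one instance of this ratio, and the absolute times depend on the hardware.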

**Typos/phrasing**

In the definition of Don't Prevent Impossible Harms, I initially misread "*For any event e that will not occur in the future*" as being about what will actually happen, rather than about what could or couldn't possibly happen. Maybe change the phrasing?

On the Ought Implies Can point, specifically "*Moreover, Don't Prevent Impossible Harms follows from the idea that "ought implies can" (Kant, 1781); if e won't occur, then it's not possible for us to make it any less likely, or to mitigate negative outcomes that occur because e occurs, and so we cannot be compelled to attempt to do so. To illustrate, if Venus were to suddenly deviate from its orbit tomorrow and collide with Earth, this would presumably lead to a very large aggregate reduction in utility on Earth. But Venus won't do that...*": Ought Implies Can implies the version of Don't Prevent Impossible Harms that you give (put in terms of reducing the probability), but it doesn't imply that we shouldn't *prevent* such harms. After all, if Venus is definitely not going to do that, then any action we take might (arguably) be said to 'prevent' it!

When you say "*If P(V=1)>0, then there is real number δ such that, if Δu(V)<δ, then those agents ought to take costly steps now to make it less likely that V=1*" (and mention Δu(V) elsewhere), shouldn't it be "Δu(V)>δ", since Δu(V) is a measure of the difference in value, and only if that difference is great enough should agents take costly steps?

Typo: "*Dagum and Luby's result shows that this cannot be done in efficiently in the general case."*

Thanks again for the post!

Yep, we've got pretty good evidence that our spacetime will have infinite 4D volume and, if you arranged happy lives uniformly across that volume, we'd have to say that the outcome is better than any outcome with merely finite total value. Nothing logically impossible there (even if it were practically impossible).

That said, assigning value "∞" to such an outcome is pretty crude and unhelpful. And what it means will depend entirely on how we've defined ∞ in our number system. So, what I think we should do in such a case is not say V equals such and such. Instead, ditch the value function once you've left the domain where it works, and just deal with your set of possible outcomes, your lotteries (probability measures over that set), and a betterness relation which might sometimes follow a value function but might also extend to outcomes beyond the function's domain. That's what people tend to do in the infinite aggregation literature (including the social choice papers that consider infinite time horizons), and for good reason.

That'd be fine for the paper, but I do think we face at least some decisions in which EV theory gets fanatical. The example in the paper - Dyson's Wager - is intended as a mostly realistic such example. Another one would be a Pascal's Mugging case in which the threat was a moral one. I know I put P>0 on that sort of thing being possible, so I'd face cases like that if anyone really wanted to exploit me. (That said, I think we can probably overcome Pascal's Muggings using other principles.)

Even without precisely quantifying the harms each way, I think we can be pretty confident that the harms on one side are greater than on the other. It seems pretty clear that the harms of letting a non-trivial number of people experience sexual harassment and assault (or even the portion of those harms prevented by implementing a strong norm about this) are greater than the harms of preventing (even 100x as many) people from sleeping around within the community. The latter is just a far, far smaller harm per person--far less than 1/100 as great. And I think the same verdict holds even if the latter harm is concentrated mainly on neurodivergent people. And it holds even more clearly if we add on (to the first type of harm) the further harms of making the community less welcoming or uncomfortable for many more people than just those who directly experience harassment or assault.

(But, if there are at-least-as-effective ways to prevent the former harms, without imposing the latter harms, then this isn't very relevant.)