9461 karma · Joined May 2016


Welfare and moral weights



Now suppose that there's only you, and you're about to flip a coin to decide if you'll go to study bednets or deworming. You'd prefer to commit to not then switching to the other thing.

Maybe? I'm not sure I'd want to constrain my future self this way, if it won't seem best/rational later. I don't very strongly object to commitments in principle, and it seems like the right thing to do in some cases, like Parfit's hitchhiker. However, those assume the same preferences/scale after, and in the two envelopes problem, we may not be able to assume that. It could look more like preference change.

In this case, it looks like you're committing to something you will predictably regret later, whichever way the coin lands (because you'll want to switch), which seems kind of irrational. It looks like a violation of the sure-thing principle. Plus, either way, you'll fail to follow your own preferences later, and that will seem irrational then. Russell and Isaacs (2021) and Gustafsson (2022) make similar arguments against resolute choice strategies.

I'm more sympathetic to acausal trade with other beings that could simultaneously exist with you (even if you don't know ahead of time whether you'll find bednets or deworming better in expectation), if and because you'll expect the world to be better off for it at every step: ahead of time, just before you follow through and after you follow through. There's no expected regret. And in an infinite multiverse (or given a non-negligible chance of one), we should expect such counterparts to exist, so we plausibly should do the acausal trade.

Also, I think you'd want to commit ahead of time to a more flexible policy for switching that depends on the specific evidence you'll gather.[1]

Now if you only think about it later, having studied bednets, I'm imagining that you think "well I would have wanted to commit earlier, but now that I know about how good bednets are I think deworming is better in expectation, so I'm glad I didn't commit". Is that right? (I prefer to act as though I'd made the commitment I predictably would have wanted to make.)

Ya, that seems mostly right on first intuition.

However, acausal trade with counterparts in a multiverse still seems kind of compelling.

Also, I see some other appeal in favour of committing ahead of time to stick with whatever you study (and generally making the commitment earlier, too, contra what I say above in this comment): you know there's evidence you could have gathered that would tell you not to switch, because you know you would have changed your mind if you had gathered it, even if you won't gather it anymore. Your knowledge of the existence of this evidence is itself evidence that supports not switching, even if you don't know the specifics. It seems like you shouldn't ignore that. Maybe it doesn't go all the way to support committing to sticking with your current expertise, because you can favour the more specific evidence you actually have, but maybe you should update hard enough on it?

This seems like it could avoid both the ex ante and ex post regret so far. But, still you either:

  1. can't be an EU maximizer, and so you'll be vulnerable to money pump arguments anyway or abandon completeness and often be silent on what to do (e.g. multi-utility representations), or
  2. have to unjustifiably fix a single scale and prior over it ahead of time.


The same could apply to humans vs aliens. Even if we're not behind the veil of ignorance now and never were, there's information that we'd be ignoring: what real or hypothetical aliens would believe and the real or hypothetical existence of evidence that supports their stance.

But, it's also really weird to consider the stances of hypothetical aliens. It's also weird in a different way if you imagine finding out what it's like to be a chicken and suffer like a chicken.

  1. ^

    Suppose you're justifiably sure that each intervention is at least not net negative (whether or not you have a single scale and prior). But then you find out bednets have no (or tiny) impact. I think it would be reasonable to switch to deworming at some cost. Deworming could be less effective than you thought ahead of time, but no impact is as bad as it gets given your credences ahead of time.

I basically agree with all of this, and make some similar points in my sections Multiple possible reference points and Conscious subsystems. I think there are still two envelopes problems between the things we actually access, and we don't have a nice way of uniquely fixing comparisons. But I think it's defensible to do everything human-relative, or relative to your own experiences (which are human, so this is still human-relative), i.e. relative to what's accessed. You'll need to use multiple reference points.

(Replying back at the initial comment to reduce thread depth and in case this is a more important response for people to see.)

I understand that you're explaining why you don't really think it's well modelled as a two-envelope problem, but I'm not sure whether you're biting the bullet that you're predictably paying some utility in unnecessary ways (in this admittedly convoluted hypothetical), or if you don't think there's a bullet there to bite, or something else?

Sorry, yes, I realized I missed this bit (EDIT: and which was the main bit...). I guess then I would say your options are:

  1. Bite the bullet (and do moral trade).
  2. Entertain both the human-relative stance and the alien-relative stance even after finding out which you are,[1] say due to epistemic modesty. I assume these stances won't be comparable on a common scale, at least not without very arbitrary assumptions, so you'd use some other approach to moral uncertainty.
  3. Make some very arbitrary assumptions to make the problem go away.

I think 1 and 2 are both decent and defensible positions. I don't think the bullet to bite in 1 is really much of a bullet at all.

From your top-level comment:

Then it's revealed which you are, you remember all your experiences and can reason about how big a deal they are — and then you will predictably pay some utility in order to benefit the other species more. It similarly looks like it's a mistake to predictably have this behaviour (in the sense that, if humans and aliens are equally likely to be put in this kind of construed situation, then the world would be predictably better off if nobody had this behaviour), and I don't really feel like you've addressed this.

The aliens and humans just disagree about what's best, and could coordinate (moral trade) to avoid both incurring unnecessary costs from relatively prioritizing each other. They have different epistemic states and/or preferences, including moral preferences/intuitions. Your thought experiment decides what evidence different individuals will gather (at least on my bullet-biting interpretation). You end up with similar problems generally if you decide behind a veil of ignorance what evidence different individuals are going to gather (e.g. fix some facts about the world and decide ahead of time who will discover which ones) and epistemic states they'd end up in. Even if they start from the same prior.

Maybe one individual comes to believe bednets are the best for helping humans, while someone else comes to believe deworming is. If the bednetter somehow ends up with deworming pills, they'll want to sell them to buy bednets. If the dewormer ends up with bednets, they'll want to sell them to buy deworming pills. They could both do this at deadweight loss in terms of pills delivered, bednets delivered, cash and/or total utility. Instead, they could just directly trade with each other, or coordinate and agree to just deliver what they have directly or to the appropriate third party.

EDIT: Now, you might say they can just share evidence and then converge in beliefs. That seems fair for the dewormer and bednetter, but it's not currently possible for me to fully explain the human experience of suffering to an alien, or to give an alien access to that experience. If and when that does become possible, we'd be able to agree much more.

Another illustration: suppose you don't know whether you'll prefer apples or oranges. You try both. From then on, you're going to predictably pay more for one than the other. Some other people will do the opposite. Whenever an apple-preferrer ends up with an orange for whatever reason, they would be inclined to trade it away to get an apple. Symmetrically for the orange-preferrer. They might both do so together at deadweight loss and benefit from directly trading with each other.

This doesn't seem like much of a bullet to bite.

  1. ^

    Or your best approximations of each, given you'll only have direct access to one.

This all seems right to me.

(I wouldn't pick out the worldview bucket approach as the solution everyone should necessarily find most satisfying, given their own intuitions/preferences, but it is one I tend to prefer now.)

Alternatively, you might assume you actually already are a human, alien or chicken, have (and remember) experience with suffering as one of them, but are uncertain about which you in fact are. For illustration, let's suppose human or alien. Because you're uncertain about whether you're an alien or human, your concept of suffering points to one that will turn out to be human suffering with some probability, p, and alien suffering with the rest of the probability, 1-p. You ground value relative to your own concept of suffering, which could turn out to be (or be revised to) the human concept or the alien concept with respective probabilities.

Let H_H be the moral weight of human suffering according to a human concept of suffering, directly valued, and A_H be the moral weight of alien suffering according to a human concept of suffering, indirectly valued. Similarly, let A_A and H_A be the moral weights of alien suffering and human suffering according to the alien concept of suffering. A human would fix H_H, build a probability distribution for A_H relative to H_H and evaluate A_H in terms of it. An alien would fix A_A, build a probability distribution for H_A relative to A_A and evaluate H_A in terms of it.

You're uncertain about whether you're an alien or human. Still, you directly value your direct experiences. Assume A_A and H_H specifically represent the moral value of an experience of suffering you've actually had,[1] e.g. the moral value of a toe stub, and you're doing ethics relative to your toe stubs as the reference point. You therefore set A_A = H_H. You can think of this as a unit conversion, e.g. 1 unit of alien toe stub-relative suffering = 10 units of human toe stub-relative suffering.

This solves the two envelopes problem. You can either use A_A or H_H to set your common scale, and the answer will be the same either way, because you've fixed the ratio between them. The moral value of a human toe stub, H, will be H_H with probability p, and H_A with probability 1-p. The moral weight of an alien toe stub, A, will be A_H with probability p and A_A with probability 1-p. You can just take expected values in either the alien or human units and compare.
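To make the bookkeeping concrete, here is a minimal sketch of that calculation. The function name, the probability p = 1/2 and the two-point distributions over A_H/H_H and H_A/A_A are all made-up illustrations, not numbers from the argument above. The only assumption carried over is that A_A and H_H are set equal to one common unit.

```python
from fractions import Fraction


def expected_weights(p, a_h_dist, h_a_dist, unit=Fraction(1)):
    """Expected moral weight of a human and an alien toe stub on one scale.

    p        -- probability that you are human (hypothetical input)
    a_h_dist -- list of (probability, A_H / H_H) pairs under the human concept
    h_a_dist -- list of (probability, H_A / A_A) pairs under the alien concept
    unit     -- the common value assigned to H_H = A_A (the fixed reference)
    """
    expected_a_h = sum(q * ratio for q, ratio in a_h_dist) * unit
    expected_h_a = sum(q * ratio for q, ratio in h_a_dist) * unit
    # H is H_H with probability p and H_A with probability 1 - p.
    expected_h = p * unit + (1 - p) * expected_h_a
    # A is A_H with probability p and A_A with probability 1 - p.
    expected_a = p * expected_a_h + (1 - p) * unit
    return expected_h, expected_a


# Illustrative only: the other species' toe stub is 1/10 or 10 times yours.
dist = [(Fraction(1, 2), Fraction(1, 10)), (Fraction(1, 2), Fraction(10))]
h, a = expected_weights(Fraction(1, 2), dist, dist)
h10, a10 = expected_weights(Fraction(1, 2), dist, dist, unit=Fraction(10))
```

Changing `unit` rescales both expected values by the same factor, so the comparison between H and A comes out the same whichever unit you express it in.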

We could also allow you to have some probability of being a chicken under this thought experiment. Then you could set A_A = H_H = C_C, with C_C representing the value of a chicken toe stub to a chicken, and C_A, C_H, A_C and H_C defined like above.

But if you're actually a chicken, then you're valuing human and alien welfare as a chicken, which is presumably not much, since chickens are very partial (unless you idealize). Also, if you're a human, it's hard to imagine being uncertain about whether you're a chicken. There's way too much information you need to screen off from consideration, like your capacities for reasoning and language and everything that follows from these. And if you're a chicken, you couldn't imagine yourself as a human or being impartial at all.

So, maybe this doesn't make sense, or we have to imagine some hypothetically cognitively enhanced chicken or an intelligent being who suffers like a chicken. You could also idealize chickens to be impartial and actually care about humans, but then you're definitely forcing them into a different normative stance than the ones chickens actually take (if any).

  1. ^

    It would have to be something "common" to the beings under consideration, or you'd have to screen off information about who does and doesn't have access to it or use of that information, because otherwise you'd be able to rule out some possibilities for what kind of being you are. This will look less reasonable with more types of beings under consideration, in case there's nothing "common" to all of them. For example, not all moral patients have toes to stub.

Similarly in this case we could set up an (admittedly construed) situation where you start by doing a bunch of reasoning about what's best, under a veil of ignorance about whether you're human or alien. Then it's revealed which you are, you remember all your experiences and can reason about how big a deal they are — and then you will predictably pay some utility in order to benefit the other species more.

In this case, assuming you have no first-person experience with suffering to value directly (or memory of it), you would develop your concept of suffering third-personally (based on observations of and hypotheses about humans, aliens, chickens and others, say) and could base your ethics on that concept. This is not how humans or the aliens would typically understand and value suffering, which is largely first-personally. The human has their own vague revisable placeholder concept of suffering on which they ground value, and the alien has their own (and the chicken might have their own). Each also differs from the hypothetical third-personal concept.

Technically, we could say the humans and aliens have developed different ethical theories from each other, even if everyone's a classical utilitarian, say, because they're picking out different concepts of suffering on which to ground value.[1] And your third-personal account would give a different ethical theory from each, too. All three (human, alien, third-personal) ethical theories could converge under full information, though, if the concepts of suffering would converge under full information (and if everything else would converge).[2]

With the third-personal concept, I doubt there'd be a good solution to this two envelopes problem that actually gives you exactly one common moral scale and corresponding prior when you have enough uncertainty about the nature of suffering. You could come up with such a scale and prior, but you'd have to fix something pretty arbitrarily to do so. Instead, I think the thing to do is to assign credences across multiple scales (and corresponding priors) and use an approach to moral uncertainty that doesn't depend on comparisons between them. (EDIT: And these could be the alien stance and human stance which relatively prioritize the other and result in a two envelopes problem.) But what I'll say below applies even if you use a single common scale and prior.

When you have first-person experience with suffering, you can narrow down the common moral scales under consideration to ones based on your own experience. This would also have implications for your credences compared to the hypothetical third-person perspective.

If you started from no experience of suffering and then became a human, alien or chicken and experienced suffering as one of them, you could then rule out a bunch of scales (and corresponding priors). This would also result in big updates from your prior(s). You'd end up in a human-relative, alien-relative or chicken-relative account (or multiple such accounts, but for one species only).

  1. ^

    A typical chicken very probably couldn't be a classical utilitarian.

  2. ^

    A typical chicken's concept of suffering wouldn't converge, but we could capture/explain it. Their apparent normative stances wouldn't converge either, unless you imagine radically different beings.

Is the two-envelope problem, as you understand it, a problem for anything except expectational utilitarianism?

I think it is or would have been a problem for basically any normative stance (moral theory + attitudes towards risk, etc.) that is at all sensitive to risk/uncertainty and stakes roughly according to expected value.[1]

I think I've given a general solution here to the two envelopes problem for moral weights (between moral patients) when you fix your normative stance but have remaining empirical/descriptive uncertainty about the moral weights of beings conditional on that stance. It can be adapted to different normative stances, but I illustrated it with versions of expectational utilitarianism. (EDIT: And I'm arguing that a lot of the relevant uncertainty actually is just empirical, not normative, more than some have assumed.)

For two envelopes problems between normative stances, I'm usually skeptical of intertheoretic comparisons, so would mostly recommend approaches that don't depend on them.

  1. ^

    (Footnote added in an edit of this comment.)

    For example, I think there's no two envelopes problem for someone who maximizes the median value, because the reciprocal of the median is the median of the reciprocal.

    But I'd take it to be a problem for anyone who roughly maximizes an expected value or counts higher expected value in favour of an act, e.g. does so with constraints, or after discounting small probabilities. They don't have to be utilitarian or aggregate welfare at all, either.
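As a small illustration of the median point in this footnote, here is a sketch with made-up chicken-to-human weight ratios (the five values are hypothetical, each treated as equally likely). I use an odd number of values because `statistics.median` averages the two middle values for even-length data, which would break the exact reciprocal relationship.

```python
from fractions import Fraction
from statistics import median

# Hypothetical, equally likely values for the ratio chicken weight / human weight.
ratios = [Fraction(1, 100), Fraction(1, 2), Fraction(2), Fraction(5), Fraction(50)]
# The same possibilities expressed as human weight / chicken weight.
reciprocals = [1 / r for r in ratios]

median_ratio = median(ratios)
median_reciprocal = median(reciprocals)

mean_ratio = sum(ratios) / len(ratios)
mean_reciprocal = sum(reciprocals) / len(reciprocals)

# The medians are exact reciprocals, so the choice of unit doesn't matter.
medians_agree = median_ratio * median_reciprocal == 1
# The means both exceed 1: each species looks weightier in the other's units.
means_conflict = mean_ratio > 1 and mean_reciprocal > 1
```

With these numbers the median gives the same ranking in either unit, while the expected value favours chickens in human units and humans in chicken units, which is the two envelopes pattern.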

I agree that this approach, if you're something like a (risk neutral) expectational utilitarian, is very vulnerable to fanaticism / muggings, but that to me is a problem for expectational utilitarianism. To you and "to someone who objected to this approach because it seemed to give a fanatical weight to chickens in the human/chicken comparison", I'd say to put more weight on normative stances that are less fanatical than expectational utilitarianism.

I personally reserve substantial skepticism of expected value maximization in general (both within moral stances and for handling moral uncertainty between them), expected value maximization with unbounded value specifically, aggregation in general and aggregation by summation. I'd probably end up with "worldview buckets" based on different attitudes towards risk/uncertainty, aggregation and grounds for moral value (types of welfare, non-welfarist values, as in the problem of multiple (human) reference points). RP's CURVE sequence goes over attitudes to risk and their implications for intervention and cause prioritization. Then, I doubt these stances would be intertheoretically comparable. For uncertainty between them, I'd use an approach to moral uncertainty that didn't depend on intertheoretic comparisons, like a moral parliament, a bargain-theoretic approach, variance voting or just sizing worldview buckets proportionally to credences.

In practice, within a neartermist focus (and ignoring artificial consciousness), this could conceivably roughly end up looking like a set of resource buckets: a human-centric bucket, a bucket for mammals and birds, a bucket for all vertebrates, a bucket for all vertebrates + sufficiently sophisticated invertebrates, a bucket for all animals, and a ~panpsychist bucket.[1] However, the boundaries between these buckets would be soft (and softer than that list suggests), because the actual buckets don't specifically track a human-centric view, a vertebrate view, etc. My approach would also inform how to size the buckets and limit risky interventions within them.

For example, fix some normative stance, and suppose within it:

  1. you thought a typical chicken had a 1% chance of having roughly the same moral weight (per year) as a typical human (according to specific moral grounds), and didn't matter at all otherwise.
  2. you aggregate via summation.
  3. you thought helping chickens (much) at all would be too fanatical.

Then that view would also recommend against human-helping interventions with at most a 1% probability of success.[2] Or, you could include some chicken interventions with many more risky, roughly statistically independent human-helping interventions, because many independent risky (positive expected value) bets together don't look as risky. Still, this stance shouldn't bet everything on an intervention helping humans with only a 1% chance of success, because otherwise it could just bet everything on chickens with a similar payoff distribution. This stance would limit risky bets. Every stance could limit risky bets, but the ones that end up human-centric in practice would tend to do so more than others.
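A rough sketch of why many independent risky bets together don't look as risky, assuming (hypothetically) that every bet has the same 1% chance of success and that the bets are fully independent. Neither assumption will hold exactly in practice, and the function names are mine.

```python
def prob_no_success(p_success: float, n_bets: int) -> float:
    """Probability that none of n independent bets succeeds."""
    return (1.0 - p_success) ** n_bets


def prob_at_least_one(p_success: float, n_bets: int) -> float:
    """Probability that at least one of n independent bets succeeds."""
    return 1.0 - prob_no_success(p_success, n_bets)


# A single 1% bet almost certainly makes no difference.
single_bet_failure = prob_no_success(0.01, 1)
# A portfolio of 300 such bets is unlikely to achieve nothing at all.
portfolio_failure = prob_no_success(0.01, 300)
```

Betting everything on one 1% intervention, whether for humans or chickens, leaves you at the `single_bet_failure` end. Spreading across many independent bets moves you towards the `portfolio_failure` end.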

  1. ^

    Or, maybe some of the later buckets are just replaced with longtermist buckets, if and because longtermist bets could have similar probabilities of making a difference, but better payoffs when they succeed.

  2. ^

    Depending on the nature of your attitudes to risk. This could follow from difference-making risk aversion or probability difference discounting of some kind. On the other hand, if you maximized the expected utility of the arctan of total welfare, a bounded function, then you'd prioritize marginal local improvements to worlds with small populations and switching between big and small populations, while ignoring marginal local improvements to worlds with large populations. This could also mean ignoring chickens but not marginal local improvements for humans, because if chickens don't count and we go extinct soon (or future people don't count), then the population is much smaller.
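To illustrate the arctan example in the footnote above, here is a minimal sketch. The welfare totals and the size of the improvement are arbitrary units I've picked for illustration. The point is only that the same local improvement adds far less arctan-utility when total welfare is already large in magnitude.

```python
import math


def marginal_gain(total_welfare: float, improvement: float = 1.0) -> float:
    """Increase in arctan(total welfare) from a fixed local improvement."""
    return math.atan(total_welfare + improvement) - math.atan(total_welfare)


# Hypothetical totals: a small-population world and a large-population world.
small_world_gain = marginal_gain(10.0)
large_world_gain = marginal_gain(10_000.0)
```

The gain in the small world is orders of magnitude larger than in the large one, so a view like this would mostly ignore marginal improvements in worlds it takes to be large.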

Hi Emily. I've written a post about how to handle moral uncertainty about moral weights across animals, including humans: Solution to the two envelopes problem for moral weights. It responds directly to Holden's writing on the topic. In short, I think Open Phil should evaluate opportunities for helping humans and nonhumans relative to human moral weights, like comparison method A from Holden's post. This is because we directly value that with which we're familiar, e.g. our own experiences, and we have just regular empirical (not moral) uncertainty about its nature and whether and to what extent other animals have similar relevant capacities.

I suppose my main objection to the collective decision-making response to the no difference argument is that there doesn't seem to be any sufficiently well-motivated and theoretically satisfying way of taking into account the collective probability of making a difference, especially as something like a welfarist consequentialist, e.g. a utilitarian. (It could still be the case that probability difference discounting is wrong, but it wouldn't be because we should take collective probabilities into account.)

Why should I care about this collective probability, rather than the probability differences between outcomes of the choices actually available to me? As a welfarist consequentialist, I would compare probability distributions over population welfare values[1] for each of my options, and use rules that rank them pairwise, or within subsets of options, or select maximal options from subsets of options. Collective probabilities don't seem to be intrinsically important here. They wouldn't be part of the descriptions of the probability distributions of population welfare values for options actually available to you or calculable from them, if and because you don't have enough actual influence over what others will do. You'd need to consider the probability distributions of unavailable options to calculate them. Taking those distributions into account in a way that would properly address collective action problems seems to require non-welfarist reasons. This seems incompatible with act utilitarianism.

The differences in probabilities between options, on the other hand, do seem potentially important, if the probabilities matter at all. They're calculable from the probability measures over welfare for the options available to you.

Two possibilities Kosonen discusses that seem compatible with utilitarianism are rule utilitarianism and evidential decision theory. Here are my takes on them:

  1. For rule utilitarianism, you could assume you should follow universalizable procedures that lead to the best outcomes (or best distributions of outcomes) if everyone (had the same beliefs and) followed them (e.g. Greaves et al., 2022 (section 4.5), Kant (his first formulation of the categorical imperative), Parfit and others). Pairwise probability difference discounting without collective thresholds leads to collective defeat, so it would be ruled out.
    1. I'm not really convinced that collective defeat under these unrealistic universalizations is a big deal at all. I don't get to choose what procedures everyone else follows (or what they believe), so why should I care about that?
    2. I expect that changing my preferences or procedures to satisfy this constraint will lead to worse outcomes according to my current preferences. That seems self-defeating in a way, too.
    3. Mendola (1986, p.158, and responses to objections to his argument in the rest of the paper) argues that this is too strong as a formal constraint, based on aliens or machines that harshly punish you and ensure your actions are net negative for following your theory, because even consequentialists couldn't meet it. That being said, I'm not entirely convinced by this. Maybe you just need to follow the right decision theory. Or, we could still think avoiding collective defeat under universalization counts in favour of some procedures over others, and we should try to (more approximately) satisfy it in more situations than fewer.
    4. You'd probably need to be only boundedly sensitive to stakes, at least in many conceivable decision situations, e.g. my post and Pruss, 2022. Depending on how exactly you are sensitive to the stakes, this could undermine longtermism.
  2. Evidential decision theory and other acausal decision theories seem pretty plausible to me, but I'm very skeptical they make enough difference unless you take into account correlations with agents across a large (e.g. infinite) multiverse. I do in fact think it's pretty likely that we live in the right kind of multiverse, so longtermism seems pretty plausible (or at least worthy of being a big part of my portfolio) even with probability difference discounting, though!


  1. ^

    Or aggregate welfare, or pairwise differences in individual welfare between outcomes.
