MSJ

Michael St Jules 🔸

Animal welfare grantmaking and advising
12791 karma · Joined · Working (6–15 years) · Vancouver, BC, Canada

Bio

Philosophy, global priorities and animal welfare research. My current specific interests include: philosophy of mind, moral weights, person-affecting views, preference-based views and subjectivism, moral uncertainty, decision theory, deep uncertainty/cluelessness and backfire risks, s-risks, and indirect effects on wild animals.

I've also done economic modelling for some animal welfare issues.

Want to leave anonymous feedback for me, positive, constructive or negative? https://www.admonymous.co/michael-st-jules

Sequences (3)

Radical empathy
Human impacts on animals
Welfare and moral weights

Comments (2639)

Topic contributions (15)

It depends on the case. Do you think my answer to the above should influence which interventions I prioritise? My current top recommendations are research on i) the welfare of soil animals and microorganisms, and ii) comparisons of (expected hedonistic) welfare across species and digital systems. Could you see these changing if I thought EVs were imprecise instead of precise at a fundamental level?

 

I think there's a lot that could change if you very seriously weighed others' actual or possible direct impressions/intuitions without heavily privileging your own, before we even get into the question of precise vs imprecise credences. Epistemic modesty is going to do a lot of work first.

  1. Holding your current normative views ~constant, with precise credences, then epistemic modesty would make infinite expected values (and possibly cardinally larger infinities) your focus, as long as there are well-defined consistent ways to handle them without always getting infinity minus infinity errors in practice. With imprecise credences, you could plausibly justify ignoring them on some versions of bracketing (also see here), say because they're so speculative and you're clueless about the direction of your impacts on infinities, including possibly even the effects of research into infinite effects (because the research could be used in ways you'd judge to be very bad).
  2. (Independently of precise vs imprecise) If you're a moral realist, then you wouldn't privilege your own direct normative intuitions just for being yours either, and this would plausibly mean not privileging consequentialism, utilitarianism, hedonism, risk neutrality, and so on. This could have important implications. Your current priorities might still be among your top priorities, but your list of priorities could expand a lot.
    1. It might be impossible to compare these priorities; there's no universal common standard/unit across all normative stances. You might go for a portfolio of interventions.
    2. If you're not a moral realist, or for the part of you that isn't, you can just not care about views that conflict too much with your most important intuitions.
  3. If you're doing some version of bracketing with imprecise credences, some vertebrate welfare work could be worth prioritizing. I'm clueless about whether crops or nature is better for wild animals, even though I'm suffering-focused, so I ignore conversions between nature and crops. Far future effects and acausal influence could guide some priorities unless you're clueless about them and bracket them away.
    1. Again, potentially impossible comparisons + portfolio.
  4. With imprecise credences, I think you would also be more pessimistic about the marginal value of research to compare welfare ranges and sentience across types of possible moral patients. You should also be more pessimistic about the value of further research into the sign of the welfare of moral patients. That doesn't mean no such research is worth doing, but I think it would focus on scoping out possibilities and their implications and gathering evidence that could basically rule out the more extreme hypotheses (e.g. for (near-)constant welfare ranges and for welfare ranges with the most extreme ratios between potential moral patients). I have in mind arguments like the two envelopes problem, conscious subsystems, how moral weights could scale with neuron counts, gradations/vagueness, and looking for more ways to assign welfare ranges with very different implications from the ones we have now. If you're gathering empirical evidence, you would aim it at shifting or ruling out extremes.
    1. Personally, I've decided to draw some lines in practice, and basically leave out nematodes and simpler systems as priorities. This depends largely on my normative views (and I'm not a moral realist, so I'm more willing to make some judgement calls about this). I think what counts as consciousness is largely normative and subjective, I have some objections to aggregation (e.g. torture vs dust specks) and I'm not entirely risk neutral or ambiguity neutral. The capacities I've observed in them don't seem so compelling. Maybe some of it is motivated reasoning, though. And maybe some sentience research on nematodes would be worth doing. If they met some of the standards here or here or we found evidence for some of the most sophisticated cognitive capacities we observe in fruit flies, I might take them pretty seriously.

I think you've simplified the problem too much. There can be special cases where we can use symmetry and just take simple averages, but many practical cases are not like that. Indeed, that's the point of the distinction between complex and simple cluelessness in the first place.

I think, ideally, we should look for and exploit as much evidential symmetry as possible, but I don't think we'll always find enough of it to land on a unique precise distribution; I'd guess that's impossible in principle in many cases (probably almost all cases of intervention and cause area research) without further evidence.

 

It's true that direct impressions (e.g. internal states about the plausibility of the probabilities) could be considered evidence, but to the extent that for the same objective external evidence, these direct impressions can vary between people or depending on how or when you present the evidence, they seem arbitrary.

Would you say that a direct impression coming from your brain — from an inscrutable process, prone to cognitive biases of various kinds, and whose reliability you can at best verify by track records in limited domains where feedback is practical, and where track records may not generalize well across tasks and domains — is better evidence than a direct impression from another person's brain (with similar problems), with access to the same objective external evidence?

Or, what if there are multiple people with different distributions and different track records in relevant domains? How do you weigh them? How much should track record be worth? EDIT: What if their track records are measured in different ways, e.g. you have forecasters with Brier scores, investors or bettors with measures of their gains and losses, and researchers and grantmakers of various seniorities at different organizations?

And what's the range of direct impressions humans or other semi-rational agents could have, and how would you weigh them all?

 

I'd also be keen to get your response to this (and also this, if you have the time).

Do you think it's reasonable for two people with all of the same evidence to disagree on precise probabilities and expected values? If so, how would you justify picking your own precise probabilities over someone else's, if you think theirs are just as defensible? 

Or would you just average yours and theirs in some way to get a new distribution? How?
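One concrete way to "average yours and theirs" is linear opinion pooling: a weighted average of the two distributions. Here's a minimal sketch with made-up probabilities; note that the pooling weight `w` is itself another seemingly arbitrary choice, which is part of the point:

```python
# Minimal sketch of linear opinion pooling: combine two people's
# probability distributions over the same outcomes with weights w and 1 - w.
# The outcome labels and probabilities are hypothetical.

def pool(p, q, w=0.5):
    """Weighted average of two distributions given as dicts outcome -> prob."""
    return {x: w * p[x] + (1 - w) * q[x] for x in p}

mine   = {"good": 0.7, "bad": 0.3}
theirs = {"good": 0.4, "bad": 0.6}

pooled = pool(mine, theirs, w=0.5)
print(pooled)  # 'good' pools to ~0.55, 'bad' to ~0.45
```

Geometric pooling (averaging in log space and renormalizing) is another option, and it gives different answers, which again raises the question of which rule to use and why.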

 

And how far would you go, if you consider all the defensible precise probability distributions anyone could assign (whether or not anyone actually does so)? How do you weigh them all if there are infinitely many of them and no uniform distribution over them?

 

Here's another example I like.

How would you choose the distributions for the model weights in a way that's not itself arbitrary? E.g. how do you choose their forms and parameters in a way that's not arbitrary?

I do think imprecise credences have a similar problem of deciding which distributions to include in their representor. I think ultimately we need to make some arbitrary choices and should accept some, but we can be more or less arbitrary, or stop when it's no longer decision-relevant. Maybe sometimes we can hit a fixed point or see some kind of convergence in the extra steps we're taking.
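For concreteness, here's a minimal sketch (with a made-up gamble and a made-up three-element representor) of how imprecise credences leave expected value interval-valued rather than precise — and where the arbitrariness of which distributions to include shows up:

```python
# Illustrative sketch: with imprecise credences, beliefs are a *set* of
# distributions (a representor) rather than one, and expected value becomes
# a range rather than a number. The representor and payoffs are made up.

payoffs = {"win": 100.0, "lose": -50.0}

representor = [
    {"win": 0.2, "lose": 0.8},
    {"win": 0.5, "lose": 0.5},
    {"win": 0.8, "lose": 0.2},
]

def ev(dist):
    """Expected value of the gamble under one distribution."""
    return sum(p * payoffs[x] for x, p in dist.items())

evs = [ev(d) for d in representor]
print(min(evs), max(evs))  # EV is only pinned down to roughly [-20, 70]
```

Adding or dropping a distribution from `representor` changes the interval directly, which is the "which distributions go in the representor" problem in miniature.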

 

On there potentially being no fact of the matter, this may be helpful. It goes further than the issue of imprecise credences/EVs.

On the nematode example, it could go further than that: we might assign an imprecise credence between X and 100% to a set of standards for sentience that nematodes don't meet (see my other post on gradations of moral weight). So, the ratio could be anywhere between 0 and 1 (assuming we're taking the absolute value, or only consider same-sign valence).

If the ratio is anywhere between 0 and 1, then whenever we're looking at affecting nematode-seconds relative to their welfare ranges more than human-seconds relative to our welfare ranges, it would be indeterminate which is affected more. I think that would be the case every time in practice.
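To make the indeterminacy concrete: suppose the (absolute) nematode:human welfare-range ratio is only constrained to lie in [0, 1], and an intervention affects far more nematode-seconds than human-seconds (the numbers below are hypothetical). The sign of the net effect flips within the interval, so the comparison is indeterminate:

```python
# Sketch of the indeterminacy claim: if the nematode:human welfare-range
# ratio is only known to lie in [0, 1], whether an intervention that trades
# human-seconds for nematode-seconds is net positive depends on where in
# that interval the true ratio falls. Numbers are hypothetical.

nematode_seconds = 1e9  # welfare-range-relative nematode-seconds affected
human_seconds = 1e3     # welfare-range-relative human-seconds affected

def net_effect(ratio):
    """Net effect in human-equivalent units, given the ratio."""
    return nematode_seconds * ratio - human_seconds

print(net_effect(0.0))   # negative: the human effect dominates at the low end
print(net_effect(1e-5))  # positive: nematodes dominate even at a tiny ratio
```

Since both signs occur for ratios inside [0, 1], no determinate "which is affected more" verdict falls out.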

If we don't need to deal with gradations/vagueness like this, then I would probably assign expected welfare ranges (conditional on sentience) between constant and roughly proportional to the number of neurons, and this could give many more practically useful comparisons. EDIT: although conscious subsystems make me more inclined towards approximately proportional, if we're entertaining nematode sentience.
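The "between constant and roughly proportional to the number of neurons" range can be sketched with a scaling exponent s in [0, 1]: s = 0 gives constant welfare ranges across species, s = 1 gives proportionality to neuron count. The neuron counts below are rough published ballparks, used purely for illustration:

```python
# Hedged sketch of welfare-range scaling with neuron counts: model the
# welfare-range ratio between two species as (n_small / n_big) ** s for a
# scaling exponent s between 0 (constant) and 1 (proportional to neurons).
# Neuron counts are rough ballparks, for illustration only.

human_neurons = 86e9   # roughly 86 billion
fly_neurons   = 1.4e5  # roughly 140,000 (fruit fly, order of magnitude)

def ratio(n_small, n_big, s):
    """Welfare-range ratio small:big under scaling exponent s."""
    return (n_small / n_big) ** s

print(ratio(fly_neurons, human_neurons, s=0))  # 1.0: constant welfare ranges
print(ratio(fly_neurons, human_neurons, s=1))  # ~1.6e-6: proportional
```

With s imprecise over [0, 1], the implied ratio spans about six orders of magnitude even in this two-species toy case, which is why pinning it down matters so much for comparisons.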

Hi Vasco, thanks for the comment and sorry for not seeing this and responding earlier.

I agree that the weights/coefficients in the model could end up quite arbitrary, and I would expect them to if someone tried to set them precisely. My sense is that:

  1. We may be able to give some arguments for some bounds on the weights, and some structural constraints on how the weights relate to each other (e.g. it would be odd if a function of some measure of complexity like neuron counts looked very jagged, but this may be more supported by aesthetics or simplicity than evidence).
  2. Within these constraints, the choices are very subjective and highly arbitrary. I think the situation is even worse than with gravity, because there may be no way to gather evidence one way or the other, and there may be no fact of the matter at all.

Mood et al., 2023 had an earlier estimate:

The FAO (2022c) describes a system in which 4,500 hatchling carp are concurrently stocked to feed each mandarin fish. If this is typical then mandarin farming overall consumes an estimated 3,000 billion feed fishes, i.e. 3.0 × 10¹², based on an estimated 674 million mandarin fish (Table 3).

 

From the FAO source (archived):

According to field practice, live foods should be at a density of 3 800–4 500 fish per m². Moreover, the size of the live food should be well controlled to keep pace with the growth rate of the cultured fish. In order to provide an appropriate quantity of live food, provision procedures are suggested as: 1 to 4 days after pond fertilization, silver carp and bighead carp hatchlings are stocked as live foods at 1 500 per m²; one week later, the same quantity is again stocked; and again an additional week later, stocking of the same number is performed. The third stocking should be accompanied by the stocking of mandarin fish of size 3–4 cm at an average rate of 1 fish per m².

So 1 mandarin fish per m² vs 3 800–4 500 feed fish per m²?
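The arithmetic behind Mood et al.'s ~3,000 billion estimate can be checked directly from the figures quoted above:

```python
# Back-of-envelope check: three stockings of 1,500 feed-fish hatchlings per
# m², with 1 mandarin fish stocked per m², gives ~4,500 feed fish per
# mandarin fish; multiplied by an estimated 674 million mandarin fish.

feed_per_mandarin = 3 * 1500      # 4,500 feed fish per mandarin fish
mandarin_fish = 674e6             # estimated farmed mandarin fish

total_feed_fish = feed_per_mandarin * mandarin_fish
print(f"{total_feed_fish:.2e}")   # 3.03e+12, i.e. ~3,000 billion
```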

Why do you think it was naive instead of a good bet that happened to not work out?

Another two:

  1. Rethink Priorities, but could be hard to evaluate. You'd also have to restrict funding to the animal department.
  2. THL UK, which fundraises separately from THL and gets limited funding from THL.

I had Perplexity do a review and compare PETA's report with Welfare Footprint's research here, and also critique and review the report Perplexity generated here. Summary from the second link:

Where Models Agree

 

| Finding | Agreement | Evidence |
|---|---|---|
| The 80 m² EFSA error identification is correct | ✓✓✓ | EFSA presentation slide confirms "Minimum area: For group >30 birds: 80 m²" alongside "Max stocking density: 4 laying hens/m²" — a total enclosure area, not per-bird |
| PETA's own HPAI data does contradict its thesis | ✓✓✓ | PETA white paper footnote 84 states 60% caged / 40% cage-free culls with ~45% cage-free flock share, confirming disproportionate caged impact |
| KBF cherry-picking critique is well-supported | ✓✓✓ | Danish study confirms 86% overall KBF prevalence across all systems; 50–98% in enriched cages |
| WFI's Open Philanthropy funding concern is valid and accurately stated | ✓✓✓ | EA Forum confirms $980K+ as of July 2022; additional $1.25M contract in 2023 |
| WFI mortality meta-analysis publication in Nature is verified | ✓✓✓ | Published in Scientific Reports (Nature) covering 6,040 flocks across 16 countries |
| The report's overall assessment — PETA's paper is advocacy, WFI's is substantially more rigorous — is well-supported | ✓✓✓ | PETA's paper is not peer-reviewed, contains a verified factual error, and uses advocacy framing; WFI publishes parameters and invites sensitivity testing |