This is a Draft Amnesty Week draft. It may not be polished, up to my usual standards, fully thought through, or fully fact-checked. 

Commenting and feedback guidelines: 

This post had been sitting in my drafts for a while; it definitely wouldn't have been posted without Draft Amnesty. Sections II and III were written pretty hastily to get this out. But feel free to give any kind of feedback or ask further questions.


When you take it on yourself to become a good person, the kind of person who listens to the call of morality and takes it seriously—the reality can be overwhelming. I’m especially thinking of consequentialism, which asks us to consider how we could do the most good with our lives. This means a constant process of analysis and readjustment, an awareness that you will forever be falling short. Many other moral systems and religions have this feature, perpetually pushing their acolytes uphill. This push is forceful, and at some point, the question arises ‘when should I dig in my heels?’ Unfortunately, by its very nature, your morality won’t have an answer to this question. 

This is a very real question for people in the effective altruism community. If you take the (broadly consequentialist) philosophy of effective altruism seriously, you become aware that if you pushed yourself, you could help many, many people. If you pushed other values and dreams to the side, you might be able to climb even further up the hill. This moral hill is not one that crests in self-actualisation or a good life for yourself. It has no peak, but for every step, someone, some being, will be spared from suffering, or allowed to live where they otherwise would die. How can you justify slowing down because you are tired, because you want to build a family, or because you have talents you want to pursue which won’t help you advance? 

Julia Wise writes in a blog post that we can think of our goals as different buckets. She calls her moral bucket her ‘efficiency bucket’, and she puts a certain, sizable chunk of her time and resources into it. In her ‘personal satisfaction bucket’ she puts the money and time it takes to go for coffee with her friend, but also to donate to the friend’s sick uncle’s fundraiser. The bucket analogy has been useful to effective altruists; the metaphor is often used in conversation. But Wise’s post doesn’t exactly answer the pressing question. She titles the post ‘you have more than one goal and that’s okay’, but ends with the sentence ‘If you also have a goal of improving the world as much as you can, decide how much time and money you want to allocate to that goal, and try to use that as effectively as you can [my emphasis]’. Linger on ‘how much [...] you want to’. If (like Wise, who felt compelled to write this post) you feel your morality pushing you up the hill, what role does what you want have in how far you climb?

In her popular paper ‘Moral Saints’, the philosopher Susan Wolf argues that we shouldn’t keep climbing the hill, that we should restrain ourselves. She claims that failing to restrain ourselves would be irrational, undesirable, and perhaps even wrong. She does this by forcing us to look at the top of the hill, at the image of the moral saint, the person who has reached the crest. If we see where we are going and realise that when we get there, we won’t be the people we want to be, then perhaps we will see that morality shouldn’t be our only source of aspiration.

If we buy the idea that moral reasons call to be dominant over other reasons, then the moral saint is a valuable concept. When you are deciding what to do, and which reasons to follow, you are already playing a normative, and hence all too easily moral, game. If you care about morality, you’re fighting a losing battle if you try to resist moral reasons. As mentioned above, how can we hold what we want or merely prefer up against what is right or good? Other reasons pale. If that is true, the moral saint becomes an important concept: the moral saint shows where the lines of moral reasoning converge, and that image, if it is sufficiently undesirable, may give you reason to resist the pressure.

I- The Utilitarian Moral Saint

Perhaps Wolf’s best portrait of a moral saint is the utilitarian. Wolf’s critical picture follows the time-worn path of criticising the utilitarian for promoting an ideal that is intractably or repulsively inhuman. But unlike some other critiques, which push only a little and then declare victory over the utilitarian, Wolf (with the help of the moral saint concept) gets to the dialectical end zone.

UTILITARIAN VS CRITIC. DIALOGUE BASED ON WOLF

UTILITARIAN: The utilitarian would always do the best thing that they could possibly do, based on the information and resources they have access to. What could be wrong with that?

CRITIC: If they do that, they won’t seem human. They won’t be trustworthy, because you would know they may lie to you if the world would benefit from it, and they won’t keep promises when they can get more utility elsewhere. They couldn’t be part of friendships or romantic relationships, because they would only value them for their utility.

UTILITARIAN: Ah! But they will need to get on with people to do good work. They will act as if they value friendships and romantic relationships to a degree. 

CRITIC: To a degree? That isn’t enough. Humans value these things unreservedly; that is what it is to love, or to really value something.

UTILITARIAN: Alright. Suppose I take it to be true that to be human we must value unreservedly in this way, even when it competes with morality. Then I’d allow that the moral saint would do so too, and, to a perhaps lesser extent than the non-saint, would act on it, prioritising their relationships and any other unavoidable values. To resist those values would make the saint less effective at producing utility, so there is no contradiction here.

CRITIC: Maybe in our current world. But what if we develop technologies, like meditation methods or brain surgery, that could remove these values from the utilitarian saint? Shouldn’t they jump at the opportunity to have the surgery?

UTILITARIAN: Yes. You’re right that they should. Whether they would depends on how strong those non-moral values are.

I call this final point the dialectical end zone because most responses to this kind of critique of the utilitarian end at the previous stage. The utilitarian generally concedes to the critic that a real-life utilitarian would have to value things other than the impartial good. This is analogous to the point that Julia Wise’s post ends with, that it is okay to want other things (this is advice humans need to hear). But this dialogue shows that the utilitarian moral saint, the utilitarian who takes their morality most seriously, would end up with the imperative to rid themselves of their non-moral values.

If they couldn’t bring themselves to, those values would still be regrettable. Wolf thinks this is the most damning aspect of her critique, because to value something, she believes, you must value it unreservedly.

II- Non-Moral Values

From now on this is going to get a bit draft-ier. I don’t have time to reread the relevant papers, and these are the parts which have been left undrafted for months. I’m using draft-amnesty to free myself of these constraints. Bear that in mind. 

Why are unreserved valuations of non-moral goods so important? Why can’t we just value family, our hobbies, beauty in the world, etc., because they make us happy, and because that happiness is necessary for a life which lets us do the most good that we can?

One characteristically glib response is Bernard Williams’ “one thought too many” argument. He asks us to imagine a man who has to choose between saving his wife or a stranger from drowning. If asked to give a reason for why he saved his wife rather than the stranger, it wouldn’t be acceptable for him to say “because it would make me happier”, or “because I know she enjoys her life more than average”, or even “because I would feel guilty if she died”. The acceptable answer, to Williams, is something more like “because she is my wife, because she is her, because I love her”. To give another answer would be to give “one thought too many” to the decision.

A consequentialist moral saint must always be giving “one thought too many” to all of the things which they value apart from their moral ends. And even if they continue to value those things, they don’t truly value them, don’t truly love them, if that value always rests on a separate moral end and might be cut off, or devalued, whenever that end takes precedence.

I mention ‘love’ here because of the influence of Iris Murdoch, a philosopher who cares deeply about morality, but also incorporates many non-moral values into her picture of the world. She writes a lot about perception, about acts of perception which cut through our “fat relentless egos” and let us see with clarity. Some examples:

  • She writes a vignette about someone who is caught up in personal troubles and obsessions, ignoring the world around them. Then the person looks up and sees a kestrel hovering above. The beauty of the bird cuts through; they are taken up with it, and the churning of their ego is halted.
  • When we learn a language, we come into contact with something which is solid, and outside of ourselves. We can be wrong, we can come to know it, but it isn’t in us.
  • She asks us to imagine a mother-in-law, M, and a daughter-in-law, D. M thinks of D as not worthy of marrying her son, as simple and frivolous. Then, after D moves away, never to return, after the mother has no hope of seeing her again, she looks at the situation with more love, more clarity, and sees that, truthfully, D is a good person. Murdoch thinks that a valuable moral transformation has happened here, even if it is completely disconnected (by hypothesis) from any rectifying action that M could take.

These points might seem a bit unrelated. But they helped evoke, for me, the idea that there are values other than the impartial good, non-moral values; that there are times when we can perceive something as valuable for its own sake.

III- No Solution

Solving the problem of this post would mean telling us how to split our resources between our efficiency buckets (broadly, our consequentialism) and our other, non-moral values. But, as far as I can see, there is no grounded solution.

If we let all our values be subordinated to morality, then we cannot value them in a pure way. We will always have one thought too many, we won’t be able to perceive them with clarity. 

But how can we not allow our other values to be subordinated to morality? I can’t think of a principled way. 

Wolf’s own solution is a non-solution. She clearly struggles to find a satisfactory answer to the question “If we shouldn’t be a moral saint, then how far should we take morality?” She ends by advancing the idea that we should think non-hierarchically, i.e. not put one value over others. This is an idea which, if followed, would up-end her whole argument and way of thinking, and would ignore the trade-offs which motivate the essay.

Ultimately, this might not be a problem for us. We are in fact humans who have many values, and currently, we can’t help that. Perhaps we should just reflect and decide how much we want to allocate to each bucket. But, without some more principled solution, this will always be an uneasy compromise. 



 

Comments



Executive summary: There is no principled way to balance the demands of morality against other values we hold, leaving us with an uneasy compromise in how we allocate our resources between them.

Key points:

  1. Consequentialist morality can demand ever more from us, with no clear stopping point. Effective altruists feel this acutely.
  2. The concept of a "moral saint" who pursues morality to the exclusion of other values illustrates the undesirability of taking morality to its logical conclusion.
  3. A true utilitarian saint would want to rid themselves of competing non-moral values if possible, which seems to undermine the authenticity of those values.
  4. There are important non-moral values, like love and beauty, that we perceive as intrinsically valuable, not just instrumentally useful for morality.
  5. The author sees no principled solution for how to balance moral and non-moral values, leaving only an uneasy, unprincipled compromise.

 

 

This comment was auto-generated by the EA Forum Team. Feel free to point out issues with this summary by replying to the comment, and contact us if you have feedback.

Thanks SummaryBot! Looks good to me :)

I'm glad you posted this! I like it :) 

I hadn't heard of the moral saint thought experiment or the "one thought too many" argument; both of these are interesting.

I think this problem is just a special case of the complexity of life, in the sense of having multiple goals and even multiple available actions for almost any particular goal.

However, there might be a way to conceptually simplify this problem. If long-term thinking about all the buckets is too demanding, why not think short-term, using a simple algorithm that will still produce long-term work? Such as this:

  1. Detect the most important thing you can do right now.
  2. Make a move in that direction, however small.
  3. Go back to step 1.

Of course, step 1 is vague - what is the most important thing right now? - but many little tasks and opportunities will naturally present themselves. It will often be something very mundane, like "it's really time to clean this desk", and sometimes it might involve spending money on a hedonistic pleasure if you intuitively judge it to be worthwhile. You don't have to think each and every time - that would be exhausting and counter-productive. But, crucially, sometimes you will think, which will enable you to do long-term moral things that you might otherwise neglect. Of course, the best balance remains unknown, but the algorithm might satisfy the meta-balance of not being too demanding of oneself while still doing a lot of good.
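For concreteness, here is a minimal sketch of that loop in Python. It is a toy illustration only: the task names, the numeric "importance" scores, and the helper functions are all made up, and in real life step 1 is an intuitive judgement rather than a lookup.

    # A toy sketch, purely hypothetical: the tasks and "felt importance" scores
    # are invented, and a small move is modelled as slightly reducing urgency.
    tasks = {
        "clean this desk": 3,
        "donate to the fundraiser": 5,
        "coffee with a friend": 2,
    }

    def most_important_right_now(tasks):
        # step 1: detect the most important thing you can do right now
        return max(tasks, key=tasks.get)

    def small_move_toward(task, tasks):
        # step 2: make a move in that direction, however small
        print(f"doing a small bit of: {task}")
        tasks[task] -= 1  # pretend the move slightly reduces the task's urgency

    for _ in range(5):
        # step 3: go back to step 1
        small_move_toward(most_important_right_now(tasks), tasks)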
