Wladimir J. Alonso

Thanks a lot, Vasco — and thanks for the upvote!

You’re absolutely right to push us toward the practical question of how to compare affective capacity across species. That’s ultimately where this line of work needs to go. At the same time, we’ve been deliberately cautious here, because we think this is one of those cases where moving too quickly to numbers or rankings risks making the waters muddier rather than clearer.

Our sense is that the comparison of affective capacity across species hinges on a set of upstream scientific questions that are still poorly articulated, especially around when sentience arises at all and when it plausibly extends to very intense affective states. The aim of this piece was to stress-test a way of structuring those questions before turning them into quantitative tools.

That said, we do see this as complementary to RP’s research agenda on valuing impacts across species. In fact, we think cost–benefit reasoning about sentience and affective intensity can help discipline some of the assumptions that go into moral-weight or welfare-capacity estimates, rather than replacing them.

We’re currently working on a follow-up that moves closer to a practical comparative framework, and we’re very much treating the present work as groundwork for that. Happy to loop back and share it once it’s ready — and we’d be keen to hear your thoughts then as well.

Thanks a lot for the kind words, Jim — and for the thoughtful pushback.

I think your point holds if we assume that the only way to implement a very strong alarm is via extreme felt intensity — but that assumption is exactly what we’re questioning.

I agree that in genuinely catastrophic situations, evolution should tolerate very “loud” alarms. The open question, though, is whether those alarms need to be implemented as extreme affective states, rather than through non-affective or lower-intensity control mechanisms.

On the benefit side, there seem to be two distinct roles a very strong signal could play. First, triggering an immediate reaction in life-or-death situations. But this doesn’t require affect at all: many organisms (including very simple ones) already show robust threat responses via non-felt control. Even in sentient organisms, immediate escape could in principle be driven by low-intensity affect if thresholds are set low enough, especially where behavioral options are limited.

Second, overriding other ongoing motivations in organisms with richer behavioral repertoires. Here, stronger affective signals become more plausibly useful, as they can reliably dominate competing drives (foraging, mating, self-maintenance, etc.). One way to achieve this is by expanding affective range rather than relying only on finer discrimination within a narrow range.

On the cost side, generating and sustaining very high-intensity affective states may plausibly require substantial architectural capacity of the kind we discuss above. In systems with limited computational or neural resources—and especially in organisms with few available behavioral options—extreme felt states may therefore be difficult or unnecessary to implement, regardless of how valuable a very loud alarm would be.

Thanks, Becca — really glad you took a look and liked it.

On your point about how this relates to certifications and similar tools: we see this as strongly complementary to them, not an alternative. When Welfare Footprint estimates become available, our hope is that they’ll be usable in many different ways by different stakeholders — including certification initiatives themselves — rather than being tied to a single interface or application.

This app is best understood as an early exploratory step: a way of seeing how people actually engage with welfare information, what resonates or causes confusion, and how different framings influence choices. We hope these insights can be useful not just for us, but for anyone thinking about how WFF-style estimates might be effectively deployed beyond a single app.

Thanks for spelling this out, Vasco — yes, that’s a fair clarification.

When we say that pain intensities are defined as “absolute” in WFF, this is meant in a conceptual and operational sense within a shared intensity vocabulary, not as a claim that no interspecific adjustments are needed in practice. The statement you quote is explicitly conditional (“if shrimps were capable of experiencing Excruciating pain”) and is adopted as a temporary simplifying assumption that allows us to measure the time spent in different intensity categories, while recognizing that the true placement of experiences on an absolute scale across taxa remains an open scientific problem.

At a personal scientific level, I find it very implausible that the affective capacity of a shrimp and that of a human are comparable. However, because this remains an unresolved empirical question, the framework itself is intentionally agnostic and requires that any interspecific adjustments be made explicitly and post-quantification, rather than being implicitly embedded in the core estimates.

Thanks, Vasco. We recognize that for most specific interspecies comparisons, affective capacity (not probability of sentience) is indeed crucial, but this remains an open scientific question. For that reason, the Welfare Footprint Framework is intentionally agnostic about correction values for interspecific scaling: welfare estimates are produced without such corrections. Any assumptions about differences in affective capacity must be applied explicitly and transparently, as optional post-quantification adjustments when particular comparisons require them, rather than being implicitly folded into the core estimates.

Hi Vasco,

In the Welfare Footprint Framework, pain and pleasure intensities are defined as absolute categories conditional on an experience being affective, and uncertainty about sentience is treated upstream as a separate epistemic issue rather than folded into intensity probabilities. The closest point of contact between these questions is affective capacity—since different organisms may plausibly reach different intensity ranges or resolutions, as discussed in our article—but probability of sentience is not part of the equation, because the definition of sentience we adopt is itself conditioned on the capacity to experience affective states.

Hi Itamar — congratulations on all these initiatives.

As promised in our private exchange, I wanted to lay out an architectural idea I’ve been exploring for LLM-based applications, which may be useful to others building similar tools. I don’t know how novel this is, but in a world where many tools will increasingly rely on AI, I think it’s a good general practice.

The core idea is simple: all AI prompts live in a dedicated, human-readable folder, separate from application logic.

There are two main reasons for this.

First, radical transparency. If an application makes claims, recommendations, or interpretations that matter ethically or scientifically, then the instructions guiding the AI are part of what should be open to scrutiny. Keeping prompts in a clearly accessible place makes the system legible not only to developers, but also to researchers, ethicists, and communities like EA or academia who may want to understand how conclusions are being generated, not just what the interface shows.

Second, a clean separation between scientific or ethical content and engineering plumbing. Prompts often reflect underlying assumptions, value choices, and ways of thinking about a problem. Keeping them visible and separate from the rest of the code helps ensure that changes to how the AI reasons or frames an issue are intentional and easy to review, rather than happening quietly as a side effect of technical work. In practice, this folder is meant to be the main reference for what the AI is told to do, while the surrounding code simply handles running it.
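To make this concrete, here is a minimal sketch of the pattern in Python. The file names and the load_prompt helper are hypothetical illustrations, not our actual code: the point is simply that all prompt wording sits as plain text under a prompts/ folder, and the application code only loads and fills the templates.

```python
# Minimal sketch of the "prompts live in their own folder" pattern.
# Hypothetical file and helper names, not the actual Welfare Food Explorer code:
# the prompt wording sits in plain-text files under prompts/, while this code
# only loads a template and substitutes named placeholders.
from pathlib import Path

PROMPTS_DIR = Path(__file__).parent / "prompts"

def load_prompt(name: str, **variables: str) -> str:
    """Read a human-readable prompt template and fill in its named placeholders."""
    template = (PROMPTS_DIR / f"{name}.md").read_text(encoding="utf-8")
    return template.format(**variables)

# Application code stays thin: it fetches the template and hands it to the model;
# the reasoning and framing remain reviewable in prompts/interpret_estimate.md.
system_prompt = load_prompt("interpret_estimate", species="laying hen")
```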

In our Welfare Food Explorer app, for example, this structure allows researchers and non-developers to easily find, read, and reason about what the AI is being instructed to do, without needing to navigate the rest of the codebase.

We adopted this approach because in applications that touch science, ethics, welfare, or normative interpretation, how the AI reasons is part of the substance of the system itself. Making prompts visible, inspectable, and discussable helps treat AI behavior as something that can be examined, debated, and improved by a broader audience.

I hope this perspective is useful to others. Cheers.

Hi Itamar! That sounds great — I’d be happy to connect.

Thanks for the kind words! I am glad you found the tool useful.

A quick update: we’ve now expanded the system so it doesn’t just quantify negative affective experiences, but also positive ones. Because of this broader scope, the new version is called Hedonic-Track GPT, and it is gradually replacing the earlier Pain-Track GPT.

We may need to update this article soon so readers are directed to the most current tool. In the meantime, you can find the link to the Hedonic-Track GPT here:

https://chatgpt.com/g/g-cpXxeWtgg-hedonic-track

Thank you very much for this thorough analysis and for the constructive comments.
Cynthia will address the points related to the results of the study, while I’ll focus here on the methodological aspects.

One of the most important points you raise touches on the core of the Welfare Footprint Framework itself: we recognize that inferring the affective states of other beings is enormously challenging—both in scope and depth. This task can never be complete; it will always require revisions and corrections as new evidence becomes available. The Welfare Footprint Framework is, in essence, an attempt to structure this challenge into as many workable, auditable pieces as possible, so that the process of inference can be progressively improved and openly scrutinized.

You are absolutely right that several painful conditions in chickens were not included in this initial analysis. This was a conscious decision—not because those harms are unimportant, but because we had to start with a subset that we judged to be among the most influential and best documented. The framework is designed precisely so that others can build upon it by incorporating additional conditions, refining prevalence estimates, or reassessing intensities. In that sense, this work should be seen as a living model, not a closed dataset.

Regarding the concern about the limited use of more rigorous statistical techniques, our approach is pragmatic. Where robust statistical analyses are feasible—such as in estimating prevalence or duration—they are of course welcome and encouraged. But in areas where measurement is currently impossible—most notably the intensity of affective states—we deliberately avoid mathematical sophistication for its own sake. No amount of elegant equations can compensate for the fact that subjective experience is, for now, beyond direct measurement. What we can do is gather convergent evidence from different sources (behavior, physiology, neurology, evolutionary reasoning), generalize that evidence into transparent, revisable estimates, and make every assumption explicit so that others can challenge and adjust them.

As for the legitimacy of this approach, we believe that, while imperfect and always improvable, quantifying affective experiences is vastly more informative than relying solely on indirect indicators such as mortality. Animals can live long, physically healthy lives that are nevertheless filled with frustration, chronic pain, fear, or monotony—forms of suffering invisible to metrics that focus only on death or disease. By directing efforts toward gathering as much evidence as possible to infer the intensity and duration of each stage spent in negative and positive affective states, we can begin to capture what actually matters to the animal.

The framework has also evolved since this analysis was first produced. At that time, we focused primarily on negative affective states, but we have now extended the methodology to include Cumulative Pleasure alongside Cumulative Pain. Positive affective states are now being systematically quantified using the same operational principles, creating a fuller picture of animal welfare.

Finally, we are developing an open, collaborative platform where Pain-Tracks and Pleasure-Tracks can be published, discussed, and iteratively improved by the broader scientific community. Each component of a track—for example, the probability assigned to a certain intensity within a phase of an affective experience—could be challenged and refined, potentially even through expert voting or consensus mechanisms. The aim is to make welfare quantification transparent, dynamic, and collective rather than proprietary.

Thanks again for putting our work under the microscope—this is exactly what it needs. The Framework is meant to evolve, and feedback like yours helps it grow in the right direction.
