TLDR
In the paper "Ends versus Means: Kantians, Utilitarians, and Moral Decisions" (Bénabou, Falk, & Henkel, 2024), participants (a standard university subject pool) make a series of incentivized choices in games involving ends-vs-means tradeoffs, including a “real stakes… trolley dilemma” and rule-following, as well as standard self-versus-other choice tasks. Their goals: to measure moral behavior and preferences in “the… dimension of consequentialism versus deontological ethics”, the prevalence of these behaviors, whether they reflect stable personal traits, and how they relate to self/other social preferences.
The Unjournal prioritized this for evaluation because of its potential relevance for efforts to instill human values in artificial intelligence/AGI models, for governments and institutions aiming to make policy choices reflecting democratic majorities (e.g., for medical triage situations and the “Rule of Rescue”), and for utilitarian/EA organizations considering side-constraints because of public opinion concerns.
Insights from the paper and evaluations: To the extent that this paper’s stated results are true, actionable insights could include:
- Deontological preferences are real and common; few people are purely consequentialist.
- Don’t assume moral preferences are stable or reflect a simple, consistent moral belief system.
- To understand preferences in a given setting, you need to consider the specific context.
- These don’t strongly correlate with self-vs-other preferences; other-regarding people (~altruists) hold a mix of preferences over truth-telling, using people as a means to an end, etc.
Caveat: The paper’s evidence alone is probably not sufficient to be confident even about the above. Please see the evaluators’ concerns below and linked – these include experimental design limitations as well as conceptual and interpretation issues – and their specific suggestions for improvements and followups.
Note: The content below is basically a cut-and-paste from the evaluation summary, with some small adjustments and rearrangements. You may also want to read the evaluations themselves, by David Hugh-Jones and Valerio Capraro.
Abstract
We organized two evaluations of the paper "Ends versus Means: Kantians, Utilitarians, and Moral Decisions" (Bénabou, Falk, & Henkel, 2024). In the authors’ experiments, participants (a standard university subject pool) make a series of incentivized choices in games involving ends-vs-means tradeoffs, including a “real stakes… trolley dilemma” and rule-following, as well as standard self-versus-other choice tasks. The goals: to measure moral behavior and preferences in “the… dimension of consequentialism versus deontological ethics”, the prevalence of these behaviors, whether they reflect stable personal traits, and how they relate to self/other social preferences. Measuring these preferences can inform democratic policy choices (e.g., for medical triage situations) as well as the choices and values of powerful artificial intelligence models. The authors report a 20%-44% rate of apparently nonconsequentialist choices, “no evidence of stable individual [consequentialist/deontological] preference types across situations”, and little correlation to self/other choices. The evaluators generally praise the approach and much of the execution, finding this paper interesting and innovative. However, they raise substantial doubts about several of the ends-vs-means games’ designs, suggesting that the participants’ choices may not meaningfully represent deontological/moral behavior as the authors claim. Both evaluators characterize this work as falling short of definitive general evidence for ‘the instability of deontological preferences’. Hugh-Jones offers further conceptual critiques of this general approach to eliciting moral preferences, noting the importance of “innate intuitions” and “communication with others”.
Evaluations
1. David Hugh-Jones
2. Valerio Capraro
Evaluation manager’s discussion
The paper, evaluations, and critiques — a take[1]
In the authors’ experiments, participants (a standard university subject pool) make a series of incentivized choices in games involving ends-vs-means tradeoffs, including a “real stakes… trolley dilemma” and rule-following, as well as standard self-versus-other choice tasks. The goals: to measure moral behavior and preferences in “the… dimension of consequentialism versus deontological ethics”, the prevalence of these behaviors, whether they reflect stable personal traits, and how they relate to self/other social preferences. The authors report a 20%-44% rate (context-dependent) of apparently nonconsequentialist choices, “no evidence of stable individual [consequentialist/deontological] preference types across situations”, and little correlation to self/other choices.
The evaluators generally praise the approach and much of the execution, finding this paper interesting and innovative. However, they raise substantial doubts about several of the ends-vs-means games’ designs, suggesting that the participants’ choices may not meaningfully represent deontological/moral behavior as the authors claim. Both evaluators characterize this work as falling short of definitive general evidence for ‘the instability of deontological preferences’. Hugh-Jones offers further conceptual critiques of this general approach to eliciting moral preferences, noting the importance of “innate intuitions” and “communication with others”.
I (David Reinstein) find the design criticisms rather plausible, and they seem to cast substantial doubt on some of the main claims, particularly about the instability of deontological behavior and its lack of correlation with standard prosocial preferences. Capraro notes “there are many deontological rules and the fact that preferences over one or some of these rules are not stable does not imply that preferences over each of these rules are unstable.” Indeed, if consequentialist choices in these games elicit moral concerns, these are diverse concerns (lying, causing harm as a side-effect of doing more good, ‘bribery’), and not everyone may hold each of these equally. For preferences to be "stable" in the sense the authors are looking for, people would need to hold consistent values across a range of these moral norms.
And to my mind, some of the stylized choices in these lab games don’t really trigger the moral norms implied, as Hugh-Jones points out. Giving a lab participant two euros so that more is passed to a charity doesn’t seem to me to be anything like bribing a public official. And even the “lying” situation they engineer in the lab feels more to me like bluffing in a poker game than an important real-world lie.
Why we chose this paper; impact
The Unjournal has not generally evaluated much ‘classic experimental economics’ work, by which I mean small-scale trials with modest incentives where participants know they are in an experiment. Nor have we prioritized research trying to uncover deep preferences or test fundamental behavioral models. We’ve largely favored policy-relevant projects using meaningful field data and large-scale interventions (see our public database of prioritized work as well as our collection of evaluation packages). This paper is a bit of an exception — so why did we think it was potentially high-value?
While large philanthropic groups associated with Effective Altruism consider moral tradeoffs in their funding choices, these are generally explicitly consequentialist (utilitarian); practical tradeoffs involve ‘more people versus happier people’, ‘saving infants’ lives versus adult lives’, etc.[2] Still, they give some consideration to worldview diversification, including deontological "side constraints," and have some cause to be concerned with how their approaches might be viewed by the general (voting) public.
This tends to be more of a concern for democratic governments.[3] For policy choices to reflect the will of the people, we might want to know how people weigh maximizing the total good against other concerns, such as fairness, lying, and treating people as a means to an end. Examples could include medical triage situations and the “Rule of Rescue”, overstating the risks of drug use to children, putting people or groups under quarantine during pandemics, bribery in foreign aid, vaccine mandates, using torture or targeted killings for counterterrorism, criminal sentencing evidence thresholds, and organ donation defaults.
But the largest factor in our prioritizing this was (probably) the movement to instill human values in artificial intelligence (perhaps AGI) models, which are seen as likely to make profound social decisions. Projects like the Moral Machine Experiment (Wikipedia) sought to measure the preferences of broad segments of the global population over choices autonomous vehicles would need to make in accident situations, ~swerving one direction or another and trading off the lives of specific numbers of people with different characteristics.
This can be seen as part of a broader project of trying to quantify human moral intuitions so they can be encoded into the decision rules of AI systems. Both OpenAI and Anthropic have at least made noises about this (constitutional AI and ‘collective alignment’), while academic and research organizations (e.g., the Center for Human-Compatible AI) have pursued a range of approaches, e.g., “Moral Graph Elicitation”, the “Multi-Human-Value Alignment Palette”, and “Value Learning”.
Insights (a quick and tentative stab)[4]
To the extent that this paper’s stated results are true, actionable insights could include:
- Deontological preferences are real and common; few people are purely consequentialist.
- Don’t assume moral preferences are stable or reflect a simple, consistent moral belief system.
- To understand preferences in a given setting, you need to consider the specific context.
- These do not strongly correlate with self-vs-other preferences; other-regarding people (~altruists) hold a mix of preferences over truth-telling, using people as a means to an end, etc.
However, this paper’s evidence alone is probably not sufficient to be confident even about the above. The authors’ followup work and, more so, “a range of different evidence” (as Hugh-Jones notes) may help garner further insight. Evidence from choice trials like the aforementioned Moral Machine experiment may be particularly fruitful. While these don’t offer incentives quite as tangible as those in this paper, if participants’ choices are used to inform (e.g.) the behavior of autonomous vehicles, they still offer a concrete impact, i.e., a ‘moral incentive’.
Author engagement, process
The authors were informed about this process, but chose not to respond (yet). Luca Henkel noted:
We are currently working on a new version of the paper and will use the reports to improve our paper. Once we have a new version, we would be happy to write a short response.
As always, we will incorporate the authors’ response into this package (at unjournal.pubpub.org) when it is provided.
Overall ratings
We asked evaluators to provide overall assessments as well as ratings for a range of specific criteria.
I. Overall assessment (See footnote[5])
II. Journal rank tier, normative rating (0-5): On a ‘scale of journals’, what ‘quality of journal’ should this be published in?[6] Note: 0= lowest/none, 5= highest/best.
| | Overall assessment (0-100) | Journal rank tier, normative rating (0-5) |
|---|---|---|
| David Hugh-Jones | 70 | 4.0 |
| Valerio Capraro | 85 | 4.2 |
See “Metrics” below for a more detailed breakdown of the evaluators’ ratings across several categories. To see these ratings in the context of all Unjournal ratings, with some analysis, see our data presentation here.[7]
See here for the current full evaluator guidelines, including further explanation of the requested ratings.
Evaluation summaries
David Hugh-Jones
This paper reports an experiment on deontological (ends versus means) preferences using incentivized experiments, including a “trolley problem” where subjects can choose to save real lives.[8] Non-consequentialist preferences are common, but are situation-specific and not linked to conventional social preferences such as altruism. The experiment is well-executed, and results are credible and interesting. Questions remain whether economic lab experiments are the best tool to characterize subjects’ ethical views.
Valerio Capraro
I greatly appreciated the Ends vs Means paper, especially its effort to bring moral dilemmas into economics. I see two main limitations in the current version of the paper: (1) the Save a Life dilemma may mischaracterize Kant’s categorical imperative; (2) the finding of unstable deontological preferences may depend on the dilemmas used. I hope these issues will be resolved, as I think this paper has high potential.
Metrics
Ratings
See here for details on the categories below, and the guidance given to evaluators.
| Rating category | Evaluator 1 (David Hugh-Jones): Rating (0-100) | Evaluator 1: 90% CI (0-100)* | Evaluator 2 (Valerio Capraro): Rating (0-100) | Evaluator 2: 90% CI (0-100)* |
|---|---|---|---|---|
| Overall assessment[9] | 70 | (45, 85) | 85 | (75, 95) |
| Claims, strength, characterization of evidence[10] | 75 | (60, 90) | 80 | (70, 90) |
| Advancing knowledge and practice[11] | 65 | (50, 80) | 95 | (93, 97) |
| Methods: Justification, reasonableness, validity, robustness[12] | 70 | (60, 80) | 80 | (70, 90) |
| Logic & communication[13] | 65 | (55, 75) | 90 | (83, 97) |
| Open, collaborative, replicable[14] | 70 | (50, 80) | 95 | (92, 98) |
| Engaging with real-world, impact quantification[15][16] | 50 | (40, 60) | 85 | (75, 95) |
| Relevance to global priorities[17][18] | 50 | (40, 60) | 85 | (75, 95) |
Journal ranking tiers
See here for more details on these tiers.
| Judgment | Evaluator 1 (David Hugh-Jones): Ranking tier (0-5) | Evaluator 1: 90% CI | Evaluator 2 (Valerio Capraro): Ranking tier (0-5) | Evaluator 2: 90% CI |
|---|---|---|---|---|
| On a ‘scale of journals’, what ‘quality of journal’ should this be published in? | 4.0 | (3.5, 4.8) | 4.2 | (3.7, 4.7) |
| What ‘quality journal’ do you expect this work will be published in? | 4.3 | (2.5, 5.0) | 4.5 | (4.2, 4.8) |
Claim identification and assessment (summary)
For the full discussions, see the corresponding sections in each linked evaluation.
| | Main research claim[19] | Belief in claim[20] | Suggested robustness checks[21] | Important ‘implication’, policy, credibility[22] |
|---|---|---|---|---|
| Evaluator 1: David Hugh-Jones | About 25% of people make deontological decisions in a “trolley problem” with real consequences, i.e., they do not intervene to stop the saving of one statistical life (via money donations to a charity), even though doing so would enable the saving of three statistical lives elsewhere. | In the exact same experimental setup, but across a broader population (e.g., a random sample of Europeans), I’d have an 80% credible interval of 10-40% deontologists. Across a broader set of real-life means-vs-ends dilemmas with serious consequences, that interval would be wider, but I’d give an 80% probability that any given dilemma would have at least 5% deontologists. | A similar experiment with a random population sample could estimate this parameter in a more interesting population. Ultimately, interaction with political and ethical philosophers might be more important for determining how useful the above number is. | |
References
Bénabou, R., Falk, A., & Henkel, L. (2024). Ends versus Means: Kantians, Utilitarians, and Moral Decisions. National Bureau of Economic Research. https://doi.org/10.3386/w32073

Klingefjord, O., Lowe, R., & Edelman, J. (2024). What are human values, and how do we align AI to them? arXiv. https://doi.org/10.48550/arXiv.2404.10636

Wang, X., Le, Q., Ahmed, A., Diao, E., Zhou, Y., Baracaldo, N., … Anwar, A. (2024). MAP: Multi-Human-Value Alignment Palette. arXiv. https://doi.org/10.48550/arXiv.2410.19198

Awad, E., et al. (2018). The Moral Machine experiment. Nature, 563(7729), 59–64.
- ^
This section largely remixes content from the abstract.
- ^
For example, GiveWell makes choices based on models of moral tradeoffs fed by preferences from surveys of their own team, their stakeholders, and the beneficiary groups. The results of this paper might be relevant to the extent that they inform us about people's consistency in reporting on moral dilemmas and moral choices in a general sense.
- ^
Mainstream NGOs and charities may face similar concerns in trying to effectively help the world’s poorest people while being consistent with what they see as the fundamental ethical values of the people in those communities.
- ^
This partially repeats the TLDR.
- ^
We asked them to rank this paper “heuristically” as a percentile “relative to all serious research in the same area that you have encountered in the last three years.” We requested they “consider all aspects of quality, credibility, importance to knowledge production, and importance to practice.”
- ^
See ranking tiers discussed here.
- ^
Note: if you are reading this before, or soon after this has been publicly released, the ratings from this paper may not yet have been incorporated into that data presentation.
- ^
Manager’s note: In a statistical-life sense — basically, through allocating or reallocating charitable donations.
- ^
Judge the quality of the research heuristically. Consider all aspects of quality, credibility, importance to knowledge production, and importance to practice.
- ^
“Do the authors do a good job of (i) stating their main questions and claims, (ii) providing strong evidence and powerful approaches to inform these, and (iii) correctly characterizing the nature of their evidence?”
This was on the newer form only.
- ^
To what extent does the project contribute to the field or to practice, particularly in ways that are directly or indirectly relevant to global priorities and impactful interventions?
- ^
Are methods clearly justified and explained? Are methods and their underlying assumptions reasonable? Are the results likely to be robust to changes in the assumptions? Have the authors avoided bias and questionable research practices?
- ^
Are concepts clearly defined? Is the reasoning transparent? Are conclusions consistent with the evidence (or formal proofs) presented? Are the data and/or analysis, including tables and figures, relevant to the argument?
- ^
Would another researcher be able to replicate the analysis? Are the method and its details explained sufficiently? Is the source of the data clear? Is the data made as widely available as possible, with clear labeling and explanation? Do the authors provide resources that are likely to enable future research and meta-analysis?
- ^
Does the paper consider real-world relevance and deal with policy and implementation questions? Are the setup, assumptions, and focus realistic and relevant to practitioners?
- ^
The latter ratings were merged in the newer form:
“Are the paper’s chosen topic and approach likely to be useful to global priorities, cause prioritization, and high-impact interventions?” “Does the paper consider real-world relevance and deal with policy and implementation questions? Are the setup, assumptions, and focus realistic? Do the authors report results that are relevant to practitioners? Do they provide useful quantified estimates (costs, benefits, etc.)?”
- ^
Are the paper’s chosen topic and approach likely to be useful to global priorities, cause prioritization, and high-impact interventions?
- ^
The latter ratings were merged in the newer form:
“Are the paper’s chosen topic and approach likely to be useful to global priorities, cause prioritization, and high-impact interventions?” “Does the paper consider real-world relevance and deal with policy and implementation questions? Are the setup, assumptions, and focus realistic? Do the authors report results that are relevant to practitioners? Do they provide useful quantified estimates (costs, benefits, etc.)?”
- ^
The evaluator was given the following instructions:
Identify the most important and impactful factual claim this research makes – e.g., a binary claim or a point estimate or prediction. Please state the authors’ claim precisely and quantitatively. Identify the source of the claim (i.e., cite the paper), and briefly mention the evidence underlying it. We encourage you to explain why you believe this claim is important, either here or in the text of your report.
- ^
Evaluators were asked: To what extent do you *believe* the claim you stated above? Feel free to express this either a. in terms of the probability of the claim being true, b. as a credible interval for the parameter being estimated, or c. however you feel comfortable.
- ^
We asked:
[Optional] What additional information, evidence, replication, or robustness check would make you substantially more (or less) confident in this claim?
Feel free to refer to the main body of your evaluation here; you don't need to repeat yourself. Please specify how you would perform this robustness check (etc.) as precisely as you are willing. E.g., if you suggest a particular estimation command in a statistical package, this could be very helpful for future robustness replication work.
- ^
We asked:
[Optional] Identify the important *implication* of the above claim for funding and policy choices? To what extent do you *believe* this implication? How should it inform policy choices?
Note: this ‘implication’ could be suggested by the evaluation manager in some cases. As an example of an ‘implication’ ... in a global health context, the ‘main claim’ might suggest that a vitamin supplement intervention, if scaled up, would save lives at $XXXX per life saved. We did not ask this in the ‘applied stream’ as it is most likely redundant.