A fun game for avowed non-utilitarians is to invent increasingly exotic thought experiments to demonstrate the sheer absurdity of utilitarianism. Consider this bit from Tyler’s recent interview with SBF:
COWEN: Should a Benthamite be risk-neutral with regard to social welfare?
BANKMAN-FRIED: Yes, that I feel very strongly about.
COWEN: Okay, but let’s say there’s a game: 51 percent, you double the Earth out somewhere else; 49 percent, it all disappears. Would you play that game? And would you keep on playing that, double or nothing?
…
BANKMAN-FRIED: Again, I feel compelled to say caveats here, like, “How do you really know that’s what’s happening?” Blah, blah, blah, whatever. But that aside, take the pure hypothetical.
COWEN: Then you keep on playing the game. So, what’s the chance we’re left with anything? Don’t I just St. Petersburg paradox you into nonexistence?
Pretty damning! It sure sounds pretty naive to just take any bet with positive expected value. Or from a more academic context, here is FTX Foundation CEO Nick Beckstead alongside Teruji Thomas:
On your deathbed, God brings good news… he’ll give you a ticket that can be handed to the reaper, good for an additional year of happy life on Earth.
As you celebrate, the devil appears and asks “Won’t you accept a small risk to get something vastly better? Trade that ticket for this one: it’s good for 10 years of happy life, but with probability 0.999.”
You accept… but then the devil asks again… “Trade that ticket for this one: it is good for 100 years of happy life, 10 times as long, with probability 0.999^2, just 0.1% lower.”
An hour later, you’ve made 50,000 trades… You find yourself with a ticket for 10^50,000 years of happy life that only works with probability 0.999^50,000, less than one chance in 10^21.
Predictably, you die that very night.
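Under pure expected-value reasoning, the arithmetic behind both gambles is easy to check. A minimal sketch (the 100-round figure is illustrative; the 50,000-trade numbers come from the quote above):

```python
import math

# Cowen's game: each round, 51% chance the world doubles,
# 49% chance everything disappears.
def survival_prob(rounds, p=0.51):
    return p ** rounds

def ev_multiplier(rounds, p=0.51):
    # expected value is multiplied by 2 * 0.51 = 1.02 each round
    return (2 * p) ** rounds

# After 100 rounds, EV is up ~7x, but almost surely nothing is left.
print(f"P(anything left) = {survival_prob(100):.1e}")  # ~ 5.7e-30
print(f"EV multiplier    = {ev_multiplier(100):.1f}")  # ~ 7.2

# The devil's deal: after 50,000 trades the ticket pays 10^50,000
# happy years, but only with probability 0.999^50,000.
log10_p = 50_000 * math.log10(0.999)  # ~ -21.7, i.e. < 1 in 10^21
log10_ev = 50_000 + log10_p           # EV ~ 10^49,978 happy years
print(f"P(ticket works) ~ 10^{log10_p:.1f}")
print(f"EV ~ 10^{log10_ev:.0f} happy years")
```

The arithmetic itself isn’t what’s contested: in both games the expected value grows without bound even as the probability of any payoff at all collapses toward zero.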
And it’s not just risk! There are damning scenarios downright disproving utilitarianism around every corner. Joe Carlsmith:
Suppose that oops: actually, red’s payout is just a single, barely-conscious, slightly-happy lizard, floating for eternity in space. For a sufficiently utilitarian-ish infinite fanatic, it makes no difference. Burn the Utopia. Torture the kittens.
…in the land of the infinite, the bullet-biting utilitarian train runs out of track…
It’s looking quite bad for utilitarianism at this point. But of course, one man’s modus ponens is another man’s modus tollens, and so I submit to you that actually, it is the thought experiments which are damned by all this.
I take the case for “common sense ethics” seriously, meaning that a correct ethical system should, for the most part, advocate for things in a way that lines up with what people actually feel and believe is right.
But if your entire argument against utilitarianism is based on ginormous numbers, tiny probabilities, literal eternities and other such nonsense, you are no longer on the side of moral intuitionism. Rather, your arguments are wildly unintuitive, your “thought experiments” literally unimaginable, and each “intuition pump” overtly designed to take advantage of known cognitive failures.
The real problem isn’t even that these scenarios are too exotic; it’s that coming up with them is trivial, and thus they prove nothing. Consider, with apologies to Derek Parfit:
Suppose that I am driving at midnight through some desert. My car breaks down. You are a stranger, and the only other driver near. I manage to stop you, and I ask for help.
As you are against utilitarianism, you have committed to the following doctrine: when a stranger asks for help at midnight in the desert, you will give them the help they need free of charge. Unless they are a utilitarian, in which case you will punch them in the face, light them on fire, and commit to spending the rest of your life sabotaging shipments of anti-malarial bednets.
Here is a case without any outlandish numbers in which being a utilitarian does not result in the best outcome. And yet clearly, it proves nothing at all about utilitarianism!
Look, I know this all sounds silly, but it is no sillier than Newcomb’s Paradox. As a brief reminder:
The player is given a choice between taking only box B, or taking both boxes A and B.
- Box A is transparent and always contains a visible $1,000.
- Box B is opaque, and its content has already been set by the predictor.
If the predictor has predicted that the player will take only box B, then box B contains $1,000,000. If the predictor has predicted the player will take both boxes A and B, then box B contains nothing.
Again, this initially looks pretty damning for standard decision theory… except that you can generate a similar “experiment” to argue against anything you don’t like. In fact, you can generate far worse ones! Consider:
The player is given a choice between only taking box B, or taking both boxes A and B.
- Box A is transparent and always contains $1,000.
- Box B is opaque, and its content has already been set by the predictor.
If the predictor has predicted that the player acts in accordance with Theory I Like, then box B contains $1,000,000. If the predictor has predicted the player acts in accordance with Theory I Don’t Like, then box B contains a quadrillion negative QALYs.
The problem isn’t that decision theory is wrong, it’s that the setup has been designed to punish people who behave a certain way. And so it’s meaningless because we can trivially generate analogous setups that punish any arbitrary group of people, thus “disproving” their belief system, or normative theory, or whatever it is you’re trying to argue against… while at the same time providing no actual evidence one way or another.
Does this mean thought experiments are all useless and we just have to do moral philosophy entirely a priori? Not at all. But there are two particular cases where these fail, and a suspiciously large number of the popular experiments fall into at least one of them:
- The “moral intuition” is clearly not generated by reliable intuitions because it abuses:
  a. Incomprehensibly large or small numbers
  b. Known cognitive biases
  c. Wildly unintuitive premises
- The “moral intuition” proves too much because it can be trivially deployed against any arbitrary theory
In contrast, the best thought experiments are less like clubs beating you over the head, and more like poetry that highlights a playful tension between conflicting reasons. In this vein, Philippa Foot’s Trolley Problems are so lovely because they elegantly guide you around the contours of your own values. They allow you to parse out various objections, to better understand which particular aspects of an action make it objectionable, and play your own judgements against each other in a way that generates humility, thoughtfulness and comprehension.
So I love thought experiments. And I deeply appreciate the way make-believe scenarios can teach us about the real world. I just don’t care for getting punched in the face.
–––
Appendix
Nicolaus Bernoulli, Joe Carlsmith, Nick Beckstead, Teruji Thomas, Derek Parfit, Tyler Cowen, and Robert Nozick are all perfectly fine people and good moral philosophers.
I am also not a moral philosopher myself, and it’s likely that I’m missing something important.
Having said that, I will do the public service of risking embarrassment to make my bullet biting explicit:
- I take the St. Petersburg gamble, and accept that a 0.5^n probability of 2^n x value is positive-EV.
- I also take the devil’s deal.
- I simply don’t believe that infinities exist, and even though 0 isn’t a probability, I reject the probabilistic argument that any possibility of infinity allows them to dominate all EV calculations. I just don’t think the argument is coherent, at least not in the formulations I’ve seen.
- Similarly, once you introduce a “reliable predictor”, everything goes out the window and the money is the least of your concerns. But granting the premise, fine, I One Box.
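As a footnote to the St. Petersburg bullet above: the positive-EV claim just says that each branch contributes the same finite amount, so the partial sums grow without bound. A quick sketch:

```python
# St. Petersburg: branch n pays 2^n * x with probability 0.5^n,
# so every branch contributes exactly x to the expected value.
def partial_ev(n_branches, x=1.0):
    return sum((0.5 ** n) * (2 ** n) * x for n in range(1, n_branches + 1))

print(partial_ev(10))    # 10.0
print(partial_ev(1000))  # 1000.0 -- grows linearly, without bound
```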
EDIT: I didn’t discuss it here, but the original desert dilemma just involves you being a selfish person who can’t lie, and the man refusing to help you because he knows you won’t actually reward him. This doesn’t fall into either of the “bad thought experiment” heuristics I outlined above, and is, in fact, a seemingly reasonable scenario.
But I don’t think the lesson is “selfishness is always self-defeating”, I think the lesson is “if you’re unable to lie, having a policy of acting selfish is probably the wrong way to implement your selfish aims.” And so you should rationally determine to act irrationally (with respect to your short-term aims), but this is really no different than any other short-term/long-term tradeoff.
Parfit’s point, by the way, was a more abstract one about the fact that some “policies” can be self-defeating, and that this results in some theoretically interesting claims. Which is good and clever, but for our purposes, my point is that the “argument from getting punched in the face by an AI that hates your policy in particular” does a good job of demonstrating that this kind of setup proves nothing about any given policy.
I don't know what "naive" utilitarianism is. Some possibilities include:
I would argue that (1) is basically an epistemic problem, not a moral one. If the major concern with utilitarian concepts is that it makes people make inaccurate predictions about how their behaviors will affect the future, that is an empirical psychological problem and needs to be dealt with separately from utilitarian concepts as tools for moral reasoning.
(2) is an argument from authority.
Please let me know if you were referencing some other concern than the two I've speculated about here; I assume I have probably missed your point!
I don't know what "be somewhat deontologist" means to you. I do think that if the same behavior is motivated by multiple contrasting moral frameworks (i.e. by deontology and utilitarianism), that suggests it is "morally robust" and more attractive for that reason.
However, being a deontologist and not a utilitarian is only truly meaningful when the two moral frameworks would lead us to different decisions. In those circumstances, it is by definition not the utility-maximizing decision to be a deontologist.
If I had to guess at your meaning, it's that "deontologist" is a psychological state, close to a personality trait or identity. Hence, it is primarily something that you can "be," and something that you can be "somewhat" in a meaningful way. Being a deontological sort of person makes you do things that a utilitarian calculus might approve of.
I agree that people do attempt to apply utilitarian concepts to make an argument for avoiding astronomical waste.
I agree that if a moral argument is directing significant human endeavors, that makes it important to consider.
This is where I disagree with (my interpretation of) you.
I think of moral questions as akin to engineering problems.
Occasionally, it turns out that a "really big" or "really small" version of a familiar tool or material is the perfect solution for a novel engineering challenge. The Great Wall of China is an example.
Other times, we need to implement a familiar concept using unfamiliar technology, such as "molecular tweezers" or "solar sails."
Still other times, the engineering challenge is remote enough that we have to invent a whole new category of tool, using entirely new technologies, in order to solve it.
Utilitarianism, deontology, virtue ethics, nihilism, relativism, and other frameworks all offer us "moral tools" and "moral concepts" that we can use to analyze and interpret novel "moral engineering challenges," like the question of whether and how to steer sentient beings toward expansion throughout the lightcone.
When these tools, as we apply them today, fail to solve these novel moral conundrums in a satisfying way, that suggests some combination of their limitations, our own flawed application of them, and perhaps the potential for some new moral tools that we haven't hit on yet.
Failure to fully solve these novel problems isn't a "critique" of these moral tools, any more than a collapsed bridge is a "critique" of the crane that was used to build it.
The tendency to frame moral questions, like astronomical waste, as opportunities to pit one moral framework against another and see which comes out the victor, strikes me as a strange practice.
Imagine that we are living in an early era, in which there is much debate and uncertainty about whether or not it is morally good to kill heathens. Heathens are killed routinely, but people talk a lot about whether or not this is a good thing.
However, every time the subject of heathen-killing comes up, the argument quickly turns to a debate over whether the Orthodox or the Anti-Orthodox moral framework gives weirder results in evaluating the heathen-killing question. All the top philosophers from both schools of thought think of the heathen-killing question as showing up the strengths and weaknesses of the two philosophical schools.
I propose that it would be silly to participate in the Orthodox vs. Anti-Orthodox debate. Instead, I would prefer to focus on understanding the heathen-killing question from both schools of thought, and also try to rope in other perspectives: economic, political, technological, cultural, and historical. I would want to meet some heathens and some heathen-killers. I would try to get the facts on the ground. Who is leading the next war party? How will the spoils be divided up? Who has lost a loved one in the battles with the heathens? Are there any secret heathens around in our own side?
This research strikes me as far more interesting, and far more useful in working toward a resolution of the heathen-killing question, than perpetuating the Orthodox vs. Anti-Orthodox debate.
By the same token, I propose that we stop interpreting astronomical waste and similar moral conundrums as opportunities to debate the merits of utilitarianism vs. deontology vs. other schools of thought. Instead, let's try and obtain a multifaceted, "foxy" view of the issue. I suspect that these controversial questions will begin to dissolve as we gather more information from a wider diversity of departments and experiences than we have at present.