
I think it is almost always assumed that superintelligent artificial intelligence (SAI) disempowering humans would be bad, but are we confident about that? Is this an under-discussed crucial consideration?

Most people (including me) would prefer the extinction of a random species to that of humans. I suppose this is mostly due to a desire for self-preservation, but it can also be justified on altruistic grounds if humans have a greater ability to shape the future for the better. However, a priori, would it be reasonable to assume that more intelligent agents would do better than humans, at least under moral realism? If not, can one be confident that humans would do better than other species?

From the point of view of the universe, I believe one should strive to align SAI with impartial value, not human value. It is unclear to me how much these differ, but one should beware of surprising and suspicious convergence.

In any case, I do not think this shift in focus means humanity should accelerate AI progress (as proposed by effective accelerationism?). Intuitively, aligning SAI with impartial value is a harder problem, and therefore needs even more time to be solved.

3 Answers

Historically the field has been focused on impartiality/cosmopolitanism/pluralism, and there's a rough consensus that "human value is a decent bet at proxying impartial value, with respect to the distribution of possible values" that falls out of fragility/complexity of value. I.e., many of us suspect that embracing a random draw from the distribution over utility functions leads to worse performance at impartiality than human values.

I do recommend "try to characterize a good successor criterion" (as per Paul's framing of the problem) as an exercise; I found thinking through it very rewarding. I've definitely taken seriously thought processes like "stop being tribalist, your values aren't better than paperclips", so I feel like I'm thinking clearly about the class of mistakes the latest crop of accelerationists may be making.

Even though we're being vague about any particular specification language that expresses values and so on, I suspect that a view of this based on descriptive complexity is robust to moral uncertainty, at the very least to perturbation in choice of utilitarianism flavor.

Thanks, quinn!

Historically the field has been focused on impartiality/cosmopolitanism/pluralism, and there's a rough consensus that "human value is a decent bet at proxying impartial value

Could you provide evidence for these?

there's a rough consensus that "human value is a decent bet at proxying impartial value, with respect to the distribution of possible values" that falls out of fragility/complexity of value

I am not familiar with the posts you link to, but it looks like they focus on human values (emphasis mine):

Complexity of value is the thesis that human...

quinn
Cosmopolitanism: I have a roundup of links here. I think your concerns are best discussed in the Arbital article on generalized cosmopolitanism.

Re the fragility/complexity link: my view after reading is that "human" is a shorthand that isn't doubled down on throughout. Fun theory especially characterizes a sense of what's at stake when we have values that can at least entertain the idea of pluralism (as opposed to values that don't hesitate to create extreme lock-in scenarios), and "human value" is sort of a first-term approximate proxy of a detailed understanding of that.

This is a niche and extreme flavor of utilitarianism and I wouldn't expect its conclusions to be robust to moral uncertainty. But it's a nice question that identifies cruxes in metaethics. I think this is just a combination of taking seriously lock-in actions that we can't undo, along with forecasts about how aggressively a random draw from values can be expected to lock in.
Vasco Grilo🔸
Thanks! I have just remembered that the post AGI and lock-in is relevant to this discussion.
quinn
Busy, will come back over the next few days. 

Here's my just-so story for how humans evolved impartial altruism by going through several particular steps:

  1. First there was kin selection evolving for particular reasons related to how DNA is passed on. This selects for the precursors to altruism.
  2. With the ability to recognise individual characteristics and a long-term memory allowing you to keep track of them, species can evolve stable pairwise reputations.
  3. This allows reciprocity to evolve on top of kin selection, because reputations allow you to keep track of who's likely to reciprocate vs defect.
  4. More advanced communication allows larger groups to rapidly synchronise reputations. Precursors of this include "eavesdropping", "triadic awareness",[1] all the way up to what we know as "gossip".
  5. This leads to indirect reciprocity. So when you cheat one person, it affects everybody's willingness to trade with you.
  6. There's some kind of inertia to the proxies human brains generalise on. This seems to be a combination of memetic evolution plus specific facts about how brains generalise very fast.
  7. If altruistic reputation is a stable proxy for long enough, the meme stays in social equilibrium even past the point where it benefits individual genetic fitness.
  8. In sum, I think impartial altruism (e.g. EA) is the result of "overgeneralising" the notion of indirect reciprocity, such that you end up wanting to help everybody everywhere.[2] And I'm skeptical a randomly drawn AI will meet the same requirements for that to happen to them.
  1. ^

    "White-faced capuchin monkeys show triadic awareness in their choice of allies":

    "...contestants preferentially solicited prospective coalition partners that (1) were dominant to their opponents, and (2) had better social relationships (higher ratios of affiliative/cooperative interactions to agonistic interactions) with themselves than with their opponents."

    You can get allies by being nice, but not unless you're also dominant.

  2. ^

    For me, it's not primarily about human values. It's about altruistic values. Whatever anything cares about, I care about that in proportion to how much they care about it.

Thanks for that story!

I'm skeptical a randomly drawn AI will meet the same requirements for that to happen to them.

I think 100 % alignment with human values would be better than random values, but superintelligent AI would presumably be trained on human data, so it would be somewhat aligned with human values. I also wonder about the extent to which the values of the superintelligent AI could change, hopefully for the better (as human values have).

rime
I also have specific just-so stories for why human values have changed for "moral circle expansion" over time, and I'm not optimistic that process will continue indefinitely unless intervened on. Anyway, these are important questions!

From Nate Soares' post Cosmopolitan values don't come free:

Short version: if the future is filled with weird artificial and/or alien minds having their own sort of fun in weird ways that I might struggle to understand with my puny meat-brain, then I'd consider that a win. When I say that I expect AI to destroy everything we value, I'm not saying that the future is only bright if humans-in-particular are doing human-specific things. I'm saying that I expect AIs to make the future bleak and desolate, and lacking in fun or wonder of any sort[1].

18 Comments

The first couple of paragraphs made me think that you were considering the claim "SAI will be good by default", rather than "our goal should be to install good values in our SAI rather than to focus on aligning it with human values (which may be less good)". Is this a good reformulation of the question?

Thanks for the comment, Edo!

I guess the rise of humans has been good to the universe, but disempowered other animal species. So my intuition points towards SAI being good to the universe, even if it disempowers humans by default. However, I do not really know.

good to the universe

 

Good in what sense? I don't really buy the moral realist perspective - I don't see where this "real morality" could possibly come from. But on top of that, I think we all agree that disempowerment, genocide, slavery, etc. are bad; we also frown upon our own disempowerment of non-human animals. So there are two options:

 

  1. either the "real morality" is completely different from and alien to the morality we humans currently tend to aspire to as an ideal (at least in this corner of the world and of the internet), or
  2. a SAI that would disempower us despite being vastly smarter and more capable, and having no need to, would be pretty evil.

If it's 1, then I'm even more curious to know what this real morality is, why I should care, or why the SAI would understand it while we seem to be drifting further away from it. And if it's 2, then obviously unleashing an evil SAI on the universe is a horrible thing to do, not just for us, but for everyone else in our future light cone. Either way, I don't see a path to "maybe we should let SAI disempower us because it's the greater good". Any sufficiently nice SAI would understand well enough we don't want to be disempowered.

Thanks for commenting, dr_s!

Good in what sense?

In the sense of increasing expected total hedonistic utility, where hedonistic utility can be thought of as positive conscious experiences. For example, if universes A and B are identical in every respect except I am tortured 1 h more in universe A than in universe B, then universe A is worse than universe B (for reasonable interpretations of "I am tortured"). I do not see how one can argue against the badness (morally negative value) of torture when everything else stays the same. If it were not wrong to add torture while keeping everything else the same, then what would be wrong?
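
As a rough formalisation (the notation below is illustrative rather than taken from the comment), total hedonistic utilitarianism values an outcome as

$$U_{\text{total}} = \sum_{i=1}^{N} \int_{0}^{T_i} h_i(t)\,\mathrm{d}t,$$

where $h_i(t)$ is the hedonic level of being $i$ at time $t$ (positive for pleasure, negative for suffering) and $T_i$ is that being's lifespan. Universes A and B then differ only by the extra hour of torture, which adds a strictly negative term to universe A while leaving every other term unchanged, so $U_{\text{total}}(A) < U_{\text{total}}(B)$.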

I don't really buy the moral realist perspective - I don't see where this "real morality" could possibly come from.

I would say it comes from the Laws of Physics, like everything else. While I am being tortured, the particles and fields in my body are such that I have a bad conscious experience.

Either way, I don't see a path to "maybe we should let SAI disempower us because it's the greater good".

I think this may depend on the timeframe you have in mind. For example, I agree human extinction in 2024 due to advanced AI would be bad (but super unlikely), because it would be better to have more than 1 year to think about how to deploy a super powerful system which may take control of the universe. However, I think there are scenarios further in the future where human disempowerment may be good. For example, if humans in 2100 determined they wanted to maintain forever the energy utilization of humans and AIs below 2100 levels, and never let humans nor AIs leave Earth, I would be happy for advanced AIs to cause human extinction (ideally in a painless way) in order to get access to more energy to power positive conscious experiences of digital minds.

In the sense of increasing expected total hedonistic utility, where hedonistic utility can be thought of as positive conscious experiences.

 

I don't think total sum utilitarianism is a very sensible framework. I think it can work as a guideline within certain boundaries, but it breaks down as soon as you admit the potential for things like utility monsters, which ASI as you're describing it effectively is. Everyone only experiences one life, their own, regardless of how many other conscious entities are out there.

I would say it comes from the Laws of Physics, like everything else. While I am being tortured, the particles and fields in my body are such that I have a bad conscious experience.

That just kicks the metaphysical can down the road. Ok, suffering is physical. Who says that suffering is good or bad? Or that it is always good or bad? Who says that what's important is total rather than average utility, or some other more complex function? Who says how we can compare the utility of subjects A and B when their subjective qualia are incommensurate? None of these things can be answered by physics or really empirical observation of any kind alone, that we know of.

if humans in 2100 determined they wanted to maintain forever the energy utilization of humans and AIs below 2100 levels, and never let humans nor AIs leave Earth, I would be happy for advanced AIs to cause human extinction (ideally in a painless way) in order to get access to more energy to power positive conscious experiences of digital minds.

I disagree, personally. The idea that it's okay to kill some beings today to allow more to exist in the future does not seem good to me at all, for several reasons:

  1. at a first order level, because I don't subscribe to total sum utilitarianism, or rather I don't think it's applicable so wildly out of domain - otherwise you could equally justify e.g. genociding the population of an area that you believe can be "better utilized" to allow a different population to grow in it and use its resources fully. I hope we can agree that is in fact a bad thing; not merely worse than somehow allowing for coexistence, but just bad. We should not in fact do this kind of thing today, so I don't think AI should do it to us in the future;
  2. at a higher order, because if you allow such things as good you create terrible incentives in which basically everyone who thinks they can do better is justified in trying to kill anyone else.

So I think if your utility function returns these results as good, it's a sign your utility function is wrong; fix it. I personally think that the free choices of existing beings are supreme here; having a future is worth it insofar as present existing beings desire that there is a future. If humanity decided to go voluntarily extinct (assuming such a momentous decision could genuinely be taken by everyone in synchrony - bit of a stretch), I'd say they should be free to, without feeling bound by either the will of their now dead ancestors or the prospect of their still hypothetical descendants.

It's not that I can't imagine a situation in which, in a war between humans and ASI, I could think, from my present perspective, that the ASI was right, though I'd still hope such a war does not turn genocidal (and in fact any ASI that I agree with would be one that doesn't resort to genocide as long as it has another option). But if the situation you described happened, I'd side with the humans, and I'd definitely say that we shouldn't build the kind of ASI that wouldn't either. Any ASI that can arbitrarily decide "I don't like what these humans are doing, better to kill them all and start afresh" is in fact a ticking time bomb, a paperclipper that is only happy to suffer us to live as long as we also make paperclips.

Thanks for elaborating! As a meta point, as I see my comment above has been downvoted (-6 karma excluding my upvote), it would be helpful for me to understand:

  • How it could be improved.
  • Whether it would have been better for me not to have commented.

I want to contribute to a better world, and, if my comments are not helpful, I would like to improve them, or, if the time taken to make them helpful is too long, give more consideration to not commenting.

I don't think total sum utilitarianism is a very sensible framework. I think it can work as a guideline within certain boundaries, but it breaks down as soon as you admit the potential for things like utility monsters, which ASI as you're describing it effectively is. Everyone only experiences one life, their own, regardless of how many other conscious entities are out there.

I am not sure I understand your point. Are you suggesting we should not maximise impartial welfare because this principle might imply humans being a small fraction of the overall number of beings?

Who says that suffering is good or bad?

Whether suffering is good or bad depends on the situation, including the person assessing it.

Or that it is always good or bad?

Suffering is not always bad (or good). However, do you think adding suffering to the world while keeping everything else equal can be bad? For example, imagine you have 2 scenarios:

  • A: one lives a certain life which ends in an isolated place where one does not have contact with anyone.
  • B: one lives the same life as in A, but with 1 day of additional suffering (negative conscious experience) at the end. I set life A as ending in isolation, such that the additional suffering in B does not affect other people. In other words, I am trying to construct a scenario where the additional suffering does not have an indirect upside to other beings.

I think the vast majority of people would prefer A.

The idea that it's okay to kill some beings today to allow more to exist in the future does not seem good to me at all

The idea sounds bad to me too! The reason is that, in the real world, killing rarely brings about good outcomes. I am strongly against violence, and killing people.

you could equally justify e.g. genociding the population of an area that you believe can be "better utilized" to allow a different population to grow in it and use its resources fully

Genociding a population is almost always a bad idea, but I do not think one should reject it in all cases. Would you agree that killing a terrorist to prevent 1 billion human deaths would be good? If so, would you agree that killing N terrorists to prevent N^1000 billion human deaths would also be good? In my mind, if the upside is sufficiently large, killing a large number of people could be justified. You might get the sense I have a low bar for this upside from what I am saying, but I actually have quite a high bar for thinking that killing is good. I have commented that:

I am against violence to the point that I wonder whether it would be good to not only stop militarily supporting Ukraine, but also impose economic sanctions on it proportional to the deaths in the Russo-Ukrainian War. I guess supporting Ukrainian nonviolent civil resistance in the face of war might be better to minimise both nearterm and longterm war deaths globally, although I have barely thought about this. If you judge my views on this to be super wrong, please beware the horn effect before taking conclusions about other points I have made.

I very much agree it makes sense to be sceptical about arguments for killing lots of people in practice. For the AI extinction case, I would also be worried about the AI developers (humans or other AIs) pushing arguments in favour of causing human extinction instead of pursuing a better option.

at a higher order, because if you allow such things as good you create terrible incentives in which basically everyone who thinks they can do better is justified in trying to kill anyone else.

Total utilitarianism only says one should maximise welfare. It does not say killing weaker beings is a useful heuristic to maximise welfare. My own view is that killing weaker beings is a terrible heuristic to maximise welfare (e.g. it may favour factory-farming, which I think is pretty bad).

I am not sure I understand your point. Are you suggesting we should not maximise impartial welfare because this principle might imply humans being a small fraction of the overall number of beings?

 

I just don't think total sum utilitarianism maps well onto the kind of intuitions I'd like a functional moral system to match. I think ideally a good aggregation system for utility should not be vulnerable to being gamed via utility monsters. I lean more towards average utility as a good index, though that too has its flaws and I'm not entirely happy with it. I've written a (very tongue-in-cheek) post about it on Less Wrong.

Whether suffering is good or bad depends on the situation, including the person assessing it.

Sure. So that actually backs my point that it's all relative to sentient subjects. There is no fundamental "real morality", though there are real facts about the conscious experience of sentient beings. But trade-offs between these experiences aren't obvious and can't be settled empirically.

The idea sounds bad to me too! The reason is that, in the real world, killing rarely brings about good outcomes. I am strongly against violence, and killing people.

But more so, killing people violates their own very strong preference towards not being killed. That holds for an ASI too.

Genociding a population is almost always a bad idea, but I do not think one should reject it in all cases. Would you agree that killing a terrorist to prevent 1 billion human deaths would be good? If so, would you agree that killing N terrorists to prevent N^1000 billion human deaths would also be good?

I mean, ok, one can construct these hypothetical scenarios, but the one you suggested wasn't about preventing deaths, but ensuring the existence of more lives in the future. And those are very different things.

Total utilitarianism only says one should maximise welfare. It does not say killing weaker beings is a useful heuristic to maximise welfare. My own view is that killing weaker beings is a terrible heuristic to maximise welfare (e.g. it may favour factory-farming, which I think is pretty bad).

But obviously if you count future beings too - as you are - then it becomes inevitable that this approach does justify genocide. Take the very real example of the natives of the Americas. By this logic, the exact same logic that you used for an example of why an ASI could be justified in genociding us, the colonists were justified in genociding the natives. After all, they lived in far lower population densities than the land could support with advanced agricultural techniques, and they lived hunter-gatherer or at best bronze-age style lives, far less rich in pleasures and enjoyments than a modern one. So killing a few million of them to allow eventually for over 100 million modern Americans to make full use of the land would have been a good thing.

See the problem with the logic? As long as you have better technology and precommit to high population densities you can justify all sorts of brutal colonization efforts as a net good, if not maximal good. And that's a horrible broken logic. It's the same logic that the ASI that kills everyone on Earth just so it can colonize the galaxy would follow. If you think it's disgusting when applied to humans, well, the same standards ought to apply to ASI.

I just don't think total sum utilitarianism maps well onto the kind of intuitions I'd like a functional moral system to match. I think ideally a good aggregation system for utility should not be vulnerable to being gamed via utility monsters.

In practice, I think smaller beings will produce welfare more efficiently. For example, I guess bees produce it 5 k times as effectively as humans. To the extent that the most efficient way of producing welfare involves a single being or a few beings, they would have to consume lots and lots of energy across many, many galaxies. So it would be more like the universe itself being a single being or organism, which is arguably more compelling than what the term "utility monster" suggests.

I mean, ok, one can construct these hypothetical scenarios, but the one you suggested wasn't about preventing deaths, but ensuring the existence of more lives in the future. And those are very different things.

How about this scenario. There is a terrorist who is going to release a very infectious virus which will infect all humans on Earth. The virus makes people infertile forever, thus effectively leading to human extinction, but it also makes people fully lose their desire to have children, and have much better self-assessed lives. Would it make sense to kill the terrorist? Killing the terrorist would worsen the lives of all humans alive, but it would also prevent human extinction.

But obviously if you count future beings too - as you are - then it becomes inevitable that this approach does justify genocide.

Yes, it can justify genocide, although I am sceptical it would in practice. Genocide involves suffering, and suffering is bad, so I assume there would be a better option to maximise impartial welfare. For example, ASI could arguably persuade humans that their extinction was for the better, or just pass everyone to a simulation without anyone noticing, and then shut down the simulation in a way that no suffering is caused in the process.

See the problem with the logic? As long as you have better technology and precommit to high population densities you can justify all sorts of brutal colonization efforts as a net good, if not maximal good.

I agree you can justify the replacement of beings who produce welfare less efficiently by beings who produce welfare more efficiently. For example, replacing rocks by humans is fine, and so might be replacing humans by digital minds. However, the replacement process itself should maximise welfare, and I am very sceptical that "brutal colonization efforts" would be the most efficient way for ASI to perform the replacement.

I think smaller beings will produce welfare more efficiently. For example, I guess bees produce it 5 k times as effectively as humans.

 

I just don't think it makes any sense to have an aggregated total measure of "welfare". We can describe the distribution of welfare across the sentient beings of the universe, but to simply bunch it all up has essentially no meaning. In what way is a world with a billion very happy people any worse than a world with a trillion merely okay ones? I know which one I'd rather be born into! How can a world be worse for everyone individually yet somehow better, if the only meaning of welfare is that it is experienced by sentient beings to begin with?

There is a terrorist who is going to release a very infectious virus which will infect all humans on Earth. The virus makes people infertile forever, thus effectively leading to human extinction, but it also makes people fully lose their desire to have children, and have much better self-assessed lives. Would it make sense to kill the terrorist? Killing the terrorist would worsen the lives of all humans alive, but it would also prevent human extinction.

It's moral because the terrorist is infringing the wishes of those people right now, and violating their self-determination. If the people decided to infect themselves, then it would be ok.

Genocide involves suffering, and suffering is bad, so I assume there would be a better option to maximise impartial welfare. For example, ASI could arguably persuade humans that their extinction was for the better, or just pass everyone to a simulation without anyone noticing, and then shut down the simulation in a way that no suffering is caused in the process.

I disagree that the genocide is made permissible by making the death a sufficiently painless euthanasia. Sure, the suffering is an additional evil, but the killing is an evil unto itself. Honestly, consider where these arguments could lead in realistic situations and consider whether you would be okay with that, or if you feel like relying on a circumstantial "well but actually in reality this would always come out negative net utility due to the suffering" is protection enough. If you get conclusions like these from your ethical framework it's probably a good sign that it might have some flaws.

For example, replacing rocks by humans is fine, and so might be replacing humans by digital minds. However, the replacement process itself should maximise welfare, and I am very sceptical that "brutal colonization efforts" would be the most efficient way for ASI to perform the replacement.

Rocks aren't sentient, they don't count. And your logic still doesn't work. What if you can instantly vaporize everyone with a thermonuclear bomb, as they are all concentrated within the radius of the fireball? Death would then be instantaneous. Would that make it acceptable? Very much doubt it.

Thanks for elaborating further!

I just don't think it makes any sense to have an aggregated total measure of "welfare". We can describe the distribution of welfare across the sentient beings of the universe, but to simply bunch it all up has essentially no meaning.

I find it hard to understand this. I think 10 billion happy people is better than no people. I guess you disagree with this?

It's moral because the terrorist is infringing the wishes of those people right now, and violating their self-determination. If the people decided to infect themselves, then it would be ok.

I think respecting people's preferences is a great heuristic to do good. However, I still endorse hedonic utilitarianism rather than preference utilitarianism, because it is possible for someone to have preferences which are not ideal to maximise one's goals. (As an aside, Peter Singer used to be a preference utilitarian, but now is a hedonistic utilitarian.)

Sure, the suffering is an additional evil, but the killing is an evil unto itself.

No killing is necessary given an ASI. The preferences of humans could be modified such that everyone is happy with ASI taking over the universe. In addition, even if you think killing without suffering is bad in itself (and note ASI may even make the killing pleasant to humans), do you think that badness would outweigh an arbitrarily large happiness?

Rocks aren't sentient, they don't count.

I think rocks are sentient in the sense they have a non-null expected welfare range, but it does not matter because I have no idea how to make them happier.

What if you can instantly vaporize everyone with a thermonuclear bomb, as they are all concentrated within the radius of the fireball? Death would then be instantaneous. Would that make it acceptable? Very much doubt it.

No, it would not be acceptable. I am strongly against negative utilitarianism. Vaporising all beings without any suffering would prevent all future suffering, but it would also prevent all future happiness. I think the expected value of the future is positive, so I would rather not vaporise all beings.

I find it hard to understand this. I think 10 billion happy people is better than no people. I guess you disagree with this?

 

"No people" is a special case - even if one looks at e.g. average utilitarianism, that's a division by zero. I think a universe with no sentient beings in it does not have a well-defined moral value: moral value only exists with respect to sentients, so without any of them, the categories of "good" or "bad" stop even making sense. But obviously any path from a universe with sentients to a universe without implies extermination, which is bad.

However, given an arbitrary number of sentient beings of comparable happiness, I don't think the precise number matters to how good things are, no. No one experiences all that good at once, hence 10 billion happy people are as good as 10 million - if they are indeed just as happy.

because it is possible for someone to have preferences which are not ideal to maximise one's goals

I think any moral philosophy that leaves the door open to too much of "trust me, it's for your own good, even though it's not your preference you'll enjoy the outcome far more" is ripe for dangerous derailments.

No killing is necessary given an ASI. The preferences of humans could be modified such that everyone is happy with ASI taking over the universe. In addition, even if you think killing without suffering is bad in itself (and note ASI may even make the killing pleasant to humans), do you think that badness would outweigh an arbitrarily large happiness?

Yes, because I don't care if the ASI is very very happy, it still counts for one. I also don't think you can reasonably conceive of unbounded amounts of happiness felt by a single entity, so much as to compensate for all that suffering. Also try to describe to anyone "hey what if a supercomputer that wanted to take over the universe brainwashed you to be ok with it taking over the universe", see their horrified reaction, and consider whether it makes sense for any moral system to reach conclusions that are obviously so utterly, instinctively repugnant to almost everyone.

I think rocks are sentient in the sense they have a non-null expected welfare range, but it does not matter because I have no idea how to make them happier.

I'm... not even sure how to parse that. Do you think rocks have conscious experiences?

No, it would not be acceptable. I am strongly against negative utilitarianism. Vaporising all beings without any suffering would prevent all future suffering, but it would also prevent all future happiness. I think the expected value of the future is positive, so I would rather not vaporise all beings.

The idea was that the vaporization is required to free the land for a much more numerous and technologically advanced populace, who can then go on to live off its resources a much more leisurely life with less hard work, less child mortality, less disease etc. So you replace, say, 50,000 vaporised indigenous people living like hunter gatherers with 5 million colonists living like we do now in the first world (and I'm talking new people, children that can only be born thanks to the possibility of expanding in that space). Does that make the genocide any better? If not, why? And how do those same arguments not apply to the ASI too?

No one experiences all that good at once, hence 10 billion happy people are as good as 10 million - if they are indeed just as happy.

This is the crux. Some questions:

  • Would 10^100 happy people be just as good as 1 happy person (assuming everyone is just as happy individually)?
  • Would 10^100 people being tortured be just as bad as 1 person being tortured (assuming everyone is feeling just as bad individually)?
  • Would you agree that a happy life of 100 years is just as good as a happy life of 1 year (assuming the annual happiness is the same in both cases)? If not, why is a person the relevant unit of analysis, and not a person-year?

Would 10^100 happy people be just as good as 1 happy person (assuming everyone is just as happy individually)?

 

Hypothetically, yes, if we take it in a vacuum. I find the scenario unrealistic in any real-world circumstance, though, because obviously people's happiness tends to depend on having other people around, and also because any trajectory from the current situation that ended in there being only one person, happy or not, seems likely bad.

Would 10^100 people being tortured be just as bad as 1 person being tortured (assuming everyone is feeling just as bad individually)?

Pretty much the same reasoning applies. For "everyone has it equally good/bad" worlds, I don't think sheer numbers make a difference. What makes things more complicated is when inequality is involved.

Would you agree that a happy life of 100 years is just as good as a happy life of 1 year (assuming the annual happiness is the same in both cases)? If not, why is a person the relevant unit of analysis, and not a person-year?

I think length of life matters a lot; if I know I'll have just one year of life, my happiness is kind of tainted by the knowledge of imminent death, you know? We experience all of our life ourselves. For an edge scenario, there's one character in "Permutation City" (a Greg Egan novel) who is a mind upload and puts themselves into a sort of mental state loop; after a period T their mental state maps exactly to itself and repeats identically, forever. If you considered such a fantastical scenario, then I'd argue the precise length of the loop doesn't matter much.

I strongly believe 10^100 people being tortured is much much worse than 1 person being tortured (even assuming everyone is feeling just as bad individually). I do not know what more to say, but thanks for the chat!

I think the reason for those intuitions is that (reasonably enough!) we can't imagine there being 10^100 people without there also being a story behind that situation. A world in which e.g. some kind of entity breeds humans on purpose to then torture them, leading to those insane amounts, sounds indeed absolutely hellish! But the badness of it is due to the context; a world in which there exists only one person, and that person is being horribly tortured, is also extremely upsetting and sad, just in a different way and for different reasons (and all paths to there are also very disturbing; but we'll maybe think "at least everyone else just died without suffering as much" so it feels less horrible than the 10^100 humans torture world).

But my intuition on the situation alone is more along the lines of: imagine you know you're going to be born into this world. Would you like your odds? And in both the "one tortured human" and the "10^100 tortured humans" worlds, your odds would be exactly the same: 100% chance of being tortured.

But all of these are just abstract thought experiments. In any realistic situation, torture worlds don't just happen - there is a story leading to them, and for any kind of torture world, that story is godawful. So in practice the two things can't be separated. I think it's fairly correct to say that in all realistic scenarios the 10^100 world will be in practice worse, or have a worse past, though both worlds would be awful.

I think the reason for those intuitions is that (reasonably enough!) we can't imagine there being 10^100 people without there also being a story behind that situation.

At least for me, that does not matter. I would always pick a world where 1 person is tortured over a world where 10^100 are tortured (holding the amount of torture per person constant), regardless of the past history.

Here is another question. You are just at the beginning of the universe, so no past history, and you can either click one button which would create 10^100 people who would be tortured for 100 years, or another button which would create 1 person who would be tortured for 100 years. If you had to then pick one world to live in, as an individual, you would suffer the same in both worlds, as you would have a 100 % chance of being tortured for 100 years either way. So you could conclude that which button you chose does not really matter. However, I certainly think such a choice would matter! I care not only about my own welfare, but also that of others, so I would certainly pick the button leading to less total torture.

You said before that total utilitarianism is problematic because it can, at least in principle, lead one to endorse situations where a population is made extinct in order for its resources to be used more efficiently to produce welfare. However, average utilitarianism is way more problematic. It can, at least in principle, lead to a situation where the average welfare is kept constant, but we arbitrarily expand the amount of torture by increasing the number of beings. Even in practice this would be possible. For example, net global welfare accounting for animals may well be negative due to wild animal suffering (or even just farmed animal suffering; see this analysis), which means just replicating Earth's ecosystem on Earth-like planets across the universe may well be a way of expanding suffering (and average suffering per being can be kept roughly the same for the sake of a thought experiment). If net global welfare is indeed negative, I would consider this super bad!
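
A quick sketch of the arithmetic behind this worry (the notation is illustrative, not from the comment above): if each of $N$ beings has the same negative welfare $w < 0$, then

$$\bar{U} = \frac{1}{N}\sum_{i=1}^{N} w = w, \qquad U_{\text{total}} = \sum_{i=1}^{N} w = N w.$$

The average $\bar{U}$ is independent of $N$, so average utilitarianism is indifferent to replicating such a population, while the total $U_{\text{total}}$ becomes arbitrarily negative as $N$ grows.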

I don't know if it makes a lot of sense because yes, in theory from my viewpoint all "torture worlds" (N agents, all suffering the same amount of torture) are equivalent. I feel like that intuition is more right than just "more people = more torture". I would call them equally bad worlds, and if the torture is preternatural and inescapable I have no way of choosing between them. But I also feel like this is twisting ourselves into examples that are completely unrealistic, to the point of almost uselessness; it is no wonder that our theories of ethics break down, same as most physics does at a black hole singularity.

Paul Christiano discussed the value of unaligned AI on The 80,000 Hours Podcast (in October 2018). Pretty interesting!
