(I'm repeating something I said in another comment I wrote a few hours ago, but adapted to this post.)
On a basic level, I agree that we should take artificial sentience extremely seriously, and think carefully about the right type of laws to put in place to ensure that artificial life is able to happily flourish, rather than suffer. This includes enacting appropriate legal protections to ensure that sentient AIs are treated in ways that promote their well-being. Relying solely on voluntary codes of conduct to govern the treatment of potentially sentient AIs seems deeply inadequate, much like it would be for protecting children against abuse. Instead, I believe that establishing clear, enforceable laws is essential for ethically managing artificial sentience.
That said, I'm skeptical that a moratorium is the best policy.
From a classical utilitarian perspective, the imposition of a lengthy moratorium on the development of sentient AI seems like it would help to foster a more conservative global culture—one that is averse not only to creating sentient AI, but potentially also to other life-expanding ventures, such as space colonization. Classical utilitarianism is typically understood as aiming to maximize total well-being, which favors actions that enable the flourishing and expansion of happy, conscious life on as broad a scale as possible. However, implementing and sustaining a lengthy ban on AI would likely require substantial cultural and institutional shifts away from these permissive and ambitious values.
To enforce a moratorium of this nature, societies would likely adopt a framework centered on caution, restriction, and a deep-seated aversion to risk—values that would contrast sharply with those that encourage creating sentient life and proliferating it on as large a scale as possible. Maintaining a strict stance on AI development might lead governments, educational institutions, and media to promote narratives emphasizing the potential dangers of sentience and AI experimentation, instilling an atmosphere of risk-aversion rather than curiosity, openness, and progress. Over time, these narratives could lead to a culture less inclined to support or value efforts to expand sentient life.
Even if the ban is at some point lifted, there's no guarantee that the conservative attitudes generated under it would entirely disappear, or that all relevant restrictions on artificial life would completely go away. Instead, it seems more likely that many of these risk-averse attitudes would remain even after the ban is formally lifted, given the ban's long duration and the type of culture it would inculcate.
In my view, this type of cultural conservatism seems likely to, in the long run, undermine the core aims of classical utilitarianism. A shift toward a society that is fearful or resistant to creating new forms of life may restrict humanity’s potential to realize a future that is not only technologically advanced but also rich in conscious, joyful beings. If we accept the idea of 'value lock-in'—the notion that the values and institutions we establish now may set a trajectory that lasts for billions of years—then cultivating a culture that emphasizes restriction and caution may have long-term effects that are difficult to reverse. Such a locked-in value system could close off paths to outcomes that are aligned with maximizing the proliferation of happy, meaningful lives.
Thus, if a moratorium on sentient AI were to shape society's cultural values in a way that leans toward caution and restriction, I think the enduring impact would likely contradict classical utilitarianism's ultimate goal: the maximal promotion and flourishing of sentient life. Rather than advancing a world with greater life, joy, and meaningful experiences, these shifts might result in a more closed-off, limited society, actively impeding efforts to create a future rich with diverse and conscious life forms.
(Note that I have talked mainly about these concerns from a classical utilitarian point of view. However, I concede that a negative utilitarian or antinatalist would find it much easier to rationally justify a long moratorium on AI.
It is also important to note that my conclusion holds even if one does not accept the idea of a 'value lock-in'. In that case, longtermists should likely focus on the near-term impacts of their decisions, as the long-term impacts of their actions may be impossible to predict. And I'd argue that a moratorium would likely have a variety of harmful near-term effects.)
Given your statement that "a 50-year delay in order to make this monumentally important choice properly would seem to be a wise and patient decision by humanity", I'm curious whether you have any thoughts on the comment I just wrote, particularly the part arguing against a long moratorium on creating sentient AI, and how this can be perceived from a classical utilitarian perspective.
On a basic level, I agree that we should take artificial sentience extremely seriously, and think carefully about the right type of laws to put in place to ensure that artificial life is able to happily flourish, rather than suffer. This includes enacting appropriate legal protections to ensure that sentient AIs are treated in ways that promote their well-being. Relying solely on voluntary codes of conduct to govern the treatment of potentially sentient AIs seems deeply inadequate, much like it would be for protecting children against abuse. Instead, I believe that establishing clear, enforceable laws is essential for ethically managing artificial sentience.
However, it currently seems likely to me that sufficiently advanced AIs will be sentient by default. And if advanced AIs are sentient by default, then instituting a temporary ban on sentient AI development, say for 50 years, would likely be functionally equivalent to pausing the entire field of advanced AI for that period.
Therefore, despite my strong views on AI sentience, I am skeptical about the idea of imposing a moratorium on creating sentient AIs, especially in light of my general support for advancing AI capabilities.
The idea that sufficiently advanced AIs will likely be sentient by default can be justified by three basic arguments:
My skepticism of a general AI moratorium contrasts with the views of (perhaps) most EAs, who appear to favor such a ban, both for AI safety reasons and to protect AIs themselves (as you argue here). I'm instead inclined to highlight the enormous costs of such a ban, compared to a variety of cheaper alternatives, such as targeted regulation that merely ensures AIs are strongly protected against abuse. These costs appear to include:
Moreover, from a classical utilitarian perspective, the imposition of a 50-year moratorium on the development of sentient AI seems like it would help to foster a more conservative global culture—one that is averse not only to creating sentient AI, but potentially also to other life-expanding ventures, such as space colonization. Classical utilitarianism is typically understood as aiming to maximize total well-being, which favors actions that enable the flourishing and expansion of happy, conscious life on as broad a scale as possible. However, implementing and sustaining a lengthy ban on AI would likely require substantial cultural and institutional shifts away from these permissive and ambitious values.
To enforce a moratorium of this nature, societies would likely adopt a framework centered on caution, restriction, and a deep-seated aversion to risk—values that would contrast sharply with those that encourage creating sentient life and proliferating it on as large a scale as possible. Maintaining a strict stance on AI development might lead governments, educational institutions, and media to promote narratives emphasizing the potential dangers of sentience and AI experimentation, instilling an atmosphere of risk-aversion rather than curiosity, openness, and progress. Over time, these narratives could lead to a culture less inclined to support or value efforts to expand sentient life.
Even if the ban is at some point lifted, there's no guarantee that the conservative attitudes generated under it would entirely disappear, or that all relevant restrictions on artificial life would completely go away. Instead, it seems more likely that many of these risk-averse attitudes would remain even after the ban is formally lifted, given the ban's long duration and the type of culture it would inculcate.
In my view, this type of cultural conservatism seems likely to, in the long run, undermine the core aims of classical utilitarianism. A shift toward a society that is fearful or resistant to creating new forms of life may restrict humanity’s potential to realize a future that is not only technologically advanced but also rich in conscious, joyful beings. If we accept the idea of 'value lock-in'—the notion that the values and institutions we establish now may set a trajectory that lasts for billions of years—then cultivating a culture that emphasizes restriction and caution may have long-term effects that are difficult to reverse. Such a locked-in value system could close off paths to outcomes that are aligned with maximizing the proliferation of happy, meaningful lives.
Thus, if a moratorium on sentient AI were to shape society's cultural values in a way that leans toward caution and restriction, I think the enduring impact would likely contradict classical utilitarianism's ultimate goal: the maximal promotion and flourishing of sentient life. Rather than advancing a world with greater life, joy, and meaningful experiences, these shifts might result in a more closed-off, limited society, actively impeding efforts to create a future rich with diverse and conscious life forms.
(Note that I have talked mainly about these concerns from a classical utilitarian point of view, and a person-affecting point of view. However, I concede that a negative utilitarian or antinatalist would find it much easier to rationally justify a long moratorium on AI.
It is also important to note that my conclusion holds even if one does not accept the idea of a 'value lock-in'. In that case, longtermists should likely focus on the near-term impacts of their decisions, as the long-term impacts of their actions may be impossible to predict. And my main argument here is that the near-term impacts of such a moratorium are likely to be harmful in a variety of ways.)
Humans don't like shocks. Explosive growth would definitely be a shock. We tend to like very gradual changes, or brief flirts with big change.
Speaking generally, it is true that humans are frequently hesitant to change the status quo, and economic shocks can be quite scary to people. This provides one reason to think that people will try to stop explosive growth, and slow down the rate of change.
On the other hand, it's important to recognize the individual incentives involved here. On an individual, personal level, explosive growth is equivalent to a dramatic rise in real income over a short period of time. Suppose you were given the choice of increasing your current income by several-fold over the next few years. For example, if your real income is currently $100,000/year, then you would see it increase to $300,000/year in two years. Would you push back against this change? Would this rise in your personal income be too fast for your tastes? Would you try to slow it down?
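To make the arithmetic in the hypothetical above concrete, here is a minimal sketch of the annual growth rate such a jump would imply, assuming smooth compounding; the income figures are the illustrative numbers from the paragraph above, and the 1-2% historical benchmark is my own rough ballpark rather than a figure from the comment.

```python
# Rough sketch: what constant annual growth rate takes a $100,000/year real
# income to $300,000/year in two years? (Illustrative numbers, not data.)

start_income = 100_000  # hypothetical current real income, $/year
end_income = 300_000    # hypothetical real income two years later, $/year
years = 2

# Implied annual growth rate under smooth compounding: (end/start)^(1/years) - 1
annual_growth = (end_income / start_income) ** (1 / years) - 1
print(f"Implied annual real income growth: {annual_growth:.1%}")  # ~73.2%

# For comparison, long-run per-capita real income growth in rich countries has
# historically been on the order of 1-2% per year (rough benchmark).
historical_growth = 0.02
print(f"Ratio to a ~2%/year historical norm: {annual_growth / historical_growth:.0f}x")  # ~37x
```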
Even if explosive growth is dramatic and scary on a collective and abstract level, it is not clearly bad on an individual level. Indeed, it seems quite clear to me that most people would be perfectly happy to see their incomes rise dramatically, even at a rate that far exceeded historical norms, unless they recognized a substantial and grave risk that would accompany this rise in their personal income.
If we assume that people collectively follow what is in each of their individual interests, then we should conclude that incentives are pretty strongly in favor of explosive growth (at least when done with low risk), despite the fact that this change would be dramatic and large.
In general, to me it seems quite fruitful to examine in more detail whether, in fact, multipolarity of various kinds might alleviate concerns about value fragility. And to those who have the intuition that it would (especially in cases, like Multipolar value fragility, where agent A’s exact values aren’t had by any of agents 1-n), I’d be curious to hear the case spelled out in more detail.
Here's a case that I roughly believe: multipolarity means there's a higher likelihood that one's own values will be represented, because it gives each agent the opportunity to literally live in the world and act to bring about the outcomes they personally want.
This case is simple enough, and it's consistent with the ordinary multipolarity the world already experiences. Consider an entirely selfish person. Now, divide the world into two groups: the selfish person (which we call Group A) and the rest of the world (which we call Group B).
Group A and Group B have very different values, even "upon reflection". Group B is also millions or billions of times more powerful than Group A (as it comprises the entire world minus the selfish individual). Therefore, on a naive analysis, you might expect Group B to "take over the world" and then implement its values without any regard whatsoever for Group A. Indeed, because of the vast power differential, it would be "very easy" for Group B to achieve this world takeover. And such an outcome would indeed be very bad according to Group A's values.
Of course, this naive analysis is flawed, because the real world is multipolar in an important respect: usually, Group B will let Group A (the individual) have some autonomy, and let them receive a tiny fraction of the world's resources, rather than murdering Group A and taking all their stuff. They will do this because of laws, moral norms, and respect for their fellow humans. This multipolarity therefore sidesteps all the issues with value fragility, and allows Group A to achieve a pretty good outcome according to their values.
This is also my primary hope with misaligned AI. Even if misaligned AIs are collectively millions or billions of times more powerful than humans (or aligned AIs), I would hope they would still allow the humans or aligned AIs to have some autonomy, leave us alone, and let us receive a sufficient fraction of resources that we can enjoy an OK outcome, according to our values.
I would go even further than the position argued in this paper. This paper focuses on whether we should give agentic AIs certain legal rights (right to make contracts, hold property, and bring tort claims), but I also think as an empirical matter, we probably will do so. I have two main justifications for my position here:
Beyond the basic question of whether AIs should or will receive basic legal rights in the future, there are important remaining questions about how post-AGI law should be structured. For example:
I believe these questions, among others, deserve more attention among those interested in AGI governance.
I think I basically agree with you, and I am definitely not saying we should just shrug. We should instead try to shape the future positively, as best we can. However, I still feel like I'm not quite getting my point across. Here's one more attempt to explain what I mean.
Imagine that we developed a technology enabling us to build physical robots that were functionally identical to humans in every relevant sense, including their observable behavior and their ability to experience happiness and pain in exactly the same way that ordinary humans do. However, there is just one difference between these humanoid robots and biological humans: they are made of silicon rather than carbon, and they look robotic rather than biological.
In this scenario, it would certainly feel strange to me if someone were to suggest that we should be worried about a peaceful robot takeover, in which the humanoid robots collectively accumulate the vast majority of wealth in the world via lawful means.
By assumption, these humanoid robots are literally functionally identical to ordinary humans. As a result, I think we should have no intrinsic reason to disprefer them receiving a dominant share of the world's wealth, versus some other subset of human-like beings. This remains true even if the humanoid robots are literally "not human", and thus their peaceful takeover is equivalent to "human disempowerment" in a technical sense.
The ultimate reason why I think one should not worry about a peaceful robot takeover in this specific scenario is that these humanoid robots have essentially the same moral worth and right to choose as ordinary humans, and therefore we should respect their agency and autonomy just as much as we already do for ordinary humans. Since we normally let humans accumulate wealth and become powerful via lawful means, I think we should allow these humanoid robots to do the same. I hope you would agree with me here.
Now, generalizing slightly, I claim that to be rationally worried about a peaceful robot takeover in general, you should usually be able to identify a relevant moral difference between the scenario I have just outlined and the scenario that you're worried about. Here are some candidate moral differences that I personally don't find very compelling:
Say you're worried about any take-over-the-world actions, violent or not -- in which case this argument about the advantages of non-violent takeover is of scant comfort;
This is reasonable under the premise that you're worried about any AI takeover, whether violent or peaceful. But speaking personally, peaceful takeover scenarios, in which AIs accumulate power not by cheating us or killing us via nanobots but by lawfully beating humans fair and square and acquiring almost all the wealth over time, seem to me much better than violent takeovers, and not very bad in themselves.
I admit the moral intuition here is not necessarily obvious. I concede that there are plausible scenarios in which AIs are completely peaceful and act within reasonable legal constraints, and yet the future ends up ~worthless. Perhaps the most obvious scenario is the "Disneyland without children" scenario where the AIs go on to create an intergalactic civilization, but in which no one (except perhaps the irrelevant humans still on Earth) is sentient.
But when I try to visualize the most likely futures, I don't tend to visualize a sea of non-sentient optimizers tiling the galaxies. Instead, I tend to imagine a transition from sentient biological life to sentient artificial life, which continues to be every bit as cognitively rich, vibrant, and sophisticated as our current world—indeed, it could be even more so, given what becomes possible at a higher technological and population level.
Worrying about non-violent takeover scenarios often seems to me to arise simply from discrimination against non-biological forms of life, or perhaps a more general fear of rapid technological change, rather than naturally falling out as a consequence of more robust moral intuitions.
Let me put it another way.
It is often conceded that it was good for humans to take over the world. Speaking broadly, we think this was good because we identify with humans and their aims. We belong to the "human" category of course; but more importantly, we think of ourselves as being part of what might be called the "human tribe", and therefore we sympathize with the pursuits and aims of the human species as a whole. But equally, we could identify as part of the "sapient tribe", which would include non-biological life as well as humans, and thus we could sympathize with the pursuits of AIs, whatever those may be. Under this framing, what reason is there to care much about a non-violent, peaceful AI takeover?
To be clear, I wasn't arguing against generic restrictions on advanced AIs. In fact, I advocated for restrictions, in the form of legal protections for AIs against abuse and suffering. In my comment, I was solely arguing against a lengthy moratorium, rather than against more general legal rules and regulations.
Given my argument, I'd go further than saying that the relevant restrictions I was arguing against would "likely delay technological progress". They almost certainly would have that effect, since I was talking about a blanket moratorium, rather than more targeted or specific rules governing the development of AI (which I support).
A major reason why I didn't give this argument was that I had already conceded that we should have legal protections against the mistreatment of artificial sentience. The relevant comparison is not between a scenario with no restrictions on mistreatment and a scenario with restrictions that prevent AI mistreatment, but rather between the moratorium discussed in the post and more narrowly scoped regulations that specifically protect AIs from mistreatment.
Let me put this another way. Let's say we were to impose a moratorium on advanced AI, for the reasons given in this post. The idea here is presumably that, during the moratorium, society will deliberate on what we should do with advanced AI. After this deliberation concludes, society will end the moratorium, and then implement whatever we decided on.
What types of things might we decide to do, while deliberating? A good guess is that, upon the conclusion of the moratorium, we could decide to implement strong legal protections against AI mistreatment. In that case, the result of the moratorium appears identical to the legal outcome that I had already advocated, except with one major difference: with the moratorium, we'd have spent a long time with no advanced AI.
It could well be the case that, from a utilitarian point of view, having no advanced AI for, say, 50 years is better than building it, because AIs might suffer on balance more than they are happy, even with strong legal protections. But if that is the case, the correct conclusion to draw is that we should never build advanced AI, not that we should spend 50 years deliberating. Since I didn't think this was the argument being presented, I didn't spend much time arguing against the premise supporting this conclusion.
Instead, I wanted to focus on the costs of delay and deliberation, which I think are quite massive and often overlooked. Given these costs, if the end result of the moratorium is that we merely end up with the same sorts of policies that we could have achieved without the delay, the moratorium seems flatly unjustified. If the result of the moratorium is that we end up with even worse policies, as a result of the cultural effects I talked about, then the moratorium is even less justified.