The argument I’m going to make here is obviously not original to me, but I do think that while AI governance gets discussed, and is one of the career paths that 80,000 Hours recommends, it is not sufficiently highlighted as an important cause area. Further, I think the specific issue I’m pointing to is even more neglected than AI governance generally.

 

Suppose we create a superhuman AI that does what we tell it to.

This creates the possibility of immediately transforming into a post-scarcity economy. Nobody needs to work. Everybody gets as much as they want of the goods that used to be scarce because human labor was limited.

This would be amazing!

Each of us could own a yacht, or two. The limit on how many yachts we can have is space on the water, not how expensive the boat is. We can own as many clothes as we want. If anyone wants an authentic 18th-century costume, it can be hand-sewn immediately.

If anyone gets sick, they will have access to a flawless surgeon using the most advanced medical knowledge. They also won’t need to wait for an appointment to talk to a doctor. Every single child will have access to private tutors who know as much as all the tutors any child has ever had, put together.

An actual superintelligence could give us delicious zero-suffering meat and probably a malaria vaccine, and it could eliminate all, or nearly all, childhood diseases.

AI could create a world where nobody needs to desperately work and hustle just to get by, and where no one ever needs to worry about whether there will be food and shelter.

Etc, etc, etc.

There is a reason that techno-optimist transhumanists think friendly AI can create an amazingly good world. 

 

So, let's assume for a moment that DeepMind, OpenAI, or a black project run by the Chinese government successfully creates a superhuman intelligence that does what they ask and will not betray them.

Does this super awesome world actually come into existence?


 

Reasons to worry (definitely not exhaustive!)


 

  • Power corrupts
    • Capabilities researchers have been reading about how awesome singletons are.
      • Even worse, some of the suits might have read about it too. Sergey and Elon definitely have.
  • The leftist critique: a small number of powerful people making decisions about the fate of people who have no say in them. They will treat our collective human resources as their own private toys.
    • I take this issue very seriously.
    • A good world is a world where everyone has control over their own fate, and is no longer at the mercy of impersonal forces that they can neither understand nor manipulate. 
    • Further, a good world is one in which people in difficult, impoverished, and non-normative circumstances are able to make the choices that make their lives go well, as they see it.
  • The Nationalism problem
    • Suppose AI developed in the US successfully stays under democratic control, and it is used purely to aggrandize the wealth and well-being of Americans by locking in America’s dominance of all resources in the solar system and the light cone, forever.
      • Poor Malawians are still second or third class citizens on earth, and are still only receiving the drips of charity from those who silently consider themselves their betters.
      • We could have fixed poverty forever instead.
    • Suppose AI is developed in China 
      • They establish a regime with communist principles and social control over communication everywhere on the planet. This regime keeps everyone, everywhere, forever parroting communist slogans.
    • Worse: Suppose they don’t give any charity to the poor? There is precedent for dominant groups to simply treat the poor around them as parasites or work animals. Perhaps whoever controls the AI will starve or directly kill all other humans.


 

Summary: Misaligned people controlling the AI would be bad.

 

This issue is connected to the principal-agent problem (of which aligning AI itself is an example).

  • How do we make sure that the people or institutions in power act with everyone's best interests in mind?
  • What systems can we put in place to hold the people/institutions in power accountable to the promises they make?
  • How do we shape the incentive landscape in such a way that the people/institutions in power act to maximise wellbeing (while not creating adverse effects)?
  • How do we make sure we give power only to the actors who have a sufficient understanding of the most pressing issues and are committed to tackling them?


 

So what can we do about this? A few approaches that I have heard about:


 

  • Limiting the financial upside of developing AI to a finite quantity that is small relative to the output of a Dyson swarm (a toy sketch of how such a cap might work follows this list).
    • Windfall clauses
    • Profit-capping arrangements like the one I think OpenAI has
    • Ad hoc, after-the-fact government taxes and seizures
  • The Moon Treaty, and giving the whole global community collective ownership of outer space resources
  • Make sure that if AI is developed, it comes only out of a limited number of highly regulated, government-controlled entities, where part of the regulatory framework ensures a broad distribution of the benefits to at least the citizens of the country where it was built. This centralization might also have substantial safety benefits.
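To make the first option more concrete, here is a minimal sketch of how a windfall-clause-style cap might be computed. The thresholds, percentages, and the GWP figure are illustrative assumptions of mine, not the terms of any actual proposal or of OpenAI's arrangement; the point is only that a marginal schedule leaves ordinary profits untouched while redistributing most of any profit that approaches a meaningful share of the world economy.

```python
# Toy sketch of a windfall-clause-style commitment (illustrative numbers only).
# Assumption: a firm pledges to donate a marginal share of annual profits above
# thresholds defined as fractions of gross world product (GWP).

GWP = 100e12  # rough gross world product in dollars, order-of-magnitude assumption

# (threshold as fraction of GWP, marginal share donated above that threshold)
BRACKETS = [
    (0.001, 0.01),  # above 0.1% of GWP, donate 1% of the excess
    (0.01,  0.20),  # above 1% of GWP, donate a further 20%
    (0.10,  0.50),  # above 10% of GWP, donate a further 50%
]

def windfall_obligation(profit: float) -> float:
    """Return the donation owed on `profit` under the toy marginal schedule."""
    owed = 0.0
    for i, (frac, rate) in enumerate(BRACKETS):
        lower = frac * GWP
        upper = BRACKETS[i + 1][0] * GWP if i + 1 < len(BRACKETS) else float("inf")
        if profit > lower:
            owed += (min(profit, upper) - lower) * rate
    return owed

if __name__ == "__main__":
    for profit in (50e9, 2e12, 20e12):
        print(f"profit ${profit:,.0f} -> owed ${windfall_obligation(profit):,.0f}")
```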

 


 

The problem with any approach that tries to control AI after it is developed is that we cannot trust the legal system to constrain the behavior of someone who controls a singleton following a fast-takeoff scenario. There need to be safeguards embedded in these companies that are capable of physically forcing the group that built the AI to do what they promised to do with it, and these safeguards need to be built into the structure of how any AI that might develop into a singleton is trained and built.

This should be part of the AI safety regulatory framework, and might be used as part of what convinces the broader public that AI safety regulation is necessary in the first place (it would actually be bad, even if you are a libertarian, if AI is just used to satisfy the desires of rich people). 

All of this only becomes a problem if we actually solve the general alignment problem of creating a system that does what its developers want it to do. Your estimate of p(AGI doom) will drive whether you think this is worth working on.

This is also an effective and possibly tractable place to focus on systemic change. A world system that ensures everyone gets a sufficient share of global resources to meet their needs after full automation will likely require major legal and institutional changes, possibly of the same magnitude as a switch to communism or anarcho-capitalism would require.

The value of improving a post-aligned-AI future is multiplied by the probability that we actually reach that future. So if you think the odds are one in a million that AI is safely developed, the expected value of efforts in this direction is far lower than if you believe the odds of AI killing us all are one in a million.
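As a toy illustration of that multiplication (with made-up numbers), suppose better governance of an aligned-AI future is worth some fixed amount V; only the probability of ever reaching that future changes between the two worldviews:

```python
# Toy expected-value comparison with made-up numbers.
# V is the (arbitrary) value gained by better governance of an aligned-AI future.
V = 1.0

p_safe_pessimist = 1e-6      # you think alignment succeeds one in a million times
p_safe_optimist = 1 - 1e-6   # you think AI kills us all one in a million times

print("EV of governance work, pessimist about alignment:", p_safe_pessimist * V)
print("EV of governance work, optimist about alignment: ", p_safe_optimist * V)
# The work is identical in both cases; the probability of reaching the
# post-alignment world scales its payoff by roughly a factor of a million.
```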

But if we meet that (possibly unlikely) bar of not dying from AI, there will still be more work to be done to create utopia.

I'd like to thank Milan, Laszlo, Marta, Gergo, Richard and David for their comments on the draft text of this essay.

 


 
