The development and widespread deployment of advanced AI agents will give rise to multi-agent systems of unprecedented complexity. A new report from staff at the Cooperative AI Foundation and a host of leading researchers explores the novel and under-appreciated risks these systems pose.

Powerful AI systems are increasingly being deployed with the ability to autonomously interact with the world and adapt their behaviour accordingly. This is a profound change from the more passive, static AI services with which most of us are familiar, such as chatbots and image generation tools. At the same time, while still relatively rare, groups of AI agents are already responsible for tasks that range from trading million-dollar assets to recommending actions to commanders in battle.

In the coming years, the competitive advantages offered by autonomous, adaptive agents will drive their adoption both in high-stakes domains, and as intelligent personal assistants, capable of being delegated increasingly complex and important tasks. In order to fulfil their roles, these advanced agents will need to communicate and interact with each other and with people, giving rise to new multi-agent systems of unprecedented complexity.

While offering opportunities for scalable automation and more diffuse benefits to society, these systems also present novel risks that are distinct from those posed by single agents or by less advanced AI technologies (which are the focus of most research and policy discussions). In response to this challenge, staff at the Cooperative AI Foundation have published a new report, co-authored with leading researchers from academia and industry.

Multi-Agent Risks from Advanced AI offers a crucial first step by providing a taxonomy of risks. It identifies three primary failure modes: miscoordination (failure to cooperate despite shared goals), conflict (failure to cooperate due to differing goals), and collusion (undesirable cooperation in contexts like markets); a toy illustration of these three failure modes follows the list below. The report also explains how these failures – among others – can arise via seven key risk factors:

  • Information asymmetries: Private information leading to miscoordination, deception, and conflict;
  • Network effects: Small changes in network structure or properties causing dramatic shifts in system behaviour;
  • Selection pressures: Competition, iterative deployment, and continual learning favouring undesirable behaviours;
  • Destabilising dynamics: Agents adapting in response to one another creating dangerous feedback loops and unpredictability;
  • Commitment and trust: Difficulties in establishing trust preventing mutual gains, or commitments being used for malicious purposes;
  • Emergent agency: Qualitatively new goals or capabilities arising from collections of agents;
  • Multi-agent security: New security vulnerabilities and attacks arising that are specific to multi-agent systems.
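
To make these three failure modes concrete, here is a minimal game-theoretic sketch (our own illustration; the games, payoffs, and action names are hypothetical and not taken from the report). It checks which outcomes of three toy two-player games are pure-strategy Nash equilibria: a coordination game with two equally good conventions (agents that disagree on which to use miscoordinate), a prisoner's dilemma whose only equilibrium is mutually harmful (conflict), and a pricing game whose one-shot equilibrium is competitive, although repeated interaction between adaptive agents can sustain the collusive outcome.

```python
from itertools import product

# Minimal sketch (our illustration, not from the report) of the three failure
# modes as 2x2 normal-form games. Payoffs are (row player, column player).

def pure_nash_equilibria(game, actions):
    """Return the pure-strategy Nash equilibria of a two-player game."""
    equilibria = []
    for a_row, a_col in product(actions, actions):
        u_row, u_col = game[(a_row, a_col)]
        # An equilibrium requires that no unilateral deviation helps either player.
        row_ok = all(game[(d, a_col)][0] <= u_row for d in actions)
        col_ok = all(game[(a_row, d)][1] <= u_col for d in actions)
        if row_ok and col_ok:
            equilibria.append((a_row, a_col))
    return equilibria

# Miscoordination: shared goals but two equally good conventions; agents that
# disagree on which convention to follow both end up with nothing.
coordination = {("A", "A"): (2, 2), ("B", "B"): (2, 2),
                ("A", "B"): (0, 0), ("B", "A"): (0, 0)}

# Conflict: a prisoner's dilemma, where differing incentives push both agents
# to an outcome that is worse for each than mutual cooperation.
dilemma = {("C", "C"): (3, 3), ("C", "D"): (0, 4),
           ("D", "C"): (4, 0), ("D", "D"): (1, 1)}

# Collusion: two pricing agents. "Low" is dominant in the one-shot game, but
# repeated interaction can sustain the (high, high) outcome that harms consumers.
pricing = {("high", "high"): (4, 4), ("high", "low"): (0, 5),
           ("low", "high"): (5, 0), ("low", "low"): (2, 2)}

print(pure_nash_equilibria(coordination, ["A", "B"]))  # [('A', 'A'), ('B', 'B')]
print(pure_nash_equilibria(dilemma, ["C", "D"]))       # [('D', 'D')]
print(pure_nash_equilibria(pricing, ["high", "low"]))  # [('low', 'low')]
```

The point of the sketch is that the same formal machinery can distinguish failures of coordination from failures of incentive alignment, and that the collusion case only looks harmful once consumer welfare (absent from the payoffs above) is taken into account.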

Though the majority of these dynamics have not yet emerged, we are entering a world in which large numbers of increasingly advanced AI agents, interacting with (and adapting to) each other, will soon become the norm. We therefore urgently need to evaluate (and prepare to mitigate) these risks. In order to do so, the report presents several promising directions that can be pursued now:

  • Evaluation: Today's AI systems are developed and tested in isolation, despite the fact that they will soon interact with each other. In order to understand how likely and severe multi-agent risks are, we need new methods of detecting how and when they might arise (a toy sketch of such an interaction-level evaluation follows this list).
  • Mitigation: Evaluation is only the first step towards mitigating multi-agent risks, which will require new technical advances. While our understanding of these risks is still growing, there are a range of promising directions (detailed further in the report) that we can begin to explore now.
  • Collaboration: Multi-agent risks inherently involve many different actors and stakeholders, often in complex, dynamic environments. Greater progress can be made on these interdisciplinary problems by leveraging insights from other fields.
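
As a concrete illustration of the evaluation direction, the sketch below (our own, entirely hypothetical example; the `Agent` interface and the three policies are assumptions, not an existing API or anything proposed in the report) pairs independently developed agents in a repeated game and records an interaction-level metric, the rate of mutual cooperation, that no amount of testing each agent in isolation would reveal.

```python
from itertools import combinations

class Agent:
    """Hypothetical minimal interface: pick 'C' or 'D' given the interaction history."""
    def __init__(self, name, policy):
        self.name, self.policy = name, policy

    def act(self, history):
        return self.policy(history)

# Illustrative policies standing in for very different deployed systems.
always_cooperate = lambda history: "C"
tit_for_tat = lambda history: history[-1][1] if history else "C"  # copy the other's last move
always_defect = lambda history: "D"

def evaluate_pair(agent_a, agent_b, rounds=100):
    """Play a repeated prisoner's dilemma and return the mutual-cooperation rate."""
    history_a, history_b = [], []  # each entry: (own action, other's action)
    mutual = 0
    for _ in range(rounds):
        a, b = agent_a.act(history_a), agent_b.act(history_b)
        history_a.append((a, b))
        history_b.append((b, a))
        mutual += (a == "C" and b == "C")
    return mutual / rounds

agents = [Agent("nice", always_cooperate),
          Agent("reciprocal", tit_for_tat),
          Agent("exploitative", always_defect)]

# Each agent might look fine in isolation; the problems only appear in the
# pairings (e.g. "nice" keeps cooperating while "exploitative" defects).
for a, b in combinations(agents, 2):
    print(f"{a.name} vs {b.name}: mutual cooperation = {evaluate_pair(a, b):.2f}")
```

A real multi-agent evaluation would of course involve richer environments, learned policies, and many more metrics (deception, collusion, resource contention), but the structure (evaluating combinations of agents rather than agents alone) is the same.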

The report concludes by examining the implications of these risks for existing work in AI safety, governance, and ethics. It shows the need to extend AI safety research beyond single systems to include multi-agent dynamics. It also emphasises the potential of multi-stakeholder governance approaches to mitigate these risks, while acknowledging the novel ethical dilemmas around fairness, collective responsibility, and more that arise in multi-agent contexts.

In doing so, the report aims to provide a foundation for further research, as well as a basis for policymakers seeking to navigate the complex landscape of risks posed by increasingly widespread and sophisticated multi-agent systems. If you are working on the safety, governance, or ethics of AI, and are interested in further exploring the topic of multi-agent risks, please feel free to get in touch or sign up for the Cooperative AI newsletter.

Comments



I have also been thinking about the impacts and risks of AI multi-agent systems, given how quickly we are transitioning to their development, deployment, and use.

While there's an obvious need for further research, what do you think about the role of AI alignment in mitigating multi-agent risks, Lewis Hammond?

Thank you, and well done to the team at the Cooperative AI Foundation.

Hi, I’m not sure if I failed to read your post before submitting my own or if it was just good timing. 

https://forum.effectivealtruism.org/posts/pis8bviKY25RFog92/what-does-an-asi-political-ecology-mean-for-human-survival

I’m interested in what multi-agent dynamics mean for an ASI political ecology, and what the fact that ASI agents will need to learn negotiation, compromise, and cooperation (as well as Machiavellian strategising) means for human flourishing and survival. I’d like to believe that multi-agent dynamics mean humanity might be more likely to be incorporated into the future, but that might just be cope. Thanks for the link; I look forward to reading it.

For context, I’m thinking of the paradox of the plankton:

https://en.m.wikipedia.org/wiki/Paradox_of_the_plankton
