Abstract
Despite much progress in training AI systems to imitate human language, building agents that use language to communicate intentionally with humans in interactive environments remains a major challenge. We introduce Cicero, the first AI agent to achieve human-level performance in Diplomacy, a strategy game involving both cooperation and competition that emphasizes natural language negotiation and tactical coordination between seven players. Cicero integrates a language model with planning and reinforcement learning algorithms by inferring players' beliefs and intentions from its conversations and generating dialogue in pursuit of its plans. Across 40 games of an anonymous online Diplomacy league, Cicero achieved more than double the average score of the human players and ranked in the top 10% of participants who played more than one game.

Meta Fundamental AI Research Diplomacy Team (FAIR)†, Anton Bakhtin, Noam Brown, Emily Dinan, Gabriele Farina, Colin Flaherty, Daniel Fried, et al. 2022. “Human-Level Play in the Game of Diplomacy by Combining Language Models with Strategic Reasoning.” Science, November, eade9097. https://doi.org/10.1126/science.ade9097.

Comments

I think it's very impressive! It's worth noting that Cicero has won a small-scale press Diplomacy tournament in the past: https://www.thenadf.org/tournament/captain-meme-runs-first-blitzcon/ (playing under the name Franz Broseph), and there is also commentated footage of a human vs. all-Cicero bot game here:


That being said, it's worth noting that they built quite a complicated, specialized AI system (i.e. they did not take an LLM and finetune a generalist agent that can also play Diplomacy):

  • First, they train a dialogue-conditional action model by behavioral cloning on human data to predict what other players will do.
  • Then they do joint RL planning to get action intentions for the AI and the other players, using the outputs of the conditional action model and a learned dialogue-free value model. (They also regularize this plan using a KL penalty to the output of the action model.)
  • They also train a conditional dialogue model by finetuning a small LM (a 2.7B-parameter BART) to map intents + game history to messages.
  • They train a set of filters to remove hallucinations, inconsistencies, toxicity, etc from the output messages, before sending them to other players.
  • The intents are updated after every message. At the end of each turn, they output the final intent as the action.
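
The KL-penalty step in the list above can be illustrated with a toy example. This is only a minimal sketch, not the paper's planner: the action names, probabilities, values, and the `lam` parameter are all made up for illustration. It relies on the standard result that the policy maximizing expected value minus `lam` times the KL divergence to an anchor policy is proportional to `anchor(a) * exp(value(a) / lam)`.

```python
import math

def kl_regularized_policy(anchor, values, lam):
    """Policy maximizing E_pi[values] - lam * KL(pi || anchor).

    Closed form: pi(a) is proportional to anchor(a) * exp(values(a) / lam).
    """
    # Subtract the max value before exponentiating for numerical stability.
    max_v = max(values.values())
    weights = {a: anchor[a] * math.exp((values[a] - max_v) / lam) for a in anchor}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}

# Hypothetical numbers: what a human-imitation model predicts (anchor)
# and what a value model estimates each action is worth.
anchor = {"hold": 0.6, "support_ally": 0.3, "attack_ally": 0.1}
values = {"hold": 0.0, "support_ally": 0.2, "attack_ally": 1.0}

human_like = kl_regularized_policy(anchor, values, lam=10.0)  # strong penalty
greedy = kl_regularized_policy(anchor, values, lam=0.05)      # weak penalty
```

With a large `lam` the result stays close to the human-imitation anchor; with a small `lam` it concentrates on the highest-value action. That is the trade-off between "human play" and optimal play that the penalty controls.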

I do expect someone to figure out how to avoid all these dongles and do it with a more generalist model in the next year or two, though. 



I think people who are freaking out about Cicero more than about foundation model scaling/prompting progress are wrong; this is not much of an update on AI capabilities, nor an update on Meta's plans (they were publicly working on Diplomacy for over a year). I don't think they introduce any new techniques in this paper either?

It is an update upwards on the competency of this team at Meta, a slight update upwards on the capabilities of small LMs, and probably an update upwards on the amount of hype and interest in AI.

But yes, this is the sort of thing that you'd see more of in short timelines rather than long.

This is pretty astounding. It seems to me that it's a result that's consistent with all the other recent progress AI/ML systems have been making in playing competitive and cooperative strategy games, as well as in using language, but it's still a really impressive outcome. My sense is that this is the kind of result that you'd tend to see in a world with shorter rather than longer timelines. 

As for my personal feelings on the matter, I think they'd best be summed up by this image.

aog

My sense is that this is the kind of result that you'd tend to see in a world with shorter rather than longer timelines. 

Agreed. I'm working on this area and I thought it would be another few years before we saw human-level AI in Diplomacy with natural language communication. Diplomacy without natural language was already at human level, but we don't have many examples of game-playing agents successfully using natural language. This seems to show there's not much of a challenge there. 

Across 40 games of an anonymous online Diplomacy league, Cicero achieved more than double the average score of the human players and ranked in the top 10% of participants who played more than one game.

Wow, this sounds impressive on the face of it, though I also wonder how well the best No Press Diplomacy AI would do here (perhaps with a very mediocre chat capability added). Maybe the dialogue isn't actually doing that much work? I'd be interested in more information on this.

There's a commentated video by someone who plays as the only human in an otherwise all-Cicero game, which at least makes it seem like the dialogue is doing a lot.

A random thought I had while watching the video was related to how the commentator pointed out that the bots predictably seem to behave in their own self-interest, whereas human players in bad/losing positions will generally throw all their forces against whoever backstabbed them rather than try to salvage a hopeless position. My personal style of play is much more bot-like than what the commentator described as what human professionals typically do. If the game weren't anonymous, I could see the point of retaliating to build a deterrent for future games, but given that players' usernames are anonymous, the bots' approach of always trying to improve their position seems best to me.

gwern on /r/machinelearning:

There's no comparison to prior full-press Diplomacy agents, but if I'm reading the prior-work cites right, this is because basically none of them work - not only do they not beat humans, they apparently don't even always improve over themselves playing the game as if it was no-press Diplomacy (ie not using dialogue at all). That gives an idea how big a jump this is for full-press Diplomacy.

Helpful, thanks!

I watched the commentated video you and Lawrence shared, and it still wasn't clear to me from seeing the gameplay how much the press component was actually helping the Diplomacy agents. (E.g. I wasn't sure if the bots were cooperating/backstabbing or if they were just always set on playing the moves that they did regardless of what was being said in the press.) In a game with just one human and the rest bots, obviously the human wouldn't have an advantage if the bots all behaved like No Press bots. I think a mixed game with multiple humans and multiple bots would be more insightful.

Worth noting that this was "Blitz" Diplomacy with only five-minute negotiation rounds. Still very impressive though.

(I spent a couple of hours thinking about and discussing with a friend and want to try and write some tentative conclusions down.)

I think the mere fact that Meta was able to combine strategic thinking and dialog to let an AI achieve its goals in a collaborative and competitive environment with humans should give us some pause. On the technical level, I think the two biggest contributions are: having the RL-trained strategy model that explicitly models the other human agents and finds the balance between optimal play and “human play”; and establishing an interface between that module and the language model via encoding the intent.

I think the Cicero algorithm is bad news for AI safety. It demonstrates yet another worrying capability of AI, and Meta's open publishing norm is speeding up this type of work and making it harder to control.

Of course, the usual rebuttal to concerns about this sort of paper is still valid: This is just a game, which makes two important things easier:

  • There’s a clear state space, action space and reward function. So RL is easy to set up.
  • The dialog is pretty simple, consisting of short sentences in a very specific domain.

One of the most interesting questions is where things will go from here. I think adding deception to the model should allow it to play even better and I’d give it 50% that somebody will make that work within the next year. Beyond that, now that the scaffolding is established, I expect superhuman level agents to be developed in the next year as well. (I’m not quite sure how “superhuman” you can get in Diplomacy – maybe the current model is already there – but maybe the agent could win any tournament it enters.)

Beyond that, I’m curious when and where we will see some of the techniques established in the paper in real-world applications. I think the bottleneck here will be finding scenarios that can be modeled with RL and benefit from talking with humans. This seems like a difficult spot to identify. When talking with humans is required, this usually indicates a messy environment where actions and goals aren’t clearly defined. A positive example I can think of is a language coach. The goal could be to optimize test scores, the action would be picking from a set of exercises. This alone could already work well, but if you add in human psychology and the fact that e.g. motivation can be a key-driver in learning, then dialog becomes important as well.
