Dr. ASLI VAROL
Ways to benefit effectively from Artificial Intelligence (AI) in diplomacy should be sought. Diplomatic actors such as states, governments, private companies and non-governmental organizations need to do more research and development (R&D) to this end. In particular, ways should be sought for AI to work together with human diplomats.
Hybrid models can be adopted by combining human power and AI power through gamification in diplomacy. AI should be used in forecasting, research and reporting. As Meta’s Cicero illustrates, AI’s capability to cooperate with humans must also be developed. Human diplomats, for their part, should be active in building interpersonal trust and cooperation and in using language skills.
No-Press Diplomacy and “Diplomacy” Game
Gamification in
diplomacy began in the 1950s. This gamification is conceptualized as “No-Press
Diplomacy”.
“Diplomacy”
was created by Allan Calhamer in 1954. “Diplomacy”, commercially released in
the United States in 1959, is a strategic board game (Calhamer, 1974). The game begins in 1901. In “Diplomacy”,
players can choose Austria, England, France, Germany,
Italy, Russia, or Turkey (Paquette
et al., 2019).
No-Press Diplomacy is a complex strategy game involving cooperation and competition (Gray et al., 2021: 9). Today, it serves as a benchmark for multi-agent AI research (Bakhtin et al., 2022).
The aim of the players is to take control of most
of the map on the game board. To succeed in this challenge, players must
cooperate, negotiate, trust and support each other, as well as compete for as
many territories as possible.
In “Diplomacy”, players form alliances and support each other through private, one-to-one conversations. However, no agreement is binding, so players may misrepresent their plans and double-deal. After negotiations, players write down their moves, which are then executed simultaneously. Players therefore have to trust others to do what they say, because the only way to win in the “Diplomacy” game is to build trust, negotiate and cooperate with other players (Meta AI, n.d.).
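The combination of non-binding promises and simultaneous execution can be illustrated with a small sketch. It is a deliberately simplified toy, not the official adjudication rules: it ignores the map, holds, convoys and cut supports, and the function name `resolve` and the order format are hypothetical. Each unit has strength 1 plus one for every support it receives, and the strongest mover takes a contested province; equal strength means nobody gets in.

```python
from collections import defaultdict

def resolve(orders):
    """Resolve all orders at once (toy rules, map adjacency ignored).

    orders maps a unit to ("move", province) or ("support", other_unit).
    Returns {province: winning unit, or None if the movers bounce}.
    """
    # Every unit starts at strength 1 and gains 1 per support received.
    strength = defaultdict(lambda: 1)
    for unit, (kind, arg) in orders.items():
        if kind == "support":
            strength[arg] += 1

    # Group the moving units by the province they are trying to enter.
    contenders = defaultdict(list)
    for unit, (kind, arg) in orders.items():
        if kind == "move":
            contenders[arg].append(unit)

    result = {}
    for province, units in contenders.items():
        best = max(strength[u] for u in units)
        winners = [u for u in units if strength[u] == best]
        # A tie for the highest strength is a standoff: nobody enters.
        result[province] = winners[0] if len(winners) == 1 else None
    return result

# What Italy promised France during negotiation ...
promised = {
    "France": ("move", "Burgundy"),
    "Germany": ("move", "Burgundy"),
    "Italy": ("support", "France"),
}
# ... and what Italy actually wrote down.
actual = dict(promised, Italy=("support", "Germany"))

print(resolve(promised))  # France would have taken Burgundy
print(resolve(actual))    # Germany takes it instead
```

Because orders are revealed only when they are executed, France cannot react to Italy’s change of mind within the same turn, which is why trust carries so much weight in the game.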
AI as a “Diplomacy” Game Player
Today, the “Diplomacy” game can be played online as “webDiplomacy” via https://webdiplomacy.net/. Various researchers and experts have studied how successful and effective AI is in this game.
Bakhtin et al. discuss a planning algorithm they call DiL-piKL. They used a reinforcement learning variant, RL-DiL-piKL, to train an agent they named “Diplodocus”, and found that “Diplodocus” was successful in “Diplomacy”. They state that combining human imitation, planning, and RL (Reinforcement Learning) offers a promising way to create agents for complex cooperative and mixed-motive environments (Bakhtin et al., 2022). “Diplomacy” is a game that only one player can win, yet cooperation with other players is almost essential to achieve victory (Gray et al., 2021: 2).
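One way to picture how human imitation and planning can be combined is a KL-regularized choice rule: the agent prefers actions with a high estimated value but is penalized for drifting away from a human-like “anchor” policy. The sketch below is only an illustration of that general idea, assuming the standard closed form in which the policy is proportional to anchor(a) · exp(value(a) / λ). It is not the authors’ implementation, and the function name, action labels and numbers are all hypothetical.

```python
import math

def kl_regularized_policy(values, anchor, lam):
    """Return pi(a) proportional to anchor[a] * exp(values[a] / lam).

    values: estimated value of each action (hypothetical planner output)
    anchor: human-like probability of each action (hypothetical imitation model)
    lam:    regularization strength; larger means closer to the anchor
    """
    # Subtract the largest value before exponentiating for numerical stability.
    top = max(values.values())
    weights = {a: anchor[a] * math.exp((values[a] - top) / lam) for a in values}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}

# Made-up numbers for illustration only.
human_like = {"hold": 0.6, "support_ally": 0.3, "attack_ally": 0.1}
estimated_value = {"hold": 0.0, "support_ally": 0.2, "attack_ally": 1.0}

# Strong regularization: the policy stays close to human-like play.
cautious = kl_regularized_policy(estimated_value, human_like, lam=10.0)
# Weak regularization: the policy chases the highest estimated value.
greedy = kl_regularized_policy(estimated_value, human_like, lam=0.05)

print(cautious)
print(greedy)
```

With a large λ the resulting policy remains predictable to human partners; with a small λ it becomes almost purely value-maximizing, which in this toy example means betraying the ally.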
Paquette et al. focused on training an agent that learns to play the No-Press version of “Diplomacy”. They presented “DipNet”, a neural network-based policy model for No-Press Diplomacy. In “Diplomacy”, players face sequential social dilemmas (SSDs) at every stage of the game, and “Diplomacy” is one of the first SSD games with a rich environment. A single player can own up to 34 units, with each unit having an average of 26 possible actions. This astronomical action space makes planning and search difficult. Thinking across multiple time scales is also an important aspect of “Diplomacy”. Agents need to be able to formulate a high-level long-term strategy (for example, whom to ally with) and a very short-term execution plan for that strategy (for example, which units to move in the next round). Agents should also be able to adapt their plans and their beliefs about others (e.g. trustworthiness) as the game unfolds (Paquette et al., 2019).
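A rough back-of-the-envelope calculation shows why this action space is called astronomical. The sketch below assumes, purely for illustration, that a player holds the maximum of 34 units and that every unit independently has exactly the average of 26 actions; real positions vary, so the result is an order-of-magnitude estimate rather than an exact count.

```python
import math

MAX_UNITS = 34             # upper bound on units per player, from the text
AVG_ACTIONS_PER_UNIT = 26  # average actions per unit, from the text

# If each unit chooses independently, the choices multiply.
joint_actions = AVG_ACTIONS_PER_UNIT ** MAX_UNITS
magnitude = math.log10(joint_actions)

print(f"roughly 10^{magnitude:.0f} combined orders for one player in one turn")
```

Under these assumptions a single player faces on the order of 10^48 combinations in a single turn, far too many to enumerate, which is why the agents discussed here rely on learned policies and sampling rather than exhaustive search.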
Anthony et al. proposed a simple but effective approximate best response operator designed to handle large combinatorial action spaces and simultaneous moves. They also introduced a family of policy iteration methods that approximate fictitious play, and used these methods to apply RL to “Diplomacy” (Anthony et al., 2020). Bakhtin et al. also trained “DORA”, an agent built completely from scratch, for a popular two-player variant of “Diplomacy” (Bakhtin et al., 2021).
For the first time, Meta’s Fundamental AI Research Diplomacy Team trained an AI to achieve “human-level performance” in the war strategy board game “Diplomacy”. The new AI agent is named “Cicero”, after the classical statesman and scholar who witnessed the fall of the Roman Republic. Cicero can communicate and strategize effectively with human players, plan its best course to victory, and in some cases even pass as a human. The researchers also state that Cicero, which performs its tasks by combining dialogue and strategic reasoning models, is a “benchmark” for multi-agent learning.
The researchers conducted their experiments on 40 anonymized online webDiplomacy.net games, played for a total of 72 hours, between August 19 and October 13, 2022.
Cicero “passed as a human player” in 40 “Diplomacy” games with 82 unique
players. Cicero even managed to successfully change a human player’s mind by
proposing a mutually beneficial move (DeGeurin, 2022).
Conclusion
As a new generation of No-Press Diplomacy, efforts should be promoted to ensure cooperation between AI diplomats and human diplomats in both physical and digital environments. This hybrid model of cooperation will benefit the parties in diplomatic relations in terms of time, effort and cost, because work will be shared between AI and human diplomats, and with it responsibilities and obligations. Therefore, AI agents that can work in diplomacy should be developed. These AI agents should be developed as experts in various fields of diplomacy and should be trained in the communication and negotiation skills that form the basis of diplomacy.
References
Anthony, Thomas, Tom Eccles,
Andrea Tacchetti, János Kramár, Ian Gemp, Thomas C. Hudson, Nicolas Porcel,
Marc Lanctot, Julien Pérolat, Richard Everett, Roman Werpachowski, Satinder
Singh, Thore Graepel, Yoram Bachrach (2020): “Learning to Play No-Press
Diplomacy with Best Response Policy Iteration”, 34th Conference on Neural
Information Processing Systems (NeurIPS 2020), Vancouver, Canada.
Bakhtin, Anton, David Wu, Adam Lerer, Noam Brown (2021): “No-Press Diplomacy from Scratch”, 35th
Conference on Neural Information Processing Systems (NeurIPS 2021).
Bakhtin, Anton, David J Wu, Adam Lerer, Jonathan
Gray, Athul Paul Jacob, Gabriele Farina, Alexander H Miller, Noam Brown (2022):
“Mastering the Game of No-Press Diplomacy via Human-Regularized Reinforcement Learning
and Planning”, arXiv preprint arXiv:2210.05492.
Calhamer, Allan (1974): “The Invention of
Diplomacy”, Reprinted from Games & Puzzles, No. 21 (January 1974),
https://web.archive.org/web/20090910012615/http://www.diplom.org/~diparch/resources/calhamer/invention.htm, Accessed: 19. 07. 2023.
DeGeurin, Mack (2022): “Meta’s ‘Cicero’ AI Trounced Humans at Diplomacy without Revealing Its True Identity”, Gizmodo, November 22, 2022, https://gizmodo.com/meta-ai-cicero-diplomacy-gaming-1849811840, Accessed: 18. 07. 2023.
Gray, Jonathan, Adam Lerer, Anton Bakhtin, Noam
Brown (2021): “Human-Level Performance in No-Press Diplomacy via Equilibrium
Search”, Published as a conference paper at ICLR 2021, arXiv: 2010.02923,
Accessed: 18. 07. 2023.
Meta AI (n.d.): “About the Game”, https://ai.meta.com/research/cicero/diplomacy/, Accessed: 19. 07. 2023.
Paquette, Philip, Yuchen Lu,
Steven Bocco, Max O. Smith, Satya Ortiz-Gagné, Jonathan K. Kummerfeld, Satinder Singh, Joelle
Pineau, Aaron Courville (2019): “No
Press Diplomacy: Modeling Multi-Agent Gameplay”, Advances in Neural Information
Processing Systems 32 (NeurIPS 2019).
Wikipedia (n.d.): “Diplomacy (game)”, https://en.wikipedia.org/wiki/Diplomacy_(game), Accessed: 19. 07. 2023.