Dr. Leah C. Windsor is a Research Associate Professor in the Institute for Intelligent Systems at The University of Memphis.  Dr. Windsor can be found on Twitter at @leahcwindsor.  Dr. Susan Allen is an Associate Professor at the University of Mississippi.  Dr. Allen can be found on Twitter at @lady_professor.  Divergent Options’ content does not contain information of an official nature nor does the content represent the official position of any government, any organization, or any group.

Title:  Assessing Wargame Effectiveness: Using Natural Language Processing to Evaluate Wargaming Dynamics and Outcomes

Date Originally Written:  October 1, 2020.

Date Originally Published:  July 5, 2021.

Author and / or Article Point of View:  The article is written from a neutral point of view to evaluate the conversational dynamics during wargames that are predictive of wargaming outcomes.

Summary:  Group decision-making research, while well-established, is not applied in wargames with a win / lose focus.  The deliberative data within wargaming can yield predictive metrics for game outcomes. Computational text analysis illuminates participant effects, such as status, gender, and experience. Analyzing participants’ language can provide insight into the intra-group and inter-group dynamics that exclude or invite potential solutions. 

Text:  The outcome of wargames reveals who wins and loses – but how do participants and strategists know if this is the optimal outcome from the range of potential outcomes? To understand why groups make particular decisions that lead to success or failure in wargames, the authors focus on the intra-group and inter-group communication that transpires during the wargame itself. The processes of group dynamics influence the outcomes of wargaming exercises, yet little attention is paid to these deliberations. Implicit biases manifest in language and other multimodal signals that influence participants and shape the process of negotiations [1][2]. 

A novel approach to analyzing wargames would examine the process that informs the outcome, and would model communicative interchanges computationally by examining linguistic features of participants’ deliberations. Participants’ exchanges and deliberations influence the dynamics within and across wargaming exercises and rounds of play. At present, the authors are aware of no computational models that assess the intra-group and inter-group deliberations in wargames. A wealth of research using computational text-as-data approaches has established that language has predictive power in analyzing attributes like hierarchy, deception, and closeness [3][4][5].

Examining group dynamics is essential for understanding military and foreign policy decision-making because such choices are rarely made by individuals, particularly in democracies, but also within the winning coalition in autocracies. Despite the fact that deliberative group dynamics are affected by emotions, pride, status, reputation, and communication failures, these dynamics are seldom studied[6]. Natural language processing (NLP) approaches can help reveal why teams arrive at various outcomes, how power structures evolve and change within groups during deliberations, what patterns of group deliberation emerge across iterations, and how biases, whether implicit or through participant selection, affect the process of deliberations and outcomes. Because the dialogue patterns of participants have not been evaluated using the multimodal methods proposed, the authors anticipate that NLP will provide agenda-setting contributions to both the scientific and DoD communities.

To illustrate this point, the authors analyzed some of the communications from a wargaming exercise, Counter-Da’esh influence operations: Cognitive space narrative simulation insights[7]. Using computational linguistics techniques, the authors analyzed the use of language related to positive emotion over time, by rounds, across teams in this wargaming simulation. NLP can explore several aspects of between-group and within-group communications, as shown in Figure 1. First, NLP can compare the patterns of language between teams that lead to different outcomes, such as which team wins or loses. 

Second, NLP can model the language relationship between teams to understand which team is leading, and which team is following. Lexical entrainment, semantic similarity, and linguistic style-matching all refer to the process of speakers aligning their language as they collaborate and interact more [8][9][10]. This is visible especially in Rounds 2 and 3 where the Red and Blue teams show similar patterns of positive emotion language use, although with different magnitudes. 

Third, this analysis can be approached with more granularity to examine the individual participants within groups, over time, and between rounds, to determine who are the thought leaders, influencers, and idea entrepreneurs with the greatest power of persuasion. Sentiment analysis has been used to explain how leaders use emotionally evocative language to persuade followers, where positive emotion leads to improved public opinion ratings[11].

Figure 1. Positive emotion by round, over time, and across teams for ICONS wargaming exercise
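
The measurements described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the word lists, the function names, and the message format are all hypothetical, and a real analysis would rely on a validated dictionary and full transcripts. The sketch computes the share of positive-emotion words per team and round, and a simplified, single-category style-matching score that compares how heavily two texts rely on function words.

```python
import re
from collections import defaultdict

# Hypothetical mini-lexicons for illustration only; a real study would use
# a validated dictionary rather than these hand-picked words.
POSITIVE_WORDS = {"agree", "good", "great", "support", "trust", "hope", "success"}
FUNCTION_WORDS = {"the", "a", "an", "we", "they", "it", "of", "to", "in", "and", "but", "not"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def rate(tokens, vocabulary):
    """Share of tokens that belong to the vocabulary (0.0 for empty input)."""
    if not tokens:
        return 0.0
    return sum(t in vocabulary for t in tokens) / len(tokens)


def positive_emotion_by_team_round(messages):
    """Pool messages by (team, round) and return the positive-word rate of each pool.

    `messages` is an iterable of (team, round_number, text) tuples.
    """
    pooled = defaultdict(list)
    for team, round_number, text in messages:
        pooled[(team, round_number)].extend(tokenize(text))
    return {key: rate(tokens, POSITIVE_WORDS) for key, tokens in pooled.items()}


def style_matching(text_a, text_b):
    """Simplified style-matching score in [0, 1].

    Compares the overall function-word rate of two texts; 1.0 means the
    rates are identical. The small constant avoids division by zero.
    """
    rate_a = rate(tokenize(text_a), FUNCTION_WORDS)
    rate_b = rate(tokenize(text_b), FUNCTION_WORDS)
    return 1 - abs(rate_a - rate_b) / (rate_a + rate_b + 0.0001)
```

Calling `positive_emotion_by_team_round` on a list of transcript lines yields one rate per team and round, which is the kind of series a chart like Figure 1 would plot. Calling `style_matching` on the pooled text of two teams in the same round gives a rough indication of how closely their language is aligned.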

One of the critiques of wargaming has been that it is not always cross-culturally representative, which may introduce unintended cultural biases that lead to sub-optimal outcomes. Linguistic analysis of wargaming transcripts using cutting-edge natural language processing approaches like Bidirectional Encoder Representations from Transformers (BERT)[12] can help reveal how word meanings vary across issue area, culture, and context, and in doing so, provide objective metrics of language and cultural bias. Computational linguistics approaches can help reveal what people mean when they refer to particular concepts, and how this meaning is interpreted differently by other audiences. Figure 2 illustrates this point well: Windsor[13] plots the use of two semantically related terms, conflict and war, over time between 1900 and 2000 in six different languages. While the use of these terms generally follows similar patterns, it varies in three different ways: over time; by language; and by term.

In practice, war and conflict can be used interchangeably, but they also demonstrate remarkable differences over time and between languages. This means that when speakers use these terms, listeners may broadly share related interpretations of the words’ meanings, but room for misinterpretation clearly exists. The Sapir-Whorf Hypothesis suggests that language makes different interpretations of the world available based on the structure of language and lexicon available to speakers[14][15]. Using the BERT process on wargaming transcripts can help reveal instances where participants in the wargaming exercise misunderstand each other, and which concepts provide the most ambiguity and need the most clarification. In the field, understanding the opponent is part and parcel of the “winning hearts and minds” strategy. Gaps in cultural and linguistic understanding can create potentially dangerous, and unnecessary, chasms between people in conflict zones[16]. Computational linguistics approaches can help to identify these gaps so that military personnel, strategists, policymakers – and scholars – can better understand the optimal conditions for negotiating mutually beneficial outcomes. 

Figure 2. Trends in Google NGram for “War” and “Conflict” by Language (1900-2018), taken from Windsor (2021)
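
A contextual model such as BERT requires a pretrained network, so it is not reproduced here. As a lightweight stand-in for the same idea, the sketch below uses plain co-occurrence counts: it collects the words that appear near a target term in each group's utterances and compares the two resulting context profiles with cosine similarity. The function names, the window size, and the input format are assumptions made for illustration, and count-based vectors are a much cruder instrument than the contextual embeddings the article proposes.

```python
import math
import re
from collections import Counter


def context_vector(sentences, target, window=2):
    """Count the words that occur within `window` tokens of `target`."""
    counts = Counter()
    for sentence in sentences:
        tokens = re.findall(r"[a-z']+", sentence.lower())
        for i, token in enumerate(tokens):
            if token == target:
                lo, hi = max(0, i - window), i + window + 1
                # Add every neighbor in the window, skipping the target itself.
                counts.update(
                    t for j, t in enumerate(tokens[lo:hi], start=lo) if j != i
                )
    return counts


def cosine(a, b):
    """Cosine similarity of two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Under these assumptions, a low similarity between one team's context profile for a term and another team's profile for the same term would flag that term as a candidate for clarification. It would not by itself show that the participants misunderstood each other.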

Theories of group decision-making are becoming more sophisticated as scholars of international relations and foreign policy re-embrace and return to the foundations of behavioral psychology. While Janis[17] hypothesized about groupthink a generation ago, more recently scholars focused on political psychology have highlighted the importance of experience, poly-think, and framing effects for groups[18]. While this research has advanced ideas about the nature of group decision-making, in practice the group dynamics that shape foreign policy decision-making are more opaque. Wargaming exercises provide a unique opportunity for exploring such theories. This approach builds on the extant literature on wargaming[19][20][21], and offers a path forward for advancing the study of wargaming using theoretically-grounded computational social science methods.


[1] Greenwald AG, Krieger LH. Implicit Bias: Scientific Foundations. Calif Law Rev. 2006;94: 945–967. doi:10.2307/20439056

[2] Jones HM, Box-Steffensmeier J. Implicit Bias and Why It Matters to the Field of Political Methodology. In: The Political Methodologist [Internet]. 31 Mar 2014 [cited 6 Jun 2018]. Available: https://thepoliticalmethodologist.com/2014/03/31/implicit-bias-and-why-it-matters-to-the-field-of-political-methodology

[3] Hancock JT, Curry LE, Goorha S, Woodworth M. On lying and being lied to: A linguistic analysis of deception in computer-mediated communication. Discourse Process. 2007;45: 1–23.

[4] Gonzales AL, Hancock JT, Pennebaker JW. Language style matching as a predictor of social dynamics in small groups. Commun Res. 2010;37: 3–19.

[5] Pennebaker JW, Chung CK, others. Computerized text analysis of Al-Qaeda transcripts. Content Anal Read. 2008; 453–465.

[6] Lin-Greenberg E, Pauly R, Schneider J. Wargaming for Political Science Research. Available SSRN. 2020.

[7] Linera R, Seese G, Canna S. Counter-Da’esh Influence Operations. May 2016 [cited 10 Jan 2021]. Available: https://nsiteam.com/counter-daesh-influence-operations

[8] Rogan RG. Linguistic style matching in crisis negotiations: a comparative analysis of suicidal and surrender outcomes. J Police Crisis Negot. 2011;11: 20–39.

[9] Taylor PJ, Thomas S. Linguistic Style Matching and Negotiation Outcome. Negot Confl Manag Res. 2008;1: 263–281. doi:10.1111/j.1750-4716.2008.00016.x

[10] Taylor PJ, Dando CJ, Ormerod TC, Ball LJ, Jenkins MC, Sandham A, et al. Detecting insider threats through language change. Law Hum Behav. 2013;37: 267.

[11] Love G, Windsor L. Populism and Popular Support: Vertical Accountability, Exogenous Events, and Leader Discourse in Venezuela. Polit Res Q. 2017.

[12] Devlin J, Chang M-W, Lee K, Toutanova K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. ArXiv181004805 Cs. 2019 [cited 20 Sep 2020]. Available: http://arxiv.org/abs/1810.04805

[13] Windsor L. Linguistic and Political Relativity: AI Bias and the Language of International Relations. AI Ethics. Routledge; 2021.

[14] Whorf BL. Science and linguistics. Bobbs-Merrill Indianapolis, IN; 1940.

[15] Kay P, Kempton W. What is the Sapir-Whorf hypothesis? Am Anthropol. 1984;86: 65–79.

[16] Morrison T, Conaway WA. Kiss, Bow, Or Shake Hands: The Bestselling Guide to Doing Business in More Than 60 Countries. Adams Media; 2006.

[17] Janis IL. Victims of groupthink: A psychological study of foreign-policy decisions and fiascoes. 1972.

[18] Hermann MG. Foreign policy role orientations and the quality of foreign policy decisions. Role Theory Foreign Policy Anal. 1987; 123–140.

[19] Asal V, Blake EL. Creating simulations for political science education. J Polit Sci Educ. 2006;2: 1–18.

[20] Brynen R. Virtual paradox: how digital war has reinvigorated analogue wargaming. Digit War. 2020; 1–6.

[21] Reddie AW, Goldblum BL, Lakkaraju K, Reinhardt J, Nacht M, Epifanovskaya L. Next-generation wargames. Science. 2018;362: 1362–1364.