Leeds et al. (2000) report that military alliance commitments are honored in war around 75% of the time. We update and extend data on alliance reliability from 1816 to 2003. Our analysis reveals a lower compliance rate overall: 50%. We find a sharp disparity in alliance reliability before and after World War II. States honored their alliance commitments 66% of the time prior to 1945, but the compliance rate drops to 22% from 1945 to 2003. Moreover, the rates of fulfillment for defense pacts (41%) and nonaggression pacts (37%) are dramatically lower than those for offensive alliances (74%) and neutrality agreements (78%). These findings carry implications for the role of military alliances in world politics and highlight the need for more research to explain the differences that emerge before and after World War II.

An influential study by Leeds, Long, and Mitchell (2000, hereafter “LLM”) reports that countries honor their military alliance commitments 74.5% of the time. The view that states uphold their alliance promises in war the vast majority of the time has now become conventional wisdom in scholarship (e.g., Guzman, 2008: 11; Leeds, 2003a: 803; Simmons, 2010: 280–81). LLM focus on the period from 1816 to 1944, raising the possibility that a different pattern might exist in the post-World War II era. We follow their procedures to produce a dataset on alliance treaty reliability over a longer period – from 1816 to 2003 – drawing on the Alliance Treaty Obligations and Provisions (ATOP) dataset (Leeds et al., 2002) and updated information on participation in interstate wars (Reiter et al., 2014, Version 1.1).

Our analysis reaffirms LLM’s finding that alliance treaties were fulfilled the majority of the time from 1816 to 1944. After World War II, however, countries honored their alliance promises in just 22% of cases in which treaty commitments were invoked in war. Overall, we find that alliances were upheld 50% of the time in interstate wars from 1816 to 2003. Our updated analysis therefore shows that alliance reliability rates are lower than previously believed, and that there is a sharp disparity in promise fulfillment in the pre- and post-World War II periods. These findings carry implications for the rich literature on the sources and effects of alliance treaty fulfillment in international relations (e.g., Crescenzi et al., 2012; Gartzke and Gleditsch, 2004; Gibler, 2008; Leeds et al., 2009; Leeds and Savun, 2007; LeVeck and Narang, 2017).
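As a rough consistency check, the overall figure can be recovered from the period-specific rates. The sketch below assumes the case counts given in note 1 (92 opportunities from 1816 to 1944 and 54 afterwards) and the rounded rates reported in the abstract (66% and 22%). Because those rates are rounded, the pooled value is only approximate, and the variable names are illustrative rather than taken from the replication files.

```python
# Case counts from note 1; period rates are the rounded values in the abstract.
pre_cases, post_cases = 92, 54
pre_rate, post_rate = 0.66, 0.22

total_cases = pre_cases + post_cases

# The pooled rate is the case-weighted average of the two period rates.
pooled_rate = (pre_rate * pre_cases + post_rate * post_cases) / total_cases

print(f"{total_cases} cases, pooled compliance rate of about {pooled_rate:.1%}")
```

The weighting matters here: the pre-1945 period contributes nearly two-thirds of the cases, so the pooled rate sits closer to 66% than a simple average of the two period rates (44%) would suggest.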

Alliance reliability in war: Data and procedures

Scholars have discussed the honoring of alliance promises at least since the time of Thucydides (431 BCE). Systematic research on this topic began to take off after scholars compiled comprehensive data on military alliances and war through the Correlates of War project. Early studies were not sanguine about the reliability of alliances in war: Sabrosky (1980) finds that alliances are honored less than 30% of the time, while Siverson and King (1980: 2) conclude that just 23.1% of the opportunities to fulfill an alliance commitment resulted in countries joining their allies in war. One limitation of these studies, however, is that they do not consider the specific requirements of each alliance (i.e., the casus foederis). This research therefore misclassifies alliance fulfillment: some treaties are considered to be violated (or upheld) even though countries did not renege on (or honor) any particular promise. Leeds et al. (2002) address this issue by developing the ATOP dataset, which identifies specific alliance provisions to more accurately assess when states are (and are not) obligated to act.

LLM demonstrate that accounting for the specific requirements of military alliances leads to a much more optimistic view of alliance reliability in war. While their study generates a more accurate assessment of wartime alliance reliability in the pre-1945 period, it leaves a key question unanswered: what is the rate of treaty compliance after World War II, and how similar is this rate to the pre-war period? To address this issue, we create a dataset of alliance treaty fulfillment for the period from 1816 to 2003, following the procedures described in LLM. In addition to examining a longer time period, we depart from LLM in one key respect: we use updated war data compiled by Reiter et al. (2014). Otherwise, we mimic LLM’s research design.

The unit of observation in our dataset is the “alliance performance opportunity.” Because some wars trigger multiple alliance commitments, the same conflict may appear in the dataset more than once. The dataset accounts for four types of alliances: defense pacts, offensive alliances, neutrality agreements, and nonaggression pacts. We started by identifying all active alliance commitments made by participants in 70 interstate wars during our period of study. Next, we determined whether each treaty’s casus foederis was met in a given war. The analysis ultimately includes only those cases where the particular requirements of the treaty were met, excluding cases that are not relevant for assessing alliance reliability. We identified a total of 576 “alliance performance opportunities” from 1816 to 2003. The casus foederis was invoked in 146 of these cases, compared to 110 in LLM (see note 1).

For all “alliance performance opportunities” in which the casus foederis was invoked, we determined whether the commitment was upheld. Like LLM (693–94), we classify a commitment as upheld if: “(1) an ally fought on the same side with its alliance partner as promised (in a defense and/or offense pact), (2) an ally remained neutral in a conflict as promised, or (3) an ally fought alongside its alliance partner despite promising only neutrality.” A treaty promise is violated if “(1) an ally fought against its partner (in the context of a defense, offense, nonaggression, or neutrality pact), or (2) an ally did not come to a partner’s aid even though it had promised such assistance in a defense and/or offense pact” (LLM: 694).

Some alliances are multilateral. For these cases, researchers must decide how to code alliance reliability when some allies fulfill a commitment but others do not. Sabrosky (1980) classifies an alliance as fulfilled if any ally honors a promise, while LLM code a promise as upheld only if all partners honor their treaty-backed pledges. We adopt the same standard as LLM (see note 2).
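The coding rules quoted above, together with the two multilateral standards, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the coding procedure used to build the dataset: the names `Promise`, `Action`, `ally_upheld`, and `alliance_upheld` are hypothetical, each ally is assumed to hold a single type of commitment (real treaties can combine several), and a nonaggression pact is assumed to be upheld whenever the ally did not fight against its partner, since the quoted rules define only its violation.

```python
from enum import Enum


class Promise(Enum):
    DEFENSE = "defense"
    OFFENSE = "offense"
    NEUTRALITY = "neutrality"
    NONAGGRESSION = "nonaggression"


class Action(Enum):
    FOUGHT_WITH = "fought alongside partner"
    STAYED_OUT = "stayed out of the war"
    FOUGHT_AGAINST = "fought against partner"


def ally_upheld(promise: Promise, action: Action) -> bool:
    """Apply the quoted upheld/violated rules to a single ally."""
    if action is Action.FOUGHT_AGAINST:
        return False  # violation (1): applies to every pact type
    if promise in (Promise.DEFENSE, Promise.OFFENSE):
        # upheld (1) if the ally fought as promised; violation (2) otherwise
        return action is Action.FOUGHT_WITH
    if promise is Promise.NEUTRALITY:
        # upheld (2) by staying neutral, or (3) by fighting alongside the partner
        return True
    # ASSUMPTION: nonaggression is upheld if the ally did not fight its partner.
    return True


def alliance_upheld(member_results: list[bool], standard: str = "LLM") -> bool:
    """Aggregate member-level results for a multilateral alliance."""
    if standard == "LLM":
        return all(member_results)  # every partner must honor its pledge
    if standard == "Sabrosky":
        return any(member_results)  # one honoring ally is enough
    raise ValueError(f"unknown standard: {standard}")


# Hypothetical three-member defense pact: two allies fight, one stays out.
actions = [Action.FOUGHT_WITH, Action.FOUGHT_WITH, Action.STAYED_OUT]
results = [ally_upheld(Promise.DEFENSE, a) for a in actions]
print(alliance_upheld(results, "LLM"))       # False
print(alliance_upheld(results, "Sabrosky"))  # True
```

In this hypothetical case the same behavior counts as a violation under the LLM standard and as fulfillment under the Sabrosky standard, which is why the more lenient standard can only raise the measured compliance rate.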

Conclusion

This study presented an updated assessment of the reliability of military alliances. Our analysis, which extended earlier work to include the post-World War II era and utilized updated war data, yields three main conclusions. First, the overall rate of alliance fulfillment in war is lower than previously reported. Second, defense pacts and nonaggression pacts are honored much less frequently than neutrality agreements and offensive alliances. Third, there is a large disparity in the rate of alliance fulfillment before and after World War II. This offers a lesson for researchers that extends beyond the realm of alliance politics: trends that apply in one period may not extend to other eras. It is important to consider whether relationships of interest vary over time – particularly when there are structural shocks, like World War II.

What implications does the lower compliance rate carry for our understanding of alliance treaty reliability? Our analysis does not imply that military alliances are ineffective, nor does it challenge the evidence that defense pacts promote peace through extended deterrence (Leeds, 2003b; Johnson and Leeds, 2011; Fuhrmann and Sechser, 2014). The most effective threat is one that never has to be implemented (Schelling, 1966). NATO does not appear in our dataset, for example, precisely because potential adversaries perceive it as effective. However, when alliance commitments are invoked in war, allies uphold their promises less often than the conventional wisdom suggests. This implies that leaders are less restrained by treaty commitments than prior research would expect, a conclusion that carries implications for our understanding of international institutions (for a review of relevant literature, see Simmons, 2010).

Our analysis opens up avenues for future research. Scholars could use our updated dataset to revisit enduring debates about alliance politics, such as whether democracies make more (or less) reliable allies (cf. Gartzke and Gleditsch, 2004; Leeds, 2003a). The disparity in compliance rates over time also represents a puzzle worthy of examination in scholarship. Something may have changed after 1944 that fundamentally altered the nature of alliance politics. We have speculated about the sources of this variation, but further analysis is necessary in order to achieve more definitive answers. The disparity in compliance across commitment types is worthy of further investigation in scholarship as well. Dedicated studies on why offense pacts are so much more reliable than defense pacts would be especially welcome. More generally, future research might consider the implications of our findings for extended deterrence, war-fighting, and the efficacy of international institutions.

Acknowledgements

The authors’ names are listed alphabetically. The authors are indebted to Ashley Leeds for detailed and constructive feedback on this project. The authors also thank Jeffrey Arnold, Rosella Cappella, Scott Cook, Bryan Early, Erik Gartzke, Kristian Gleditsch, Florian Hollenbach, Jeffrey Kaplow, Michael Koch, Quan Li, Sara Mitchell, Mitchell Radtke, Dan Reiter, Ahmer Tarar, participants in a research workshop at the University at Albany – SUNY, and audience members in conference presentations at the annual meetings of the American Peace Science Association, the Peace Science Society (International), and the International Studies Association for helpful comments on earlier versions of this paper.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

Supplementary material

The supplementary files are available at http://journals.sagepub.com/doi/suppl/10.1177/2053168018779697 The replication files are available at: http://thedata.harvard.edu/dvn/dv/researchandpolitics.

Notes

1. Of our 146 opportunities for compliance, 92 are from 1816 to 1944. Our dataset includes 54 relevant cases in the post-World War II era.

2. The compliance rates reported in our paper are slightly higher when we use the more lenient standard from Sabrosky (1980).

3. An alternate coding scheme that deviates from some of LLM’s coding decisions, which is described in the online appendix, produces a similar compliance rate.

4. These categories are not mutually exclusive, as the same treaty can include more than one kind of commitment. The Warsaw Pact, for instance, is a defense pact, a nonaggression pact, and a consultation pact (the latter type of alliance is not included in our study). See Leeds et al. (2002).

5. Leeds (2003a) argues that changing circumstances over time is a key reason for treaty violations.

6. Small differences arise mostly due to our use of updated war data. The online appendix lists all of the cases in which there are coding differences between our study and LLM.

7. However, Leeds (2003a) finds that major powers are more likely to violate alliance treaties.

Carnegie Corporation of New York Grant

This publication was made possible (in part) by a grant from the Carnegie Corporation of New York. The statements made and views expressed are solely the responsibility of the author.