Significance Complex adaptive systems exhibit characteristic dynamics near tipping points such as critical slowing down (declining resilience to perturbations). We studied Twitter and Google search data about measles from California and the United States before and after the 2014–2015 Disneyland, California measles outbreak. We find critical slowing down starting a few years before the outbreak. However, population response to the outbreak causes resilience to increase afterward. A mathematical model of measles transmission and population vaccine sentiment predicts the same patterns. Crucially, critical slowing down begins long before a system actually reaches a tipping point. Thus, it may be possible to develop analytical tools to detect populations at heightened risk of a future episode of widespread vaccine refusal.

Abstract Vaccine refusal can lead to renewed outbreaks of previously eliminated diseases and even delay global eradication. Vaccinating decisions exemplify a complex, coupled system where vaccinating behavior and disease dynamics influence one another. Such systems often exhibit critical phenomena—special dynamics close to a tipping point leading to a new dynamical regime. For instance, critical slowing down (declining rate of recovery from small perturbations) may emerge as a tipping point is approached. Here, we collected and geocoded tweets about measles–mumps–rubella vaccine and classified their sentiment using machine-learning algorithms. We also extracted data on measles-related Google searches. We find critical slowing down in the data at the level of California and the United States in the years before and after the 2014–2015 Disneyland, California measles outbreak. Critical slowing down starts growing appreciably several years before the Disneyland outbreak as vaccine uptake declines and the population approaches the tipping point. However, due to the adaptive nature of coupled behavior–disease systems, the population responds to the outbreak by moving away from the tipping point, causing “critical speeding up” whereby resilience to perturbations increases. A mathematical model of measles transmission and vaccine sentiment predicts the same qualitative patterns in the neighborhood of a tipping point to greatly reduced vaccine uptake and large epidemics. These results support the hypothesis that population vaccinating behavior near the disease elimination threshold is a critical phenomenon. Developing new analytical tools to detect these patterns in digital social data might help us identify populations at heightened risk of widespread vaccine refusal.

In recent decades, vaccine refusal has contributed to the resurgence of measles and pertussis and significantly delayed the global eradication of polio (1, 2). For instance, the 2014–2015 measles outbreak in Disneyland, California was preceded by declining kindergarten measles–mumps–rubella (MMR) vaccine coverage in California between 2010 and 2014 (3) (Fig. 1A). Vaccine compliance at school entry fell to 70–90% in many cases and sometimes even lower in some Los Angeles schools (3). Inadequate vaccine compliance appears to have played a role in the outbreak (4), contributing to a significant peak in California measles case notifications in late 2014 and early 2015 (5) (Fig. 1A). The outbreak garnered significant public interest, causing a large spike in both US-geocoded tweets regarding measles (Fig. 1B) and Google Internet searches in California for “MMR” and “measles” (Fig. 1C) as reports of cases began to flow in. Amid the resulting public outcry, the California legislature began taking steps to disallow nonmedical exemptions (6–8), although statewide MMR vaccine uptake began to recover before these policy changes went into effect (3) (Fig. 1A).

Fig. 1. Interactions between disease spread, vaccine uptake, and online activity before, during, and after the 2014–2015 Disneyland, California measles outbreak. (A) Kindergarten MMR vaccine uptake (black; note vertical scale) and measles case notifications in California (red): year in horizontal axis for vaccine uptake corresponds to the ending calendar year of the corresponding academic year (e.g., 2016 means 2015–2016 academic year). Case notifications in 2016 go only to November 18. Most 2014 cases occurred at the end of the year. (B) Number of US geocoded tweets for measles-relevant search terms, 2011–2016, with a sharp spike in early 2015 corresponding to the Disneyland measles outbreak. (C) Google Trends (GT) Internet search index for MMR (blue) or measles (orange) in California, 2011–2016, with a sharp spike in early 2015 corresponding to the Disneyland measles outbreak. Shaded region in B and C indicates outbreak time period. See SI Appendix, sections S1 and S2 for details on search terms, data sources, and data extraction.

The changes in vaccinating behavior before and after the Disneyland measles outbreak are consistent with a coupled behavior–disease dynamic in which vaccinating decisions and disease dynamics influence one another in a nonlinear feedback loop. The mathematical modeling of coupled behavior–disease dynamics is growing rapidly (9–12), although relatively little attention has been devoted to critical phenomena in such systems. The theory of critical transitions (tipping points) and their early warning signals may help public health officials anticipate when and where resistance to vaccination might develop and intensify. A critical transition occurs when a complex system shifts abruptly to a strongly contrasting state as an external driver moves the system past a bifurcation point (13, 14). These shifts may exhibit characteristic early warning signals as a consequence of critical slowing down (CSD), in which a declining rate of recovery from small perturbations causes dynamics to become more variable. CSD can be detected by changes in indicators such as the variance, lag-1 autocorrelation (AC), and coefficient of variation in high-resolution time series of state variables (13, 14).
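A standard textbook illustration may help show why a declining recovery rate raises both variance and lag-1 AC. It is not the model used in this article. Assume a system linearized around its equilibrium that recovers from perturbations at rate λ, is sampled at interval Δt, and is driven by independent noise of variance σ²; the symbols λ, Δt, σ, and φ are introduced here only for illustration. The sampled deviations then follow a first-order autoregressive process:

```latex
x_{t+1} = \phi\, x_t + \varepsilon_t,
\qquad \phi = e^{-\lambda \Delta t},
\qquad \varepsilon_t \sim \mathcal{N}(0, \sigma^2)

\operatorname{Var}(x_t) = \frac{\sigma^2}{1 - \phi^2},
\qquad
\rho_1 = \operatorname{Corr}(x_t, x_{t+1}) = \phi
```

As the recovery rate λ falls toward zero, φ approaches 1, so the stationary variance grows without bound and the lag-1 AC approaches 1. Under these assumptions, that is the signature that the indicators are meant to detect.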

Social norms tend to reinforce currently accepted behavior and thus promote status quo practices in populations (15–17). However, individuals also make vaccinating decisions based on the perceived risks of the vaccine and the diseases they prevent (15). Here, we hypothesize that coupled behavior–disease systems exhibit a tipping point arising from interactions between social norms, perceived vaccine risk, and perceived disease risks. Specifically, we investigate the effects of risk perception in terms of the ratio of the magnitude of perceived vaccine risk to the magnitude of perceived risk of disease complications (we will call this “relative vaccine risk” for short). Rising public concern about potential vaccine complications can cause the relative vaccine risk to grow to a tipping point where social norms in support of a status quo of high vaccine acceptance can no longer prevent a drop in provaccine sentiment. If the population moves beyond this tipping point, a decline in provaccine sentiment causes fewer people to seek vaccination and herd immunity breaks down, enabling outbreaks of various sizes. However, before the tipping point is reached, CSD causes the variance, lag-1 AC, and coefficient of variation of time series of population sentiment toward the vaccine to increase. Importantly, the increase in these three indicators should be noticeable long before any significant change is obvious in the raw time series of population sentiment toward the vaccine. In other words, they provide an early warning signal of a potential tipping point.

However, coupled behavior–disease systems are complex adaptive systems, which introduces an important twist to our hypothesis. The relative vaccine risk is not simply an external driver pushing the system past a tipping point. It also responds to changes in infection prevalence. When an outbreak occurs, the relative vaccine risk drops. Hence, a critical transition can be avoided if the population responds to the small outbreaks that begin to occur near a tipping point (18). We hypothesize that these dynamics could lead to CSD before the outbreak followed by “critical speeding up” (improving resilience to perturbations) after the outbreak as the population recedes from the tipping point. Although CSD in a time series of population vaccine sentiment will not necessarily predict whether the population will pull back from the critical transition or go through the transition, it can at least tell us that the population is getting dangerously close to a tipping point.
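The approach-then-recede pattern hypothesized above can be illustrated with a toy simulation. The sketch below is not the transmission and sentiment model analyzed in this article. It assumes a first-order autoregressive process whose memory parameter `phi` rises toward 1 (resilience declining) and then falls back (resilience recovering). The function names `tent` and `simulate`, the parameter values, and the choice of thirds for comparison are all hypothetical.

```python
import random
from statistics import pvariance


def tent(n, low, high):
    """Memory parameter path: rises linearly from low to high, then falls back."""
    half = n // 2
    up = [low + (high - low) * i / half for i in range(half)]
    down = [high - (high - low) * i / (n - half) for i in range(n - half)]
    return up + down


def simulate(phi_path, noise_sd=1.0, seed=0):
    """Toy AR(1) process x[t+1] = phi[t] * x[t] + noise (an assumed stand-in model)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    x, series = 0.0, []
    for phi in phi_path:
        x = phi * x + rng.gauss(0.0, noise_sd)
        series.append(x)
    return series


n = 6000
series = simulate(tent(n, low=0.2, high=0.9))

# Compare variability far from the assumed tipping point (first and last
# thirds) with variability close to it (middle third).
third = n // 3
var_before = pvariance(series[:third])
var_near = pvariance(series[third:2 * third])
var_after = pvariance(series[2 * third:])
```

In this toy setting the middle third, where `phi` is closest to 1, should be the most variable, with variability falling again as `phi` declines. This mirrors CSD followed by critical speeding up, but it says nothing about which direction a real population will take.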

In this article, we report evidence for CSD in sentiment-classified tweets and in Google searches about measles before the Disneyland measles outbreak, followed by critical speeding up afterward. These empirical digital signals show patterns that match those exhibited by a mathematical model of the coupled dynamics of measles transmission and vaccine sentiment that has been previously tested against case notification and vaccine uptake data for measles and pertussis (19–21). Hence, these digital signals could be used as an early warning signal of tipping points in coupled behavior–disease systems.

Discussion This article presents evidence that coupled behavior–disease dynamics near the disease elimination threshold is a critical phenomenon. We analyzed tweets and Google searches and showed how the patterns in the empirical data matched those exhibited by a mathematical model of coupled dynamics of measles transmission and vaccine sentiment and uptake. The three indicators (variance, lag-1 AC, and coefficient of variation) tended to increase before the Disneyland outbreak due to CSD, and then decrease after the outbreak due to critical speeding up (with the unexpected exception of the coefficient of variation in antivaccinators, where the trend was inverted). Our model predicts the same trends in a population that approaches but then recedes from a tipping point. The variance indicator showed the most robust trends. However, the coefficient of variation has the advantage that it inherently adjusts for changes in the mean number of tweets, and therefore does not require further processing of the data through computing a residual time series, as required for variance and lag-1 AC. The lag-1 AC tests for changes in system memory (13). This indicator often, but not always, showed the expected trends in our data, and trends were not as strong under weekly binning. We speculate this is either because memory is too short-lived in online social media for changes to be detected in data with daily or weekly granularity, or due to the presence of higher-order autoregressive processes that cannot be detected by lag-1 AC (33, 34). The Disneyland outbreak was small and the response in population vaccine uptake rapid compared with other episodes of vaccine refusal where populations appear to have crossed a threshold into a regime of endemic infection and significantly reduced population-wide vaccine coverage.
This latter scenario occurred for MMR vaccine in England and Wales in the 1990s and 2000s (80% minimum coverage) (21); whole-cell pertussis vaccine in England and Wales in the 1970s (30% minimum coverage) (21); and oral polio vaccine in northern Nigeria in 2003–2004 (1). In recent years, measles outbreaks larger than the Disneyland outbreak have occurred in many undervaccinated European populations (35). The social media response to the Disneyland outbreak was enormous considering the relatively small size of the outbreak. We speculate this was because the outbreak was the largest in California in many years and it started in a major tourist destination. A limitation of our model is that it does not account for spatial clustering. This is a key aspect given the presence of clusters of nonvaccinators during the Disneyland measles outbreak (3), and it presents an opportunity for further research given the importance of networks in both infection transmission and strategic interactions (36, 37). The growth of clusters of nonvaccinators is not necessarily a competing hypothesis but rather could represent the spatial manifestation of critical dynamics. Spatially explicit models of behavioral dynamics in related systems develop clusters of individuals with homogeneous opinions as the population starts to “bubble” near a critical phase transition (38). CSD near a phase transition can manifest in similar ways in both spatial and temporal indicators because the underlying process is similar. Hence, the growing clusters of unvaccinated individuals observed before the Disneyland measles outbreak may signify bubbling near a critical phase transition. This hypothesis could be tested through further research on critical transitions in social networks of Twitter users. We also note that spatiotemporal analysis may take advantage of different and potentially better indicators than purely temporal analysis (39). 
More research is needed to better understand the informational content of the indicators in spatially structured populations and thereby distinguish qualitatively different outcomes, such as a quick and effective population response versus a protracted period of reduced vaccine coverage and endemic infection. Such analysis could incorporate vaccine uptake data if it has good spatial and temporal resolution (3). A second limitation is our use of CSD in the number of sentiment-classified tweets as a proxy for CSD in vaccine sentiment and uptake in the general population. This assumption could be relaxed by using more detailed models that include a submodel for online social media activity that accounts for how different users generate differing numbers of tweets and how online social media activity interacts with social processes in the general population. Our empirical results are largely consistent with our model predictions but cannot definitively establish causality. Future research could evaluate out-of-sample model predictions and consider the relationship between contemporaneous indicators of vaccine sentiment, such as tweets and search data, and observed vaccine uptake. It would also be valuable to consider other events that might affect sentiment dynamics near tipping points and to evaluate whether the significant population response to the Disneyland outbreak depended on its extensive media coverage. Still, these results suggest that population vaccinating behavior near the elimination threshold can be characterized as a critical phenomenon near a tipping point in a coupled behavior–disease system. Our findings highlight the value of using digital social data to identify early warning signals of critical dynamics in adaptive behavior–disease systems and socioecological systems more generally (18). They also demonstrate the value of using dynamical systems theory in data science. 
The theory of critical phenomena in complex systems may shed light on other study systems represented in very large social media datasets.

Methods Twitter Data. For the US GPS dataset, we obtained 27,906 measles-related tweets from March 2, 2011, to October 9, 2016, with GPS coordinates in the United States. We used Amazon Mechanical Turk to classify these tweets by sentiment, yielding 10,926 “provaccine,” 2,136 “antivaccine,” and 14,844 “other” tweets. A tweet was defined as provaccine (respectively, antivaccine) if the tweet content suggested the tweeter had a positive (respectively, negative) sentiment toward vaccines. This included any information about their feelings or opinions toward vaccines or the diseases they prevent. A tweet was placed in other if it was neither provaccine nor antivaccine, for instance, because it was irrelevant or ambiguous, or because the sentiment of the tweeter could not be clearly ascertained. Baseline analysis used daily bins. Additional details appear in SI Appendix, sections S2 and S5. Over the same time period, 11,685,264 tweets had information in the user location field. To generate the Location Field datasets, these tweets were geotagged using a modified version of the Geodict library and classified into provaccine, antivaccine, and other using a linear support vector machine. The classifier obtained precision scores of 80%, 90%, and 79%, and recall scores of 83%, 82%, and 82% for antivaccine, other, and provaccine tweets, respectively (F1 scores: 81%, 86%, and 80%). The process identified 660,477 antivaccine, 883,570 provaccine, and 483,636 other tweets in the US dataset, and 101,683 antivaccine, 112,741 provaccine, and 59,030 other tweets in the California dataset. Baseline analysis used daily bins. Additional details including references appear in SI Appendix, sections S2 and S5. Data are available in Datasets S1–S3.

GT Data Extraction. We analyzed Google Trends (GT) search data for January 2011 to December 2015 using the gtrendsR (40) package.
Unfortunately, the longest range of day-level query data Google provides is 3 months, which generates results in the arbitrary units of GT data that are not comparable between searches. (GT returns an estimate of the relative prevalence of searches matching the query for the time period and geography in question when the prevalence of the search term or terms exceeds some unspecified threshold.) As a result, we ran multiple day-level queries for each search (e.g., US measles, US MMR, California measles, California MMR) to cover the entire time period and then stacked the resulting data. We then ran a single corresponding week-level query for each search and used this to calculate an adjustment factor (specifically, we multiplied each day-level value by the week-level query result divided by the week-level average from the daily data). This adjustment accounts for differences in the relative prevalence of searches over time in the stacked day-level data (41, 42).

CSD Indicators. To adjust for long-term changes in the mean number of tweets, we used the residual time series of sentiment-classified tweets for lag-1 AC and variance, generated by subtracting the raw time series from a detrended time series. This is not necessary for the coefficient of variation since it already adjusts for long-term changes in number of tweets. We also removed the Disneyland social media peak (taken as running from January 22 to February 14, 2015, based on the US GPS dataset) to avoid issues with nonstationarity caused by the Disneyland outlier, and also because our focus is on CSD in the time before and after the outbreak. The methodology of computing indicators for the model was otherwise identical to that for the tweets and GT data. We used the Kendall tau rank correlation to quantify indicator trends (13), although we note that this statistic does not account for the size of increases or decreases over previous time points. Additional details appear in SI Appendix, section S4.
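The indicator calculation described under CSD Indicators can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure documented in SI Appendix, section S4. The centered moving-average detrending, the window sizes, and the function names `moving_average`, `lag1_autocorrelation`, `kendall_tau`, and `csd_indicators` are all hypothetical choices. The Kendall statistic is the simple tau-a form computed against the time index, and the sketch uses only the Python standard library.

```python
from statistics import mean, pstdev, pvariance


def moving_average(series, half_width):
    """Centered moving average, truncated at the ends (assumed detrending method)."""
    n = len(series)
    return [mean(series[max(0, i - half_width):min(n, i + half_width + 1)])
            for i in range(n)]


def lag1_autocorrelation(window):
    """Sample lag-1 autocorrelation of one window of values."""
    m = mean(window)
    num = sum((a - m) * (b - m) for a, b in zip(window[:-1], window[1:]))
    den = sum((a - m) ** 2 for a in window)
    return num / den if den > 0 else 0.0


def kendall_tau(values):
    """Kendall tau-a of values against time: (concordant - discordant) / pairs."""
    n = len(values)
    score = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            score += (diff > 0) - (diff < 0)  # +1 concordant, -1 discordant, 0 tie
    return score / (n * (n - 1) / 2)


def csd_indicators(counts, window, half_width):
    """Rolling variance and lag-1 AC on residuals, and rolling CV on raw counts."""
    trend = moving_average(counts, half_width)
    # Residual taken as raw minus trend; the sign convention does not affect
    # the variance or the lag-1 AC.
    residuals = [c - t for c, t in zip(counts, trend)]
    variance, lag1_ac, cv = [], [], []
    for end in range(window, len(counts) + 1):
        res = residuals[end - window:end]
        raw = counts[end - window:end]
        variance.append(pvariance(res))
        lag1_ac.append(lag1_autocorrelation(res))
        m = mean(raw)
        cv.append(pstdev(raw) / m if m > 0 else 0.0)
    return {"variance": variance, "lag1_ac": lag1_ac, "cv": cv}
```

Given a list of daily tweet counts `counts`, a call such as `kendall_tau(csd_indicators(counts, window=30, half_width=3)["variance"])` returns a value between -1 and 1. Positive values indicate that the rolling variance tends to rise over time. As noted above, the statistic reflects only the ordering of the indicator values, not the size of the changes.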

Acknowledgments We thank Madhur Anand, Feng Fu, and two anonymous reviewers for helpful comments on the manuscript. This research was funded by a Natural Sciences and Engineering Research Council of Canada Discovery Grant and a Canadian Foundation for Innovation Grant (to C.T.B.).

Footnotes Author contributions: B.N., M.S., and C.T.B. designed research; A.D.P., T.M.B., C.W., and C.T.B. performed research; B.N., M.S., and C.T.B. contributed new reagents/analytic tools; A.D.P., T.M.B., C.W., J.S., S.P.M., B.N., M.S., and C.T.B. analyzed data; and A.D.P., T.M.B., C.W., J.S., S.P.M., B.N., M.S., and C.T.B. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1704093114/-/DCSupplemental.