Mathematical and computational modeling approaches can be essential in providing quantitative scenarios of disease spreading, as well as projecting the impact in the population. Here we analyze the spatial and temporal dynamics of the Zika virus epidemic in the Americas with a microsimulation approach informed by high-definition demographic, mobility, and epidemic data. The model provides probability distributions for the time and place of introduction of Zika in Brazil, the estimate of the attack rate, timing of the epidemic in the affected countries, and the projected number of newborns from women infected by Zika. These results are potentially relevant in the preparation and analysis of contingency plans aimed at Zika virus control.

We use a data-driven global stochastic epidemic model to analyze the spread of the Zika virus (ZIKV) in the Americas. The model has high spatial and temporal resolution and integrates real-world demographic, human mobility, socioeconomic, temperature, and vector density data. We estimate that the first introduction of ZIKV to Brazil likely occurred between August 2013 and April 2014 (90% credible interval). We provide simulated epidemic profiles of incident ZIKV infections for several countries in the Americas through February 2017. The ZIKV epidemic is characterized by slow growth and high spatial and seasonal heterogeneity, attributable to the dynamics of the mosquito vector and to the characteristics and mobility of the human populations. We project the expected timing and number of pregnancies infected with ZIKV during the first trimester and provide estimates of microcephaly cases assuming different levels of risk as reported in empirical retrospective studies. Our approach represents a modeling effort aimed at understanding the potential magnitude and timing of the ZIKV epidemic and it can be potentially used as a template for the analysis of future mosquito-borne epidemics.

The Zika virus (ZIKV) is an RNA virus from the Flaviviridae family, genus Flavivirus (1, 2), first isolated in the Zika Forest of Uganda in 1947 (3). It generally results in a mild disease characterized by low-grade fever, rash, and/or conjunctivitis, although only ∼20% of those infected are symptomatic (4). Although there have been instances of sexual and perinatal/vertical transmission (5⇓⇓⇓⇓⇓⇓–12) and the potential for transmission by transfusion is present (13), ZIKV spreads primarily through infected Aedes mosquitoes (14, 15).

Until recently, ZIKV was considered a neglected tropical disease with only local outbreaks (4, 16⇓–18). The association of ZIKV with the reported microcephaly case clusters in Brazil during 2015 (19) led the director-general of the WHO to declare on February 1, 2016, a Public Health Emergency of International Concern (PHEIC) (20) that lasted for nearly 10 mo. During this period, ZIKV spread throughout the Americas, with 47 countries and territories in the region reporting autochthonous transmission (21, 22). Many other countries with ZIKV outbreaks besides Brazil have reported cases of microcephaly and other birth defects associated with ZIKV infection during pregnancy (Zika congenital syndrome) (23), and the epidemic has been under close scrutiny by all of the major public health agencies around the world.

Although enhanced surveillance and new data have improved our understanding of ZIKV (24⇓⇓⇓⇓–29), many unknowns persist. There is uncertainty surrounding the time of introduction of the virus to the region, although epidemiological and genetic findings estimate that ZIKV arrived in Brazil between May and December 2013 (nextstrain.org/zika; ref. 30). Furthermore, although mathematical and computational models have tackled the characterization of the transmissibility and potential burden of ZIKV (31⇓⇓⇓–35), little is known about the global spread of the virus in 2014 and 2015, before the WHO’s alert in early 2016. Using a data-driven stochastic and spatial epidemic model, we present numerical results providing insight into the first introduction in the region and the epidemic dynamics across the Americas. We use the model to analyze the spatiotemporal spread and magnitude of the epidemic in the Americas through to February 2017, accounting for seasonal environmental factors and detailed population data. We also provide projections of the number of pregnancies infected with ZIKV during the first trimester, along with estimates for the number of microcephaly cases per country using three different levels of risk based on empirical retrospective studies (36, 37).

Because the computational approach explicitly simulates the number of daily airline passengers traveling globally, the microsimulations allow us to track ZIKV infections imported into countries with no autochthonous transmission. In Fig. 5C we plot the number of importations into states in the USA through October 5, 2016, as reported by the CDC ( 41 ) and compare these results with model projections. Because the detection rate of ZIKV infections is very low, there are significantly fewer reported cases than projected; we estimate through a linear regression fit that 5.74 % ± 1.46 % of both symptomatic and asymptomatic imported infections are detected. Nonetheless, model projections are highly correlated with the observed data (Pearson’s r = 0.93 , P < 0.0001 ), as shown in Fig. 5C , Inset. A further validation of the model is provided by the reported number of ZIKV cases of pregnant women in the USA. A high detection rate is expected in this closely monitored population. As of September 29, 2016, 837 pregnant women in the USA were laboratory-confirmed for ZIKV, all of whom were imported cases. Because pregnant women comprise ∼ 1 % of incoming airline traffic flow from the rest of the Americas ( 42 ), one can roughly estimate 83,700 infections. Although this is a rough estimate, because of fluctuations in pregnant women traffic flow and testing rates, it is in the ballpark of our modeling results projecting 57,910 (95% CI [50, 138 to 66, 608]) infections imported into the USA by early October 2016. These results are relevant for ZIKV risk assessment in the USA ( 43 , 44 ). In SI Appendix we provide additional validation tests by looking at case reporting in Brazil at the state level, and the detection of travel related cases in European countries.

In Fig. 5B we compare observed data on weekly counts of microcephaly cases reported in Brazil through April 30, 2016 ( 40 ) with estimates from the model for each projected level of microcephaly risk given first-trimester ZIKV infection. The three model projection curves vary in magnitude but replicate peaks consistent with the observed data. Because the fraction of cases confirmed in Brazil is relatively low, it is not possible to identify the most likely level of risk, although the figure suggests that the risk might exceed the lowest estimate of 0.95% ( 36 ).

(A) Correlation between the number of ZIKV cases by state in Colombia as reported by surveillance data through October 1, 2016 ( 38 ), compared with state-level model projections of infections (median with 95% CI). Pearson’s r correlation coefficient is reported for the linear association on the log scale. The outlier (in dark green) excluded from the statistical analysis corresponds to the Arauca region. (B) Timeline of microcephaly cases in Brazil through April 30, 2016. Bar plots show weekly definite (or highly probable cases) and moderately (or somewhat probable cases) from surveillance data ( 40 ). Line plots indicate estimated weekly new microcephaly cases given three levels of first trimester risk: 4.52% (circles) ( 37 ), 2.19% (squares) ( 37 ), and 0.95% (diamonds) ( 36 ). (C) Bar plot of ZIKV infections imported into the USA by state(s) as reported by CDC surveillance through October 5, 2016 ( 41 ), and compared to model projections (median with 95% CI) for the same period assuming 5.74% reporting/detection. (Inset) The correlation between CDC surveillance data and model projections (median with 95% CI).

Our results have been validated comparing our projections with surveillance data that were not directly used to calibrate the model: the number of ZIKV infections by states in Colombia, the weekly counts of microcephaly cases reported in Brazil, and the number of importations of ZIKV infections in the continental United States (USA), as shown in Fig. 5 . In Fig. 5A we compare model-based projections of the number of ZIKV infections for states in Colombia with observed surveillance data through October 1, 2016 ( 38 ). As expected for a typically asymptomatic or mild disease, the model projects a much larger number of infections than that captured by surveillance, suggesting a reporting and detection rate of 1.02 % ± 0.93 % (from linear regression analysis). However, the observed data and model estimates are well-correlated (Pearson’s r = 0.68 , P < 0.0001 ), replicating the often several-orders-of-magnitude difference in infection burden across states within the same country.

Simulations reported here consider both Aedes aegypti and Aedes albopictus as competent ZIKV vectors, although less is known about the vectorial capacity of A. albopictus. A sensitivity analysis considering A. aegypti as the only competent vector is provided in SI Appendix with all figures and tables replicated for this scenario. Overall, results are similar because transmission due to A. aegypti increases to compensate for the absence of the other vector. Differences in the infection ARs, however, are observed in areas where A. albopictus is the most common or the only vector. For example, the infection AR in Brazil up to February 28, 2017, decreases from 18 % (95% CI [16 to 19%]) to 16 % (95% CI [14 to 17%]) if we consider only A. aegypti. During the same time period, the infection AR in Mexico decreases from 5 % (95% CI [4 to 6%]) to 4 % (95% CI [2 to 5%]). A more thorough analysis of the differences between the two scenarios is reported in SI Appendix . At the country scale, in Brazil and other key countries those differences seem small because A. aegypti and A. albopictus have very similar presence. However, we see noticeable differences in the infection AR as soon as the country extends to north and south of the equator and we look at specific geographical areas where only A. albopictus are present.

In Table 1 we report the projected cumulative number of microcephaly cases up to February 1, 2016, and December 10, 2017. By the time the WHO declared a PHEIC, Brazil was the only country with a substantial (>100) number of ZIKV-attributable microcephaly cases, with cases expected to appear through July 2017. For Colombia, the model projects a considerable number of new microcephaly cases until March–April 2017. In Venezuela, the peak in microcephaly cases was projected to start in September/October 2016, continuing through February 2017. In Puerto Rico, microcephaly cases were expected to occur mostly from December 2016 to April 2017. It is important to remark, however, that the microcephaly incidence tail extends for most of the countries up to July/August 2017. Along with the microcephaly risk, other birth defects and pregnancy complications are associated with ZIKV infection during pregnancy ( 36 , 37 , 39 ). Although we do not explicitly tabulate here specific projections, they can be calculated from our model by applying the estimated risk for any other complication to our daily number of births from women infected with ZIKV.

To estimate the number of microcephaly cases we adopt three different probabilities, as reported in two empirical retrospectives studies ( 36 , 37 ). The first estimate of microcephaly risk for ZIKV infected pregnancies is 0.95 % (95% confidence interval (CI) [0.34 to 1.91%]), from a study in French Polynesia ( 36 ). The remaining two estimates come from a study performed in Bahia, Brazil ( 37 ). Given an overall ZIKV infection AR of 31 % (95% CI [26 to 36%]) in Bahia through February 2016, as determined by our model, the estimated first trimester microcephaly risks are 2.19 % (95% CI [1.98 to 2.41%]), assuming 100% overreporting of microcephaly cases, and 4.52 % (95% CI [4.10 to 4.96%]), assuming no overreporting. These estimates do not account for miscarriages or other complications that may occur during pregnancy.

Estimated daily number of births between October 2014 and December 2017 from women infected with ZIKV during the first trimester of pregnancy in eight affected countries in the Americas. The bold line and shaded area refer to the estimated median number of births and 95% CI of the model projections, respectively. Note that Brazil is plotted on a different scale. The median curve is calculated each week from the stochastic ensemble output of the model and may not be representative of specific epidemic realizations. Thin lines represent a sample of specific realizations.

Using the epidemic profiles generated by the model we project the number of ZIKV infections in childbearing women following the model proposed in the study of ZIKV–microcephaly association for the 2013–2014 French Polynesia outbreak ( 36 ). In Fig. 4 we plot the daily number of births through December 2017 from women infected with ZIKV during their first trimester of pregnancy in several countries. Indeed, the first trimester of pregnancy is when the risk of microcephaly is the highest ( 36 , 37 , 39 ). The curves closely resemble the epidemic profiles in Fig. 2 but shifted forward in time by about 40 wk. We construct our estimates using country-specific birth rates, as detailed in SI Appendix, section 4 .

Monthly seasonality for the time- and location-dependent basic reproductive number, R 0 . The equatorial regions display less seasonality than the nonequatorial regions, where the changes of the season have a strong impact on the temperature and consequently on the basic reproductive number, R 0 .

National infection ARs are projected to be high in Haiti, Honduras, and Puerto Rico. Countries with larger populations and more heterogeneity in mosquito density and vector-borne disease transmission, such as Mexico and Colombia, experience much lower infection ARs. For example, nearly half of Colombia’s population resides in areas of high altitude where sustained vector-borne ZIKV transmission is not possible. Due to the model’s fine spatial and temporal resolution, we are able to observe significant variability in the ZIKV basic reproductive number R 0 across locations, and even within the same location at different times. These differences are driven by temperature, the vector distribution, and socioeconomic factors, among other variables (additional details are provided in Materials and Methods ). In Fig. 3 we plot R 0 in a number of areas at different times throughout the year. Equatorial regions experience less seasonality than nonequatorial regions, where changes in temperature have a strong impact on the mosquito population, and thus R 0 . Large areas with unexposed populations are visible, such as in high-altitude regions of Colombia. It is also worth remarking that maximum R 0 is not the sole determinant of the epidemic magnitude, because seasonality patterns and a small fraction of exposed individuals may not allow large outbreaks to occur.

Estimated daily number of new ZIKV infections (per 1,000 people) in eight affected countries in the Americas between January 2014 and February 2017. The bold line and shaded area refer to the estimated median number of infections and 95% CI of the model projections. Rates include asymptomatic infections. The median incidence is calculated each week from the stochastic ensemble output of the model and may not be representative of specific epidemic realizations. Thin lines represent a sample of specific realizations. Note that the scales on the y axes of the subplots vary. *Puerto Rico curves are constrained under the condition that the peak of incidence curve is after March 1, 2016, based on the surveillance reports ( 72 ).

Stochastic realizations reproducing the observed peak in Colombia define the model output used to provide the spatiotemporal pattern of ZIKV spread in the Americas through to February 2017. In Fig. 2 we plot the simulated epidemic profiles of incident ZIKV infections for several countries in the region, and in Table 1 we report the associated infection attack rates (ARs) through to February 1, 2016, when the WHO declared a PHEIC, and through to February 28, 2017. In SI Appendix we report maps with the cumulative number of cases at the scale of 1 km × 1 km. The infection AR is defined as the ratio between the cumulative number of new infections (both symptomatic and asymptomatic) during the period of consideration and the total population of a given region. Estimates for additional countries in the Americas are provided in a publicly available database ( www.zika-model.org ). The earliest epidemic is observed in Brazil, followed by Haiti, Honduras, Venezuela, and Colombia. The model indicates that the epidemics in most countries decline after July 2016, a finding supported by epidemiological surveillance in the region. The decline of the epidemic is mostly due to the fact that large outbreaks greatly deplete the pool of susceptible individuals who can be exposed to the disease. In some countries (for instance, Puerto Rico) the seasonal variation plays a role in the quick decline of the epidemic; however, the first wave is generally the most important in terms of magnitude. Although the model projects activity in many places throughout the Americas in 2017, the incidence is extremely small compared with the cumulative incidence of 2015/2016.

In Fig. 1A we plot the posterior distribution as a function of introduction date and location, and in Fig. 1 B and C we plot the marginal posterior distributions of introduction date and location separately. The largest posterior density is associated with an introduction in Rio de Janeiro in December 2013. The 90% credible interval for the most likely date extends from August 2013 to April 2014, with the mode in December 2013, in agreement with phylogenetic and molecular clock analyses ( nextstrain.org/zika ; ref. 30 ). The most likely locations of ZIKV introduction, in descending order, are Rio de Janeiro (southeast), Brasilia (central), Fortaleza (northeast), and Salvador (northeast). Although Rio de Janeiro experiences the greatest passenger flow, the city also experiences more seasonality in mosquito density, making its likelihood to seed an epidemic sensitive to introduction date. The cities located in the northeast of Brazil have lower passenger flow compared with Rio de Janeiro but have higher mosquito density and dengue virus (DENV) transmission all year round. Brasilia, in comparison, has little seasonality in terms of mosquito density and high traffic flow, although the area has low DENV transmission.

We identify 12 major transportation hubs in areas related to major events held in Brazil, such as the Soccer Confederations Cup in June 2013 and the Soccer World Cup in June 2014 and assumed a prior probability of introduction proportional to the daily passenger flow to each hub. We then consider introduction dates between April 2013 and June 2014, including the time frame suggested by phylogenetic and molecular clock analyses ( nextstrain.org/zika ; ref. 30 ) through to the 2014 Soccer World Cup. Using Latin square sampling over the two-dimensional space (date–location), we calculated the likelihood of replicating the observed epidemic peak in Colombia ( ± 1 wk), as reported by Colombia’s National Institute of Health ( 38 ), and the resulting posterior density of each location and date combination. The Colombian epidemic was used to calibrate this analysis because of the large number of cases observed and overall consistency in reporting.

Discussion

We use computational modeling to reconstruct the past and project the future spatiotemporal spread of ZIKV in the Americas. To identify the likely date and location of ZIKV’s first introduction to the Americas, posterior densities are estimated for 12 major travel hubs in Brazil over a range of dates. The marginal posterior distributions suggest an introduction between August 2013 and April 2014 in a number of potential locations, including Rio de Janeiro, Brasilia, Fortaleza, and Salvador. This date range overlaps with that suggested by a recent phylogenetic analysis (nextstrain.org/zika; ref. 30), although our estimate also includes later potential introductions. The model seems to rule out an introduction concurrent to the Soccer World Cup in June 2014.

The model is able to generate epidemic curves in time for incident ZIKV cases for about two dozen countries in the Americas. Although for the sake of space we report on only eight countries, the full database is publicly available (www.zika-model.org). The results obtained are in good agreement with model-based projections achieved with a different approach developed by Perkins et al. (32) using location-specific epidemic ARs on highly spatially resolved human demographic projections. Although the approach of Perkins et al. (32) does not provide information on the dynamic of the epidemic, it estimates ZIKV infections in the first-wave epidemic in the most-affected countries such as Brazil and Colombia, where the approach projects a median infection AR of 19 and 14%, respectively, which falls within the CI of the results provided here.

Although the initial introduction of ZIKV could date back to August 2013, most countries did not experience the first wave of the epidemic until the early months of 2016. Brazil is the only country that seems to have a well-defined first peak in March 2015, consistent with reports from the northeast region (45). The model suggests two epidemic waves in Brazil. The first wave, occurring between January and July 2015, corresponds to early outbreaks in the northeast region (Maranhao, Bahia, and Rio Grande do Norte) and later on in the rest of the country. This first wave was not recognized as ZIKV until early 2016. The second wave, between January and May 2016, affected mostly southern states in Brazil (46). This progression of the epidemic is in agreement with the reconstruction of the movement of ZIKV in Brazil using confirmed cases at the municipal level (33).

The virus also circulated early on in the Caribbean, with ZIKV samples isolated in Haiti at the end of 2014, and a possible first peak occurred in October 2015 (47). Colombia first isolated ZIKV in October 2015, at which time it spread rapidly from the Caribbean coast to cities infested with A. aegypti (48). The model suggests an introduction to Colombia as early as March–April 2015, potentially overlapping with the Easter holiday, which is a period of high mobility within and between countries in the region. ZIKV transmission in Venezuela follows a similar trajectory, first isolated in November 2015 and present in all states by July 2016 (49). Since March 2016, reported cases have declined in both countries, consistent with our model estimates.

Our model estimates ZIKV transmission in El Salvador and Honduras increasing around July 2015. ZIKV was first detected in El Salvador in November 2015 and in Honduras in December 2015 (50, 51). Although the first ZIKV infection was confirmed in Puerto Rico in the last week of December 2015 (52), the model estimates ZIKV transmission in Puerto Rico beginning around August 2015. In Mexico, the first infection was reported to the surveillance system at the end of November 2015 (53), although circulation may have begun in September 2015.

The epidemic has moved slowly and is mostly constrained by seasonality in ZIKV transmissibility. Seasonal drivers and time of introduction result in multiple waves (54) across several countries, as projected for Brazil, Honduras, and Mexico. To show the importance of the seasonal drivers in shaping the epidemic, we report in SI Appendix the analysis of two counterfactual scenarios in which we eliminate the differences in the seasonal drivers across the region. This analysis clearly shows that ignoring the spatial variation of seasonal drivers gives rise to unrealistic patterns incompatible with the observed data.

Another relevant result of the model is that incidence rates dramatically decrease in all considered countries by the end of 2016. The drop in incidence in the model is largely due to the epidemic’s depleting the susceptible pool. This implies that ZIKV epidemics could settle into the typical seasonal pattern of mosquito-borne diseases such as DENV. Transmission may be low for several years with a gradual buildup in susceptibility due to births (55). In the real world, however, other factors such as vector control and/or specific local weather conditions could have contributed to the drop of incidence along with herd immunity. Because these factors might change in the future, subsequent epidemic waves may occur. Precise projection of long-term ZIKV transmission is crucial to plan for future Zika control activities and for finding sites for phase-III Zika vaccine trials. This is a topic for future research.

Another prominent feature emerging from the numerical results is the extreme heterogeneity in the infection ARs across countries. We find more than a sevenfold difference between Honduras and Mexico, exhibiting infection ARs of 35 % (95% CI [30 to 39%]) and 5 % (95% CI [4 to 6%]), respectively. These large differences in infection ARs, which are also observable at finer geographical resolutions, stem from variation in climatic factors, mosquito densities, and socioeconomic factors.

We project the numbers of births from women who were infected with ZIKV during the first trimester of their pregnancy. There is a well-defined time lag between the epidemic curve and this birth curve. Brazil, which likely experienced its first ZIKV epidemic peak in March 2015, had a sharp rise in microcephaly cases in September 2015, consistent with what was observed in the field (40). In Colombia 132 confirmed cases of congenital Zika syndrome had been observed as of March 11, 2017 (56). However, at the same date, 538 additional cases are under study, thus not yet allowing a risk factor estimate from the model. Note that the projected number of microcephaly cases estimated by the model varies considerably depending on the assumed first-trimester risk, for which only retrospective estimates are available (36, 37). We also note that with as many as 80% of ZIKV infections being asymptomatic (4, 39), most of ZIKV-infected pregnant women giving birth may not have experienced symptoms during pregnancy. Thus, clinicians should be cautious before ruling out ZIKV as the cause of birth defects. The results presented here, however, could be used as a baseline to uncover possible disagreement with the observed data and highlight the need for additional key evidence to enhance our understanding of the link between ZIKV and birth defects (57).

Available data on the ZIKV epidemic suffer from several limitations. Although the disease has likely been spreading in the Americas since late 2013, infection detection and reporting began much later and likely increased after the WHO’s declaration of a PHEIC in February 2016. Case reporting is inconsistent across countries. Furthermore, comparatively few infections are laboratory-confirmed; this presents an additional challenge because symptomatic cases with other etiologies may be misdiagnosed, and asymptomatic infections are almost entirely missed. Once a reliable ZIKV antibody test is available, seroprevalence studies can help determine the full extent of these outbreaks. For external validation, we compare modeling results with data from Brazil, Colombia, and the USA that were not used to calibrate the model. We are able to replicate relative trends, although we estimate significantly higher absolute numbers, suggesting reporting and detection rates ranging from 1% to about 6% depending on the country.